ClickHouse/docs/en/sql-reference/aggregate-functions/reference/varpop.md
2024-03-14 15:04:19 +01:00

title slug sidebar_position
varPop /en/sql-reference/aggregate-functions/reference/varpop 32

This page covers the varPop and varPopStable functions available in ClickHouse.

varPop

Calculates the population variance of a data column. The population variance measures how far the values are spread out from their mean. Computes Σ((x - x̅)^2) / n, where n is the sample size and x̅ is the average value of x.
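As an illustration of the formula only (this is not ClickHouse code), here is a minimal Python sketch that evaluates Σ((x - x̅)^2) / n directly. The helper name population_variance is hypothetical, and the result is compared against the standard library's statistics.pvariance.

```python
import statistics


def population_variance(values):
    """Population variance: sum((x - mean)^2) / n (hypothetical helper)."""
    n = len(values)
    if n == 0:
        raise ValueError("population variance is undefined for empty input")
    mean = sum(values) / n
    # Average of the squared deviations from the mean.
    return sum((x - mean) ** 2 for x in values) / n


data = [1, 2, 3, 4, 5]
print(population_variance(data))  # 2.0
# Cross-check against the standard library implementation.
print(population_variance(data) == statistics.pvariance(data))  # True
```

Note the division by n rather than n - 1: the latter would give the sample variance instead.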

Syntax

varPop(x)

Parameters

  • x: The data column whose population variance is calculated. Numeric

Returned value

Returns the population variance of x. Type: Float64.

Implementation details

This function uses a numerically unstable algorithm. If you need numerical stability in calculations, use the slower but more stable varPopStable function.
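To see why numerical stability can matter, the Python sketch below compares a textbook one-pass formula, E[x²] - (E[x])², with a two-pass computation on values that share a large offset. It is an illustration under assumptions: it does not claim to reproduce ClickHouse's internal implementation, and both function names are hypothetical.

```python
def var_pop_naive(values):
    """One-pass textbook formula: (sum(x^2) - sum(x)^2 / n) / n."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    # Subtracting two huge, nearly equal numbers loses the significant digits.
    return (total_sq - total * total / n) / n


def var_pop_two_pass(values):
    """Two-pass reference: compute the mean first, then average squared deviations."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Small spread (true population variance 22.5) on top of a large offset.
data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]

print(var_pop_two_pass(data))  # 22.5
print(var_pop_naive(data))     # not 22.5: the squares (~1e18) exceed Float64 precision
```

The squares are around 1e18, where adjacent Float64 values are more than 100 apart, so the one-pass difference cannot represent the true value of 90 before the final division.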

Example

DROP TABLE IF EXISTS test_data;
CREATE TABLE test_data
(
    x Int32,
    y Int32
)
ENGINE = Memory;

INSERT INTO test_data VALUES (1, 2), (2, 3), (3, 5), (4, 6), (5, 8);

SELECT
    varPop(x) AS var_pop
FROM test_data;

Result:

2

varPopStable

Calculates the population variance of a data column using a numerically stable algorithm. It is slower than varPop but is designed to give reliable results even with large datasets or values that might cause numerical instability in other implementations.

Syntax

varPopStable(x)

Parameters

  • x: The data column whose population variance is calculated. Numeric

Returned value

Returns the population variance of x. Type: Float64.

Implementation details

Unlike varPop, this function uses a numerically stable algorithm to calculate the population variance, which avoids issues such as catastrophic cancellation and loss of precision. It is slower but has a lower computational error.
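One well-known numerically stable approach is Welford's online algorithm, sketched below in Python. Whether varPopStable uses exactly this method is not stated on this page, so treat the choice of algorithm and the name var_pop_welford as assumptions made for illustration.

```python
def var_pop_welford(values):
    """Population variance via Welford's one-pass update (illustrative sketch)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / count     # update the running mean
        m2 += delta * (x - mean)  # old deviation times new deviation
    if count == 0:
        raise ValueError("population variance is undefined for empty input")
    return m2 / count


# Values with a large common offset; the true population variance is 22.5.
data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]
print(var_pop_welford(data))  # 22.5
```

The update only ever works with deviations from the running mean, so it never subtracts two huge, nearly equal sums, which is where cancellation would occur.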

Example

Query:

DROP TABLE IF EXISTS test_data;
CREATE TABLE test_data
(
    x Int32,
    y Int32
)
ENGINE = Memory;

INSERT INTO test_data VALUES (1, 2), (2, 9), (9, 5), (4, 6), (5, 8);

SELECT
    varPopStable(x) AS var_pop_stable
FROM test_data;

Result (approximately; the last digits may differ because of floating-point rounding):

7.76