ColumnStore Distributed Aggregate Functions

MariaDB ColumnStore supports the following aggregate functions, these can be used in the SELECT, HAVING, and ORDER BY clauses of the SQL statement.

Function
Description

Function

Description

AVG([DISTINCT] column)

Average value of a numeric (INT variations, NUMERIC, DECIMAL) column

CORR(ColumnY, ColumnX)

The correlation coefficient for non-null pairs in a group.

COUNT (*, [DISTINCT] column)

The number of rows returned by a query or grouping. All datatypes are supported

COVAR_POP(ColumnY, ColumnX)

The population covariance for non-null pairs in a group.

COVAR_SAMP(ColumnY, ColumnX)

The sample covariance for non-null pairs in a group.

MAX ([DISTINCT] column)

The maximum value of a column. All datatypes are supported.

MIN ([DISTINCT] column)

The maximum value of a column. All datatypes are supported.

REGR_AVGX(ColumnY, ColumnX)

Average of the independent variable (sum(ColumnX)/N), where N is number of rows processed by the query

REGR_AVGY(ColumnY, ColumnX)

Average of the dependent variable (sum(ColumnY)/N), where N is number of rows processed by the query

REGR_COUNT(ColumnY, ColumnX)

The total number of input rows in which both column Y and column X are nonnull

REGR_INTERCEPT(ColumnY, ColumnX)

The y-intercept of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs

REGR_R2(ColumnY, ColumnX)

Square of the correlation coefficient. correlation coefficient is the regr_intercept(ColumnY, ColumnX) for linear model

REGR_SLOPE(ColumnY, ColumnX)

The slope of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs

REGR_SXX(ColumnY, ColumnX)

REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs.

REGR_SXY(ColumnY, ColumnX)

REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs.

REGR_SYY(ColumnY, ColumnX)

REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs.

STD(), STDDEV(), STDDEV_POP()

The population standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column

STDDEV_SAMP()

The sample standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column

SUM([DISTINCT] column)

The sum of a numeric (INT variations, NUMERIC, DECIMAL) column

VARIANCE(), VAR_POP()

The population standard variance of a numeric (INT variations, NUMERIC, DECIMAL) column

VAR_SAMP()

The population standard variance of a numeric (INT variations, NUMERIC, DECIMAL) column

Note

  • Regression functions (REGR_AVGX to REGR_YY), CORR, COVAR_POP and COVAR_SAMP are supported for version 1.2.0 and higher

Example

An example group by query using aggregate functions is:

select year(o_orderdate) order_year, 
avg(o_totalprice) avg_totalprice, 
max(o_totalprice) max_totalprice, 
count(*) order_count 
from orders 
group by order_year 
order by order_year;

This page is licensed: CC BY-SA / Gnu FDL

Last updated

Was this helpful?