Next
Topic:OLAP Functions |

An

aggregate function(also known as asetorcolumnfunction) is an operation applied to the values contained in a particular column (or in a derived column created by an expression), producing a single value that summarizes the (derived) column. Thevalue expressionargument cannot itself contain an aggregate function. As well, aggregate functions cannot contain subqueries.SAND CDBMS supports the following aggregate functions:

`AVG (`

value-expression)

`AVG ( ALL`

value-expression)

Return the average of the non-null values in the specified or derived column (numeric data only).

`AVG ( DISTINCT`

value-expression)

Returns the average of the non-null values in the specified or derived column, when duplicates are omitted (numeric data only).

`CENTMOM2 (`

value-expression)

CENTMOM3 (value-expression)

CENTMOM4 (value-expression)

Return therth moment about the mean, or central moment. That is, CENTMOM2 returns the 2nd moment, CENTMOM3 returns the 3rd moment, and CENTMOM4 returns the 4th moment.The sample central moment is estimated by the following:

where:

nis the sample size

xis the current input value_{k}

mis the mean

ris the moment numberNote that CENTMOM3 results are accurate for input values with up to 17 digits to the left of the decimal place; and CENTMOM4 results are accurate for input values with up to 13 digits to the left of the decimal place.

`COUNT ( * )`

Returns the number of rows in the table or virtual table.

`COUNT (`

value-expression)

COUNT ( ALLvalue-expression)

Return the number of non-null values in the specified or derived column.

`COUNT ( DISTINCT`

column)

Returns the number of non-null values in the specified or derived column, when duplicates are omitted.

`KURTOSIS (`

value-expression)

Returns the degree of peakedness in the distribution given by a specified or derived column (numeric data only). Peakedness is defined here as a normalized form of the fourth central moment of the distribution. Kurtosis is estimated using the following equation:where the

ks arek-statistics.Note that KURTOSIS results are accurate for input values with up to 13 digits to the left of the decimal place.

`MAX (`

value-expression)

Returns the maximum value in the specified column or derived column.

MEANDEV (value-expression)

MEANDEV ( ALLvalue-expression)

Return the mean deviation, or the average of the absolute values of the differences between individual values in the specified or derived column and their mean (numeric data only).

`MEDIAN (`

value-expression)

`MEDIAN ( ALL`

value-expression)

Return the middle value when all the values in the specified or derived column are rank ordered from lowest to highest (numeric data only).

`MIN (`

value-expression)

Returns the minimum value in the specified or derived column.

`ONEOF (`

value-expression)

Returns one random value per group from the specified, non-grouped column.

`SKEWNESS (`

value-expression)

Returns the degree of asymmetry in the distribution given by a specified or derived column (numeric data only). Skewness is estimated using the following equation:where the

ks arek-statistics.Note that SKEWNESS results are accurate for input values with up to 17 digits to the left of the decimal place.

`STDDEV (`

value-expression)

`STDDEV ( ALL`

value-expression)

Return the standard deviation, or the square root of the sum of the squared deviations from their average divided by one less than the number of values in the specified or derived column (numeric data only).

`SUM (`

value-expression)

`SUM ( ALL`

value-expression)

Return the sum of the non-null values in the specified or derived column (numeric data only).

`SUM ( DISTINCT`

value-expression)

Returns the sum of the non-null values in the specified or derived column, when duplicates are omitted (numeric data only).

`SUMCUBE (`

value-expression)

SUMCUBE ( ALLvalue-expression)

Return the sum of the cubes ( Σn^{3}) of the values in a specified or derived column (numeric data only).

`SUMQUAD (`

value-expression)

SUMQUAD ( ALLvalue-expression)

Return the sum of the values to the fourth power ( Σn^{4}) in a specified or derived column (numeric data only).

`SUMSQ (`

value-expression)

`SUMSQ ( ALL`

value-expression)

Return the sum of the squares of the values in a specified or derived column (numeric data only).

`Return the sample variance, or the sum of the squared deviations from their average divided by one less than the number of values in the specified or derived column (numeric data only).`

VARIANCE (value-expression)

VARIANCE ( ALLvalue-expression)

Aggregate functions may only be used in expressions appearing in the projection list of a SELECT statement or in HAVING clause predicates. Aggregate functions cannot be used in WHERE clause expressions, in the value list clause of an INSERT statement, or in the SET clause of an UPDATE.Any direct column references (or expressions that contain column references) that appear in a query projection list along with aggregate functions must also be represented in a GROUP BY clause, either as a direct column reference, an alias, a position number from the projection list, or a full value expression. When a SELECT statement is structured in this way, the aggregate function is applied and produces a separate value for each distinct combination of values from the GROUP BY columns. If the query expression projection list includes one or more aggregate functions but no grouped columns, the whole table is treated as a group.

Consider the following

employeetable and a SELECT statement applied to it:

lnamefnamedeptpayrate Harris Joe 1050 36100.00 Jacobi Ron 1190 25000.00 Mann Susan 1050 39800.00 Prince Peter 1050 31045.00 Desai Anita 1050 46000.00 Gilbert Gilles 1190 21500.00

SELECT dept, COUNT(*), SUM(payrate), AVG(payrate), MAX(payrate), MIN(payrate)

FROM employee

GROUP BY dept;

This SELECT statement returns the distinct values in thedept(department) column in the employee table, along with the number of rows (employees) belonging to each department, and some aggregatepayrateinformation for each department (the sum total of pay rates, the average pay rate, the highest pay rate, and the lowest pay rate):

deptCOUNT(*)SUM(payrate)AVG(payrate)MAX(payrate)MIN(payrate) 1050 4 152945.00 38236.2500000 46000.00 31045.00 1190 2 46500.00 23250.0000000 25000.00 21500.00

Aggregate functions can be part of an arithmetic operation. For example:

MAX(population) - MIN(population)

The next example illustrates the use of the ONEOF() function, which returns one random value per group from a non-grouped column. Given this table:

c1c2 1 a 1 b 2 c 2 d 3 e 3 fThe following query

`SELECT ONEOF(c2) FROM t1 GROUP BY c1;`

will return three values, one from each group (1, 2, 3). From group 1, either "a" or "b" will be returned; from group 2, either "c" or "d" will be returned; and from group 3, either "e" or "f" will be returned. Since the value returned from each group is random, the possible result sets for the above query are:

`{ a, c, e }`

{ b, c, e }

{ a, d, e }

{ b, d, e }

{ a, c, f }

{ b, c, f }

{ a, d, f }

{ b, d, f }

Errors

The following examples demonstrate the usages of aggregate functions that return errors. The first fails because the aggregate function is used in a WHERE clause.

SELECT * FROM State WHERE SUM (Population) = 0;The following statement fails because nested aggregate functions are not permitted.

SELECT SUM (SUM(Population)) FROM State;The following statement fails because the column fname appears in the projection list with the COUNT aggregate function, yet is not included in a GROUP BY clause.

SELECT fname, COUNT(*) FROM white_house;

Next
Topic:OLAP Functions |