OLAP window aggregate functions
Several functions that return aggregate results, such as sums and averages, from the results of a query can also be used as OLAP functions from the context of an OLAP window.
Syntax
Element | Description | Restrictions | Syntax |
---|---|---|---|
column_expr | Column expression argument to the aggregate function | See the headings for individual functions below | Identifier |
alias, synonym, table, view | Synonym, table, view, or alias that contains column | Synonym and the table or view to which it points must exist | Identifier |
Usage
These aggregate functions do not require an OLAP window to calculate aggregate values from a query result set. They behave like OLAP window aggregation functions, however, in calling contexts where the OVER clause in the function expression defines one or more window partitions, or includes the window ORDER clause and the window Frame clause.
The following aggregate functions can return information about the rows in OLAP window partitions.
AVG function
The AVG function returns the average of all values in the specified column or expression in a window partition of the query results, for the rows in each partition that is defined in the OVER clause. If the OVER clause includes the Window Frame clause, AVG returns a value for each set of rows in a window frame.
You can apply the AVG function only to columns of number data types. If you use the DISTINCT (or UNIQUE) keyword, the average (meaning the mean) is calculated from only the distinct values in the specified column or expression, and the OVER clause cannot include a Window ORDER or Window Frame clause.
NULL values are ignored unless every value in the column or expression is NULL. If every value is NULL, the AVG function returns a NULL for that column or expression.
You cannot use the AVG function with a nonnumeric column or expression.
COUNT function
The COUNT function returns the cardinality of non-NULL values in the specified column or expression in a window partition of the query results, for each partition that is defined in the OVER clause. If the OVER clause includes the Window Frame clause, COUNT returns a value for each set of rows in a window frame.
If you use the DISTINCT (or UNIQUE) keyword, the cardinality of rows in the partition is calculated from only the distinct values in the specified column or expression, and the OVER clause cannot include a Window ORDER or Window Frame clause.
NULL values are ignored unless every value in the column or expression is NULL. If every value is NULL, the COUNT function returns a NULL for that column or expression.
MAX function
The MAX function returns the largest value in the column or expression in a window partition of the query results, for the rows in each partition that is defined in the OLAP window OVER clause.
Specifying the DISTINCT or UNIQUE keyword has no effect on the results, but (as with the other built-in aggregate functions) no Window ORDER or Window Frame clause is allowed.
If the OVER clause includes the Window Frame clause, MAX returns a value for each set of rows in a window frame.
When a column expression is specified as argument to COUNT, NULL values are ignored unless every value in the specified column expression is NULL. If every value is NULL, MAX returns a NULL value for that column or expression. When COUNT(*) is specified, NULL values are counted the same as other values.
MIN function
The MIN function returns the lowest value in the column or expression in a window partition of the query results, for each partition that the OLAP window OVER clause defines. If the OVER clause includes the Window Frame clause, the MIN function returns a value for each set of rows.
Specifying the DISTINCT or UNIQUE keyword has no effect on the results, but (as with the other built-in aggregate functions) no Window ORDER or window Frame clause is allowed.
If the OVER clause includes the Window Frame clause, MIN returns a value for each set of rows in a window frame.
NULL values are ignored unless every value in the specified column expression is NULL. If every value is NULL, MIN returns a NULL value for that column expression.
RANGE function
The RANGE function returns the range of values in the column or expression in a window partition of the query results, for each partition that the OLAP window OVER clause defines. If the OVER clause includes the Window Frame clause, RANGE returns a value for each set of rows in a window frame.
range(expr) = max(expr) - min(expr)You can apply the RANGE function only to numeric column expressions. The following query finds the range of ages for a population:
SELECT RANGE(age) OVER () FROM u_pop;Because DATE values are stored internally as integers, you can use the RANGE function on DATE column. With a DATE column, the return value is the number of days between the earliest and latest dates in the column expression.
NULL values are ignored unless every value in the column expression is NULL. If every column expression value is NULL, the RANGE function returns a NULL for that column expression.
STDEV function
SQRT((SUM(Xi2) - (SUM(Xi)2)/N)/(N - 1))In this formula, Xi is each column expression value in the window partition or frame that the OVER clause specifies, and N is the total number of non-NULL values in the column expression.
If the OVER clause includes the Window Frame clause, the STDEV function returns a value for each set of rows in a window frame.
NULL values are ignored, unless every value in the specified column expression is NULL. If every column expression value is NULL, the STDEV function returns NULL for that column expression.
You can apply the STDEV function only to numeric column expressions. You cannot use this function on column expressions of type DATE.
SUM function
The SUM function calculates and returns the sum of all the values of a column expression in a window partition of the query results, for the rows in each partition that is defined in the OLAP window OVER clause.
If you use the DISTINCT (or UNIQUE) keyword as the first item in the argument list, the sum is for only distinct values in the column or expression, and no Window ORDER or Window Frame clause is allowed.
NULL values are ignored, unless every value is NULL. If every value is NULL, the SUM function returns the NULL value for that column or expression.
You cannot use the SUM function with a nonnumeric column or expression.
VARIANCE function
The VARIANCE function calculates and returns the mean squared error as an estimate of the population variance of the values in a specified numeric column or expression for the partitions of query results that are defined in the OVER clause that follows the OLAP window VARIANCE expression.
(SUM(Xi2) - (SUM(Xi)2)/N)/(N - 1)In this formula,
- Xi is each column value in the window partition or frame that the OVER clause specifies,
- and N is the total number of non-NULL values in the column (unless all values are NULL, in which case the variance is logically undefined, and the VARIANCE function returns NULL).
Example: AVG function with partitioning
ABC
and
on XYZ
values in the symbol column as the
partitioning keys.SELECT symbol, tradingdate, AVG(closeprice) OVER (PARTITION BY symbol ORDER BY tradingdate ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) FROM dailystockdata WHERE symbol IN ('ABC', 'XYZ') AND tradingdate BETWEEN '2012-01-01' AND '2012-12-31';The window ORDER clause specifies the tradingdate column value as the sorting key, and the window Frame clause defines a rolling window that is based on the average of 30 consecutive tradingdate values, ending with the tradingdate of the current row.
Example: AVG function without partitioning
The following query returns the stock price ordered by day and the average of the prices of the current day, the day before, and the day after. The result set is not partitioned because the query does not include a PARTITION BY clause.
SELECT price, AVG(price) OVER (ORDER BY tradingday ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) FROM stock_price WHERE tradingday BETWEEN '2012-11-01' AND '2012-11-07'; price (avg) 18.25 18.31 18.37 18.31 18.37 19.03 19.03 18.81 18.59 18.61 18.21 18.40
The first value in the avg column is the average of the first two values in the price column, because no preceding values exist for the first value of the price column.
The second value in the avg column is the average of the first two values in the price column, because the third row of the price column does not have a value.
The third value in the avg column is equal to the second value in the price column because the third and fourth rows in the price column do not have values.
Example: COUNT function
The following query returns the shipping date, the shipping charge, and the number of orders for each order by customer. The query results are partitioned by customer number and limited to customer numbers that are less than or equal to 110.
SELECT customer_num, ship_date, ship_charge, COUNT(*) OVER (PARTITION BY customer_num) FROM orders WHERE customer_num <= 110; customer_num ship_date ship_charge (count(*)) 101 05/26/2008 $15.30 1 104 05/23/2008 $10.80 4 104 07/03/2008 $5.00 4 104 06/01/2008 $10.00 4 104 07/10/2008 $12.20 4 106 05/30/2008 $19.20 2 106 07/03/2008 $12.30 2 110 07/06/2008 $13.80 2 110 07/16/2008 $6.30 2
Customer 104 appears in the list four times. The value in the count column is always 4 for customer 104.