OVER clause for OLAP window expressions
The OVER clause defines the result set on which an OLAP window expression is performed.
Syntax
Element | Description | Restrictions | Syntax |
---|---|---|---|
offset | Unsigned integer that represents the offset from the position of the current row | Cannot be negative. If zero, specifies the current row | Literal integer. |
partition_key | Column name, alias, or constant expression by which to partition rows | Must be in the select list of the Projection clause | Column Expressions |
sorting_key | Column name, alias, or constant expression by which to sort rows | Same restrictions as for partition_key. For RANGE window frames, only a single sorting key is allowed, and the data type must be numeric, DATE, or DATETIME. | Column Expressions |
If the OVER clause is empty, you must still include the empty parentheses.
Window PARTITION clause
An OLAP window partition is a subset of the rows that are returned by a query. Each partition is defined by one or more column expressions in the PARTITION BY specification of the OVER clause that defines the window. The database server applies the specified OLAP window function to all of the rows in each window partition. If no partitions are defined in the OVER clause, the window function is applied to every row in the result set of the query.
Window ORDER clause
The database server sorts the rows in each window partition according to the sort key (or multiple sort keys) in the window ORDER clause. If you specify no ascending (ASC) or descending (DESC) order, ASC is the default. If no ORDER clause is specified, the order of the qualifying rows is the order in which the rows were retrieved.
Window Frame clause
The window Frame clause can return subsets, called aggregation groups, of the rows in each window partition. A window frame is defined by a specific number of rows or by a range of values.
- Row-based window frame
- The
ROWS keyword creates a row-based window frame that consists of a specific
number of rows that precede or follow the current row, or both. The
offset represents the number of rows to return. The following example
returns seven rows that include the six rows that precede the current
row:
AVG(price) OVER (ORDER BY year, day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
- Value-based window frame
- The RANGE keyword creates a value-based frame clause that consists
of the current row plus the rows that meet the criteria that is set
by the sorting key in the ORDER clause and fit into the specified
offset. The offset represents the number of units of the data type
of the sorting key. The sorting key must be a numeric, DATE, or DATETIME
data type. For example, if the sorting key is a DATE data type, the
offset represents a specific number of days. The following example
returns the count of the number of rows that have a ship date within
2 days of the current row plus the current row:
COUNT(*) OVER (ORDER BY ship_date RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING)
In both the row-based and value-based cases, the OLAP function is calculated on the contents of this window frame, rather than the fixed contents of the whole partition. The window frame does not need to contain the current row. For example, the following specification defines a window frame that contains only the row before the current row:
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Example: SUM function without a window frame
The following query returns the sales by quarter for one year and the cumulative sum of the sales by quarter.
SELECT sales, SUM(sales) OVER (ORDER BY quarter) FROM sales WHERE year = 2013 sales (sum) 120 120 135 255 127 382 153 535
The sum of the sales for the fourth quarter is equal to the sales in all four quarters.
Because the query does not include a window frame clause, the SUM function operates on the entire result set, as specified by the FROM clause.
Example: Row-based window frame
The following query returns players that are partitioned by teams and ordered by the number of points. Within each partition, the number of points for the player and the previous player are averaged:
SELECT team, player, points, AVG(points) OVER(PARTITION BY team ORDER BY points ROWS 1 PRECEDING AND CURRENT ROW) AS olap_avg FROM points; TEAM PLAYER POINTS OLAP_AVG A Singh 7 7.00000000000 A Smith 14 10.50000000000 B Osaka 8 8.00000000000 B Ricci 12 10.00000000000 B Baxter 18 15.00000000000 C Chun 13 13.00000000000 D Kwan 9 9.00000000000 D Tran 16 12.50000000000
Example: Range-based window frame
The following query returns players that are partitioned by teams and ordered by age. Within each partition, the number of points for each player and any player who is up to 9 years older is averaged:
SELECT player, age, team, points, AVG(points) OVER(PARTITION BY team ORDER BY age RANGE BETWEEN CURRENT ROW AND 9 FOLLOWING) AS olap_avg FROM points_age; PLAYER AGE TEAM POINTS OLAP_AVG Singh 25 A 7 10.50000000000 Smith 26 A 14 14.00000000000 Baxter 27 B 18 13.00000000000 Osaka 35 B 8 10.00000000000 Ricci 40 B 12 12.00000000000 Chun 21 C 13 13.00000000000 Kwan 22 D 9 12.50000000000 Tran 31 D 16 16.00000000000
In partition A, the average for Singh includes the points for Smith, because Smith is one year older than Singh. The average for Smith does not include the points from Singh, because Singh is younger than Smith.
In partition B, the average for Baxter includes the points for Osaka, who is 8 years older than Baxter, but not for Ricci, who is 13 years older than Baxter.
In partition D, the average for Kwan includes the points for Tran, because Tran is 9 years older than Kwan.
Example: Window frame without the current row
The following query calculates the average number of points for the preceding two rows in the partition:
SELECT player, age, team, points, AVG(points) OVER(PARTITION BY team ORDER BY age ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) AS olap_avg FROM points_age; PLAYER AGE TEAM POINTS OLAP_AVG Singh 25 A 7 NULL Smith 26 A 14 7.00000000000 Baxter 27 B 18 NULL Osaka 35 B 8 18.00000000000 Ricci 40 B 12 13.00000000000 Chun 21 C 13 NULL Kwan 22 D 9 NULL Tran 31 D 16 9.00000000000
In partition B, the average for Ricci is based on the points totals for Baxter and Osaka: (18 + 8 = 26)/2 = 13. When the current row has no preceding rows to use for the calculation, the result is NULL.