DB2 OLAP functions

本文涉及的产品
云原生数据仓库AnalyticDB MySQL版,基础版 8ACU 100GB 1个月
简介:

DB2 OLAP functions(DB2官方文档)

近来在工作中需要用DB2 OLAP函数,在网上找了也没什么资料,有几个也很零散不完整。于
是在IBM DB2网站上找到了这篇官方文档,当时琢磨琢磨,看了下面的例子,试了试,算是搞
懂这几个函数怎么用了。感觉翻译比较麻烦,并且我的E文比较烂,搞不好还会误导读者,因
此,这篇文档我就不翻译了,暂时留着以后查看。在我研究了这篇文档之后,写了篇“DB2
OLAP 函数测试
”,并给出了我测试的过程。希望能节省大家研究的时间,也欢迎和我交流

OLAP-specification:
|--+-|  ranking-specification  |-----+------| +-|  numbering-specification  |---+ '-|  aggregation-specification  |-'  ranking-specification:  |--+-RANK ()-------+--OVER--------------------------------------> '-DENSE_RANK ()-' >--(--+-----------------------------+---------------------------> '-| window-partition-clause |-' >--| window-order-clause |--)-----------------------------------|  numbering-specification:  |--ROW_NUMBER ()--OVER--(--+-----------------------------+------> '-| window-partition-clause |-' >--+-------------------------+--)-------------------------------| '-| window-order-clause |-'  aggregation-specification:  |-- column-function --OVER--(--+-----------------------------+----> '-| window-partition-clause |-' >--+--------------------------------------------------------------------------------+--)--| | .-RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW-. | '-| window-order-clause |--+---------------------------------------------------+-' '-| window-aggregation-group-clause |---------------' window-partition-clause: .-,-----------------------. V | |--PARTITION BY---- partitioning-expression -+--------------------| window-order-clause: .-,--------------------------------------------. V | |--ORDER BY----+- sort-key-expression --+-| asc option |--+-+-+---| | '-| desc option |-' | '-ORDER OF-- table-designator ---------------' asc option: .-ASC-. .-NULLS LAST--. |--+-----+--+-------------+-------------------------------------| '-NULLS FIRST-' desc option: .-NULLS FIRST-. |--DESC--+-------------+----------------------------------------| '-NULLS LAST--' window-aggregation-group-clause: |--+-ROWS--+--+-| group-start |---+-----------------------------| '-RANGE-' +-| group-between |-+ '-| group-end |-----' group-start: |--+-UNBOUNDED PRECEDING----------+-----------------------------| +- unsigned-constant --PRECEDING-+ '-CURRENT ROW------------------' group-between: |--BETWEEN--| group-bound1 |--AND--| group-bound2 |-------------| group-bound1: |--+-UNBOUNDED PRECEDING----------+-----------------------------| +- unsigned-constant --PRECEDING-+ +- unsigned-constant --FOLLOWING-+ '-CURRENT ROW------------------' group-bound2: |--+-UNBOUNDED FOLLOWING----------+-----------------------------| +- unsigned-constant --PRECEDING-+ +- unsigned-constant --FOLLOWING-+ '-CURRENT ROW------------------' group-end: |--+-UNBOUNDED FOLLOWING----------+-----------------------------| '- unsigned-constant --FOLLOWING-'
On-Line Analytical Processing (OLAP) functions provide the ability to return ranking, row numbering and existing column function information as a scalar value in a query result. An OLAP function can be included in expressions in a select-list or the ORDER BY clause of a select-statement (SQLSTATE 42903).  An OLAP function cannot be used within an argument to an XMLQUERY or XMLEXISTS expression (SQLSTATE 42903). An OLAP function cannot be used as an argument of a column function (SQLSTATE 42607). The query result to which the OLAP function is applied is the result table of the innermost subselect that includes the OLAP function.
When specifying an OLAP function, a window is specified that defines the rows over which the function is applied, and in what order. When used with a column function, the applicable rows can be further refined, relative to the current row, as either a range or a number of rows preceding and following the current row. For example, within a partition by month, an average can be calculated over the previous three month period.
The ranking function computes the ordinal rank of a row within the window. Rows that are not distinct with respect to the ordering within their window are assigned the same rank. The results of ranking may be defined with or without gaps in the numbers resulting from duplicate values.
If RANK is specified, the rank of a row is defined as 1 plus the number of rows that strictly precede the row. Thus, if two or more rows are not distinct with respect to the ordering, then there will be one or more gaps in the sequential rank numbering.
If DENSE_RANK (or DENSERANK) is specified, the rank of a row is defined as 1 plus the number of preceding rows that are distinct with respect to the ordering. Therefore, there will be no gaps in the sequential rank numbering.
The ROW_NUMBER (or ROWNUMBER) function computes the sequential row number of the row within the window defined by the ordering, starting with 1 for the first row. If the ORDER BY clause is not specified in the window, the row numbers are assigned to the rows in arbitrary order, as returned by the subselect (not according to any ORDER BY clause in the select-statement).
If the FETCH FIRST n  ROWS ONLY clause is used along with the ROW_NUMBER function, the row numbers might not be displayed in order. The FETCH FIRST clause is applied after the result set (including any ROW_NUMBER assignments) is generated; therefore, if the row number order is not the same as the order of the result set, some assigned numbers might be missing from the sequence.
The data type of the result of RANK, DENSE_RANK or ROW_NUMBER is BIGINT. The result cannot be null.
PARTITION BY (partitioning-expression,...)
Defines the partition within which the function is applied. A partitioning-expression is an expression that is used in defining the partitioning of the result set. Each column-name that is referenced in a partitioning-expression must unambiguously reference a result set column of the OLAP function subselect statement (SQLSTATE 42702 or 42703). A partitioning-expression cannot include a scalar fullselect  or an XMLQUERY or XMLEXISTS expression (SQLSTATE 42822), or any function  or query that is not deterministic or that has an external action (SQLSTATE 42845).
ORDER BY (sort-key-expression,...)
Defines the ordering of rows within a partition that determines the value of the OLAP function or the meaning of the ROW values in the window-aggregation-group-clause (it does not define the ordering of the query result set).
sort-key-expression
An expression used in defining the ordering of the rows within a window partition. Each column name referenced in a sort-key-expression must unambiguously reference a column of the result set of the subselect, including the OLAP function (SQLSTATE 42702 or 42703). A sort-key-expression cannot include a scalar fullselect  or an XMLQUERY or XMLEXISTS expression (SQLSTATE 42822), or any function  or query that is not deterministic or that has an external action (SQLSTATE 42845). This clause is required for the RANK and DENSE_RANK functions (SQLSTATE 42601).
ASC
Uses the values of the sort-key-expression in ascending order.
DESC
Uses the values of the sort-key-expression in descending order.
NULLS FIRST
The window ordering considers null values  before all non-null values in the sort order.
NULLS LAST
The window ordering considers null values  after all non-null values in the sort order.
ORDER OF table-designator
Specifies that the same ordering used in table-designator should be applied to the result table of the subselect. There must be a table reference matching table-designator in the FROM clause of the subselect that specifies this clause (SQLSTATE 42703). The subselect (or fullselect) corresponding to the specified table-designator must include an ORDER BY clause that is dependant on the data (SQLSTATE 428FI). The ordering that is applied is the same as if the columns of the ORDER BY clause in the nested subselect (or fullselect) were included in the outer subselect (or fullselect), and these columns were specified in place of the ORDER OF clause.
window-aggregation-group-clause
The aggregation group of a row R is a set of rows defined in relation to R (in the ordering of the rows of R's partition). This clause specifies the aggregation group. If this clause is not specified, the default is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, providing a cumulative aggregation result.
If window-order-clause is specified, the default behavior is different when window-aggregation-group-clause is not specified. The window aggregation group consists of all rows of the partition of R that precede R or that are peers of R in the window ordering of the window partition defined by the window-order-clause.
ROWS
Indicates the aggregation group is defined by counting rows.
RANGE
Indicates the aggregation group is defined by an offset from a sort key.
group-start
Specifies the starting point for the aggregation group. The aggregation group end is the current row. Specification of the group-start clause is equivalent to a group-between clause of the form "BETWEEN group-start AND CURRENT ROW".
group-between
Specifies the aggregation group start and end based on either ROWS or RANGE.
group-end
Specifies the ending point for the aggregation group. The aggregation group start is the current row. Specification of the group-end clause is equivalent to a group-between clause of the form "BETWEEN CURRENT ROW AND group-end".
UNBOUNDED PRECEDING
Includes the entire partition preceding the current row. This can be specified with either ROWS or RANGE. Also, this can be specified with multiple sort-key-expressions in the window-order-clause.
UNBOUNDED FOLLOWING
Includes the entire partition following the current row. This can be specified with either ROWS or RANGE. Also, this can be specified with multiple sort-key-expressions in the window-order-clause.
CURRENT ROW
Specifies the start or end of the aggregation group based on the current row. If ROWS is specified, the current row is the aggregation group boundary. If RANGE is specified, the aggregation group boundary includes the set of rows with the same values for the sort-key-expressions as the current row. This clause cannot be specified in group-bound2 if group-bound1 specifies value FOLLOWING.
value PRECEDING
Specifies either the range or number of rows preceding the current row. If ROWS is specified, then value is a positive integer indicating a number of rows. If RANGE is specified, then the data type of value must be comparable to the type of the sort-key-expression of the window-order-clause. There can only be one sort-key-expression, and the data type of the sort-key-expression must allow subtraction. This clause cannot be specified in group-bound2 if group-bound1is CURRENT ROW or value FOLLOWING.
value FOLLOWING
Specifies either the range or number of rows following the current row. If ROWS is specified, then value is a positive integer indicating a number of rows. If RANGE is specified, then the data type of value must be comparable to the type of the sort-key-expression of the window-order-clause. There can only be one sort-key-expression, and the data type of the sort-key-expression must allow addition.
Examples
  • Display the ranking of employees, in order by surname, according to their total salary (based on salary plus bonus) that have a total salary more than $30,000.
       SELECT EMPNO, LASTNAME, FIRSTNME, SALARY+BONUS AS TOTAL_SALARY,
           RANK() OVER (ORDER BY SALARY+BONUS DESC) AS RANK_SALARY
         FROM EMPLOYEE WHERE SALARY+BONUS > 30000
         ORDER BY LASTNAME
    Note that if the result is to be ordered by the ranking, then replace ORDER BY LASTNAME with:
       ORDER BY RANK_SALARY
    or
       ORDER BY RANK() OVER (ORDER BY SALARY+BONUS DESC)
  • Rank the departments according to their average total salary.
       SELECT WORKDEPT, AVG(SALARY+BONUS) AS AVG_TOTAL_SALARY,
           RANK() OVER (ORDER BY AVG(SALARY+BONUS) DESC) AS RANK_AVG_SAL
         FROM EMPLOYEE 
         GROUP BY WORKDEPT
         ORDER BY RANK_AVG_SAL
  • Rank the employees within a department according to their education level. Having multiple employees with the same rank in the department should not increase the next ranking value.
       SELECT WORKDEPT, EMPNO, LASTNAME, FIRSTNME, EDLEVEL,
           DENSE_RANK() OVER
             (PARTITION BY WORKDEPT ORDER BY EDLEVEL DESC) AS RANK_EDLEVEL
         FROM EMPLOYEE
         ORDER BY WORKDEPT, LASTNAME
  • Provide row numbers in the result of a query.
       SELECT ROW_NUMBER() OVER (ORDER BY WORKDEPT, LASTNAME) AS NUMBER,
           LASTNAME, SALARY
         FROM EMPLOYEE
         ORDER BY WORKDEPT, LASTNAME
  • List the top five wage earners.
       SELECT EMPNO, LASTNAME, FIRSTNME, TOTAL_SALARY, RANK_SALARY 
         FROM (SELECT EMPNO, LASTNAME, FIRSTNME, SALARY+BONUS AS TOTAL_SALARY,
           RANK() OVER (ORDER BY SALARY+BONUS DESC) AS RANK_SALARY
           FROM EMPLOYEE) AS RANKED_EMPLOYEE 
         WHERE RANK_SALARY < 6 
         ORDER BY RANK_SALARY
    Note that a nested table expression was used to first compute the result, including the rankings, before the rank could be used in the WHERE clause. A common table expression could also have been used.

本文转自 leizhimin 51CTO博客,原文链接:http://blog.51cto.com/lavasoft/37194,如需转载请自行联系原作者
相关实践学习
AnalyticDB MySQL海量数据秒级分析体验
快速上手AnalyticDB MySQL,玩转SQL开发等功能!本教程介绍如何在AnalyticDB MySQL中,一键加载内置数据集,并基于自动生成的查询脚本,运行复杂查询语句,秒级生成查询结果。
阿里云云原生数据仓库AnalyticDB MySQL版 使用教程
云原生数据仓库AnalyticDB MySQL版是一种支持高并发低延时查询的新一代云原生数据仓库,高度兼容MySQL协议以及SQL:92、SQL:99、SQL:2003标准,可以对海量数据进行即时的多维分析透视和业务探索,快速构建企业云上数据仓库。 了解产品 https://www.aliyun.com/product/ApsaraDB/ads
相关文章
|
6月前
|
人工智能 关系型数据库 数据库
PostgreSQL 常见问题解决方案 - ERROR: database is being accessed by other users
PostgreSQL 常见问题解决方案 - ERROR: database is being accessed by other users
|
NoSQL 数据库 索引
|
SQL 存储 算法
CockroachDB: The Resilient Geo-Distributed SQL Database
一直以来对CockroachDB(CRDB for short)的设计和实现很感兴趣,最近抽时间研究了下,发现其在技术上还是领先了同类NewSQL产品不少的,个人感觉应该是目前最为先进的类Spanner分布式数据库系统,因此这篇文章会尽可能详细的讨论下其系统的多个方面,重点是事务和一致性相关。 paper中针对的是v.19.2.2版本,不过官方文档中是基于最新的v.21.1.7,两者在描述上有一些冲突的地方,而官方文档中会更为详尽些,因此本文的很多介绍将尽量将paper与官方reference结合,并以reference为准。
543 0
CockroachDB: The Resilient Geo-Distributed SQL Database
|
关系型数据库 PostgreSQL
|
关系型数据库 MySQL 数据库