Over 窗口函数在Select 子句中,对查询的结果集进行“滑动-聚合”运算;如果使用count,那么基于滑动窗口的聚合语义同 base+1 累加;如果使用sum,那么基于滑动窗口的聚合语义等同于数据累加。Over()子句的运算顺序在Select 子句之后,在Order By子句之前。
滑动窗口计算原理:窗口的大小是由Over 的Partition By子句界定,窗口滑动的顺序是由Over的Order by子句指定。在计算聚合值时,使用<=(Order by Asc)或 >=(Order by Desc)计算滑动窗口的聚合值,某一个窗口的计算逻辑类似于
select t.*, AggregationFuncation() Over(partition by t.ColumnName_Partition order by t.ColumnName_Order asc) from dbo.TableName t select AggregationFuncation from dbo.TableName t where t.ColumnName_Partition=Over_PartitionColumnValue and t.ColumnName_Order<=Over_OrderColumnValue
一,分区
1,创建示例数据
--create table create table dbo.dt_test ( ID int, Code int ) go --insert data insert into dbo.dt_test(ID,Code) values(3,1),(3,2),(1,1),(1,2),(2,3),(1,2) go
2,测试Over 创建函数的 滑动-聚合
--test over select ID,Code, count(0) over(partition by Code order by ID) as Count_Over, sum(ID) over(partition by Code order by ID) as Sum_Over from dbo.dt_test
查询的结果按照 Code 排序,在Over 函数中,按照Code分区,按照ID排序。
分析查询结果:
2.1,当Code=1时,在这个分区(窗口)中,有两行,ID分别是1,3
当Code=1,ID=1时,是分区的第一行,Count_Over=1,Sum_Over=1,这是滑动窗口的第一行,聚合值是的计算逻辑是
select count(0) as Count_Over, sum(ID) as Sum_Over from dbo.dt_test where Code=1 and ID<=1
当Code=1,ID=3时,是分区的第二行,Count_Over=2,Sum_Over=4,这是滑动窗口的第二行,聚合值是的计算逻辑是
select count(0) as Count_Over,
sum(ID) as Sum_Over from dbo.dt_test where Code=1 and ID<=3
2.2 当Code=2时,在这个分区中,有三种,ID分别是1,1,3
当Code=2,ID=1时,是分区的第一行,Count_Over=2,Sum_Over=2,这是滑动窗口的第一行,聚合值是的计算逻辑是
select count(0) as Count_Over,
sum(ID) as Sum_Over from dbo.dt_test where Code=2 and ID<=1
当Code=2,ID=1时,是分区的第二行,Count_Over=2,Sum_Over=2,这是滑动窗口的第二行,聚合值是的计算逻辑是
select count(0) as Count_Over,
sum(ID) as Sum_Over from dbo.dt_test where Code=2 and ID<=1
当Code=2,ID=3时,是分区的第三行,Count_Over=3,Sum_Over=5,这是滑动窗口的第三行,聚合值是的计算逻辑是
select count(0) as Count_Over,
sum(ID) as Sum_Over from dbo.dt_test where Code=2 and ID<=3
3,Over 函数用于分组聚合函数
如果 Over 函数用于分区聚合函数,partition by子句是必需的,如果省略 Order by 子句,那么将对整个分区(窗口)进行聚合计算
select ID,Code, count(0) over(partition by Code) as Count_Over, sum(ID) over(partition by Code) as Sum_Over from dbo.dt_test
查询的结果等价于(传统写法)
select ID,Code, (select count(0) from dbo.dt_test as ij where ij.Code=oj.Code) as Count_Over, (select sum(id) from dbo.dt_test as ij where ij.Code=oj.Code) as Sum_Over from dbo.dt_test oj
相比传统写法,Over函数写法查询性能更好。
4,Over 函数用于排名函数
Over 函数用于排名函数时,Order by子句是必需的,如果省略partition by 子句,那么整个查询结果集是一个窗口,排名函数对整个窗口排名。
select ID,Code, row_number() over(order by Code) as RowID from dbo.dt_test
如果使用partition by子句,那么排名函数在每一个分区中进行排名,每一行的排名是基于当前的分区。
select ID,Code, row_number() over(partition by code order by ID) as RowID from dbo.dt_test
二,限制分区的数据行
在Over子句中,使用Rows 或Range 进一步限制分区的数据行,在使用时,必须注意:
- 必需条件:Rows 和 Range必须跟在Order by 子句之后,对排序的结果进行限制;
- Rows:使用固定的行数来限制分区中的数据行数量;The ROWS clause limits the rows within a partition by specifying a fixed number of rows preceding or following the current row.
- Range:使用Value的范围来限制分区中的数据行数量,排序列的重复值,被认为是一个值;The RANGE clause logically limits the rows within a partition by specifying a range of values with respect to the value in the current row.
- 在分区中,如果排序行不存在重复值,Rows和Range返回的结果是相同的;如果排序行存储在重复值,Rows和Range返回的结果可能不同;
- Range子句只能从分区的开始或结尾到当前行开始计算,不能使用 <UINT_Number> PRECEDING 和<UINT_Number> FOLLOWING;
使用在Rows 和 Range子句中的特殊关键字:
- UNBOUNDED PRECEDING:指定分区的第一行
- UNBOUNDED FOLLOWING:指定分区的最后一行
- CURRENT ROW:指定分区的当前数据行
- <UINT_Number> PRECEDING:在分区中,指定当前行之前的数据行数量,UINT_Number是>=0的整数
- <UINT_Number> FOLLOWING:在分区中,指定当前行之后的数据行数量,UINT_Number是>=0的整数
1,对分区中的连续两行计算加和
计算逻辑是:在当前分区中,对当前行和其之后的1行数据计算加和;
select ID,Code,sum(code) over(partition by ID order by Code rows between current row and 1 following) as SumCode from dbo.dt_test
2,对分区中的数据行,按照特定顺序,计算从第一行到当前行的累加值
计算逻辑是:从第一行到当前行,计算累加值
select ID,Code, sum(code) over(partition by ID order by Code rows unbounded preceding) as SumCode_Rows, sum(code) over(partition by ID order by Code range unbounded preceding) as SumCode_Range from dbo.dt_test
参考文档:
SQL Server Window Function 窗体函数读书笔记二 - A Detailed Look at Window Functions