一、Mysql8.0 窗口函数
1、窗口函数与聚合函数的区别
MySQL从8.0开始支持窗口函数,窗口函数也叫分析函数。窗口函数更多的是业务中需要对数据做排序/分组排序,环比计算,百分比计算等需求。与聚合函数明显的不同是,聚合函数通过会将所有记录进行分类聚合;而窗口函数是对所有数据记录按照指定窗口进行计算,不会进行聚合。
在MySQL8.0之前我们其实也可以通过一定的办法实现窗口函数的功能,具体如何实现请见下文。
2、窗口函数基本语法
窗口函数主要由窗口函数、over子句、window子句、partition子句、order子句、frame子句。
示例:
SELECT time,subject,val,
FIRST_VALUE(val) OVER w AS 'first', //窗口函数以及over子句
LAST_VALUE(val) OVER w AS 'last',
NTH_VALUE(val, 2) OVER w AS 'second',
NTH_VALUE(val, 4) OVER w AS 'fourth'
FROM observations
WINDOW w AS ( //window子句
PARTITION BY subject //partition子句
ORDER BY time //order子句
ROWS UNBOUNDED PRECEDING //frame子句
);
+----------+---------+------+-------+------+--------+--------+
| time | subject | val | first | last | second | fourth |
+----------+---------+------+-------+------+--------+--------+
| 07:00:00 | st113 | 10 | 10 | 10 | NULL | NULL |
| 07:15:00 | st113 | 9 | 10 | 9 | 9 | NULL |
| 07:30:00 | st113 | 25 | 10 | 25 | 9 | NULL |
| 07:45:00 | st113 | 20 | 10 | 20 | 9 | 20 |
| 07:00:00 | xh458 | 0 | 0 | 0 | NULL | NULL |
| 07:15:00 | xh458 | 10 | 0 | 10 | 10 | NULL |
| 07:30:00 | xh458 | 5 | 0 | 5 | 10 | NULL |
| 07:45:00 | xh458 | 30 | 0 | 30 | 10 | 30 |
| 08:00:00 | xh458 | 25 | 0 | 25 | 10 | 30 |
+----------+---------+------+-------+------+--------+--------+
- 窗口函数
按照具体业务需求指定具体的窗口函数,常见窗口函数如下:
函数 | 函数意义 |
---|---|
row_number() | 排序函数,并列记录也按照+1进行正常排序 |
rank() | row 2 排序函数,并列记录同序号,并列记录后的记录按照总记录+1进行正常排序 |
dense_rank() | 排序函数,并列记录同序号,并列记录后的记录按照并列记录+1进行正常排序 |
percent_rank() | 分布函数,计算当前记录排序的(rank() - 1) 除以 (分组总记录数 - 1)的百分比 |
cume_dist() | 分布函数,计算当前记录排序rank()的总记录数除以分组总记录数的百分比 |
lag(col,N) | 按照分组排序,显示该记录的前N个的col值 |
lead(col,N) | 按照分组排序,显示该记录的后N个的col值 |
first_value(col) | 按照分组排序,显示排序第一的col值 |
last_value(col) | 按照分组排序,显示截止到该记录排序最后的col值 |
nth_value(col,N) | 按照分组排序下,截止到当前记录第N排序的col值 |
nfile(N) | 按照分组排序,将所有记录分为N份 |
- over子句
关键字,over() w as col_name中,w为别名,指定了具体的分组排序规则,确定窗口函数的计算范围,也可以将partion子句、order子句写在over()中直接使用。
over(partition by aa order by bb) w as col_name
<=>
over() w as col_name
window w as (partition by aa order by bb)
- window子句
指定了具体的分组排序规则,确定窗口函数的计算范围,若SQL中需要多个窗口时可使用不同的别名来区分
- partition子句
window w as (partition by aa) //按照aa进行分组,窗口函数每个分组单独计算
指定分组规则,若不需要可不写
- order子句
指定排序规则,可以单独使用,也可以配合partition使用
window w as (partition by aa order by bb) //按照aa进行分组,每组记录按照bb进行顺序排序
window w as (order by bb desc) //符合where条件的全部记录按照bb倒叙排序
- frame子句
指定窗口函数的计算范围
语法:ROWS + 边界关键字
CURRENT ROW 以当前记录为左/右侧边界
UNBOUNDED PRECEDING 以分组中的第一条记录为左侧边界
UNBOUNDED FOLLOWING 以分组中的最后一条记录为右侧边界
N PRECEDING 以当前记录的前N条记录为左侧边界
N FOLLOWING 以当前记录的后N条记录为右侧边界
示例:
window w as (rows between N preceding and M following) // 窗口范围是当前记录的前N条记录到后M条记录
window w as (rows unbounded preceding) //窗口范围是当前记录到分组中的最后一条记录。
window w as (rows between unbounded preceding and unbounded following) //默认该模式,窗口范围是当前分组中所有记录范围进行计算
二、MySQL8.0之前实现窗口函数的办法
在MySQL8.0之前时没有窗口函数的,MySQL也可以实现部分窗口函数的功能,以下是部分示例:
1、 row_number()实现
MySQL8.0
无分组排序:
root@mysql 22:27: [test]> select row_number() over w as row_num,class_num,user_id,score from score where subject_name='Math' window w as ( order by score desc) ;
+---------+-----------+---------+--------+
| row_num | class_num | user_id | score |
+---------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 002 | 5 | 98.00 |
| 3 | 002 | 8 | 98.00 |
| 4 | 001 | 3 | 92.00 |
| 5 | 001 | 2 | 89.00 |
| 6 | 002 | 9 | 89.00 |
| 7 | 001 | 10 | 89.00 |
| 8 | 002 | 4 | 87.00 |
| 9 | 002 | 6 | 77.00 |
| 10 | 001 | 7 | 73.00 |
+---------+-----------+---------+--------+
10 rows in set (0.00 sec)
分组排序:
root@mysql 22:50: [test]> select row_number() over w as row_num,class_num,user_id,score from score where subject_name='Math' window w as ( partition by class_num order by score desc) ;
+---------+-----------+---------+--------+
| row_num | class_num | user_id | score |
+---------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 001 | 3 | 92.00 |
| 3 | 001 | 2 | 89.00 |
| 4 | 001 | 10 | 89.00 |
| 5 | 001 | 7 | 73.00 |
| 1 | 002 | 5 | 98.00 |
| 2 | 002 | 8 | 98.00 |
| 3 | 002 | 9 | 89.00 |
| 4 | 002 | 4 | 87.00 |
| 5 | 002 | 6 | 77.00 |
+---------+-----------+---------+--------+
10 rows in set (0.00 sec)
MySQL之前
无分组排序:
root@mysql 22:27: [test]> select @r:=@r+1 as `row_number`,class_num,user_id,score from score,(select @r:=0) temp order by score desc;
+------------+-----------+---------+--------+
| row_number | class_num | user_id | score |
+------------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 002 | 5 | 98.00 |
| 3 | 002 | 8 | 98.00 |
| 4 | 001 | 3 | 92.00 |
| 5 | 001 | 2 | 89.00 |
| 6 | 002 | 9 | 89.00 |
| 7 | 001 | 10 | 89.00 |
| 8 | 002 | 4 | 87.00 |
| 9 | 002 | 6 | 77.00 |
| 10 | 001 | 7 | 73.00 |
+------------+-----------+---------+--------+
10 rows in set, 2 warnings (0.00 sec)
分组排序:
root@mysql 22:48: [test]> select `row_number`,class_num,user_id,score from (select if(@class=class_num,@r:=@r+1,@r:=1) as `row_number`,@class:=class_num,class_num,user_id,score from score,(select @r:=0,@class=NULL) temp order by class_num,score desc) tmp2;
+------------+-----------+---------+--------+
| row_number | class_num | user_id | score |
+------------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 001 | 3 | 92.00 |
| 3 | 001 | 2 | 89.00 |
| 4 | 001 | 10 | 89.00 |
| 5 | 001 | 7 | 73.00 |
| 1 | 002 | 5 | 98.00 |
| 2 | 002 | 8 | 98.00 |
| 3 | 002 | 9 | 89.00 |
| 4 | 002 | 4 | 87.00 |
| 5 | 002 | 6 | 77.00 |
+------------+-----------+---------+--------+
10 rows in set, 4 warnings (0.00 sec)
2、dense_rank()实现
MySQL8.0
无分组排序:
root@mysql 22:27: [test]> select dense_rank() over w as `dense_rank`,class_num,user_id,score from score where subject_name='Math' window w as ( order by score desc) ;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score |
+------------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 002 | 5 | 98.00 |
| 2 | 002 | 8 | 98.00 |
| 3 | 001 | 3 | 92.00 |
| 4 | 001 | 2 | 89.00 |
| 4 | 002 | 9 | 89.00 |
| 4 | 001 | 10 | 89.00 |
| 5 | 002 | 4 | 87.00 |
| 6 | 002 | 6 | 77.00 |
| 7 | 001 | 7 | 73.00 |
+------------+-----------+---------+--------+
10 rows in set (0.00 sec)
分组排序:
root@mysql 23:11: [test]> select dense_rank() over w as `dense_rank`,class_num,user_id,score from score where subject_name='Math' window w as ( partition by class_num order by score desc) ;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score |
+------------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 001 | 3 | 92.00 |
| 3 | 001 | 2 | 89.00 |
| 3 | 001 | 10 | 89.00 |
| 4 | 001 | 7 | 73.00 |
| 1 | 002 | 5 | 98.00 |
| 1 | 002 | 8 | 98.00 |
| 2 | 002 | 9 | 89.00 |
| 3 | 002 | 4 | 87.00 |
| 4 | 002 | 6 | 77.00 |
+------------+-----------+---------+--------+
10 rows in set (0.00 sec)
Mysql8.0之前
无分组排序:
root@mysql 23:09: [test]> select case when @s = score then @r WHEN @s := score then @r := @r + 1 end as `dense_rank`,class_num,user_id,score from score,(select @r := 0, @s = NULL) temp where subject_name = 'Math' order by score desc ;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score |
+------------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 002 | 5 | 98.00 |
| 2 | 002 | 8 | 98.00 |
| 3 | 001 | 3 | 92.00 |
| 4 | 001 | 2 | 89.00 |
| 4 | 002 | 9 | 89.00 |
| 4 | 001 | 10 | 89.00 |
| 5 | 002 | 4 | 87.00 |
| 6 | 002 | 6 | 77.00 |
| 7 | 001 | 7 | 73.00 |
+------------+-----------+---------+--------+
10 rows in set, 3 warnings (0.00 sec)
分组排序:
root@mysql 23:10: [test]> select `dense_rank`,class_num,user_id,score from (select if(@class=class_num,case when @s = score then @r when @s := score then @r := @r + 1 end,@r:=1) as `dense_rank`,@class:=class_num,@s:=score,class_num,user_id,score FROM score, ( SELECT @r := 0, @s = NULL,@class=NULL ) temp where subject_name = 'Math' order by class_num,score desc) temp2;
+------------+-----------+---------+--------+
| dense_rank | class_num | user_id | score |
+------------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 001 | 3 | 92.00 |
| 3 | 001 | 2 | 89.00 |
| 3 | 001 | 10 | 89.00 |
| 4 | 001 | 7 | 73.00 |
| 1 | 002 | 5 | 98.00 |
| 1 | 002 | 8 | 98.00 |
| 2 | 002 | 9 | 89.00 |
| 3 | 002 | 4 | 87.00 |
| 4 | 002 | 6 | 77.00 |
+------------+-----------+---------+--------+
10 rows in set, 6 warnings (0.00 sec)
3、rank()实现
MySQL8.0
root@mysql 22:33: [test]> select rank() over w as `rank`,class_num,user_id,score from score where subject_name='Math' window w as ( order by score desc) ;
+------+-----------+---------+--------+
| rank | class_num | user_id | score |
+------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 002 | 5 | 98.00 |
| 2 | 002 | 8 | 98.00 |
| 4 | 001 | 3 | 92.00 |
| 5 | 001 | 2 | 89.00 |
| 5 | 002 | 9 | 89.00 |
| 5 | 001 | 10 | 89.00 |
| 8 | 002 | 4 | 87.00 |
| 9 | 002 | 6 | 77.00 |
| 10 | 001 | 7 | 73.00 |
+------+-----------+---------+--------+
10 rows in set (0.00 sec)
Mysql8.0之前
root@mysql 23:12: [test]> select `rank`,class_num,user_id,score from (SELECT @r:=if(@s = score,@r,@c) AS `rank`,@s:=score,@c:=@c+1,class_num,user_id,score from score, ( select @r := 0, @s = NULL,@c:=1 ) r where subject_name = 'Math' order by score desc) temp;
+------+-----------+---------+--------+
| rank | class_num | user_id | score |
+------+-----------+---------+--------+
| 1 | 001 | 1 | 100.00 |
| 2 | 002 | 5 | 98.00 |
| 2 | 002 | 8 | 98.00 |
| 4 | 001 | 3 | 92.00 |
| 5 | 001 | 2 | 89.00 |
| 5 | 002 | 9 | 89.00 |
| 5 | 001 | 10 | 89.00 |
| 8 | 002 | 4 | 87.00 |
| 9 | 002 | 6 | 77.00 |
| 10 | 001 | 7 | 73.00 |
+------+-----------+---------+--------+
10 rows in set, 5 warnings (0.00 sec)