踏踏实实练习HSQ--day01

简介: 踏踏实实练习HSQ--day01

查询连续登陆3天以上用户(字节面试题)

image.png

create table userinfo2(
    `id` string,
    `date` string
)
row format delimited fields terminated by '\t';
insert overwrite table userinfo2
values ('1','2024-01-01 00:00:00'),
       ('1','2024-01-02 00:00:00'),
       ('1','2024-01-03 00:00:00'),
        ('2','2024-02-01 00:00:00'),
       ('2','2024-02-02 00:00:00'),
       ('3','2024-03-04 00:00:00'),
       ('3','2024-03-05 00:00:00'),
       ('3','2024-03-06 00:00:00'),
        ('3','2024-03-07 00:00:00');

第一步去重

select id,substring(`date`,1,10)
from userinfo2
group by id,substring(`date`,1,10)
去重能用group by,别用distinct,效率太低

image.png

select distinct id
    from
(select id,
       date_add(data,2) as nextday,
       lead(data,2,0) over (partition by id order by data) as nextorder
    from
(select id,substring(`date`,1,10) as data
from userinfo2
group by id,substring(`date`,1,10))t1)t2
where nextday=nextorder

image.png

image.png

select id,date1,count(*) as day_cnt
    from
(select id,data,
       date_add(data,-row_number() over (partition by id order by data))
as date1
    from
(select id,substring(`date`,1,10) as data
from userinfo2
group by id,substring(`date`,1,10))t1)t2
group by id,date1
having count(*)>3

image.png

查询连续登陆最大天数用户(字节面试题)

select max(day_cnt)
    from
(
    select
       id,date1,
      count(*) as day_cnt
       from
(select id,data,date_add(data,-row_number() over (partition by id order by data)) as date1
    from
(select id,substring(`date`,1,10) as data
from userinfo2
group by id,substring(`date`,1,10))t1)t2
group by id,date1
)t3

image.png

coalesce


波峰波谷

image.png

image.png

create table userinfo3(
    `id` string,
    `ds` string,
    `price` string
)
row format delimited fields terminated by '\t';

insert overwrite table userinfo3
values ('1','2024-01-01','10001'),
       ('1','2024-01-03','1001'),
       ('1','2024-01-02','10002'),
       ('1','2024-01-04','999'),
       ('1','2024-01-05','1002'),
       ('1','2024-01-06','1003'),
       ('1','2024-01-07','1004'),
       ('1','2024-01-08','998'),
         ('1','2024-01-09','997'),
        ('1','2024-01-10','996'),
        ('2','2024-01-01','10001'),
       ('2','2024-01-02','10002'),
        ('2','2024-01-03','10003'),
       ('2','2024-01-04','10002'),
        ('2','2024-01-05','1002'),
          ('2','2024-01-06','1003'),
       ('2','2024-01-07','1004'),
       ('2','2024-01-08','998'),
         ('2','2024-01-09','997'),
        ('2','2024-01-10','996');

偏移量函数:lag

语法:lag(col,n,default_val)

返回值:字段类型

说明:往前第n行数据。

偏移量函数:lead

语法:lead(col,n, default_val)

返回值:字段类型

说明:往后第n行数据。


select
id,ds,price,nvl( case when price>lag_price and price >lea_price then '波峰'
 when price<lag_price and price<lea_price then '波谷' end,0 )as price_type
       from
(select id,price,ds
,lag(price,1,null) over (partition by id order by  ds) as lag_price,
       lead(price,1,null) over (partition by id order by ds) as lea_price
from userinfo3)t1

image.png

相关文章
|
SQL 存储 BI
数仓学习---数仓开发之ADS层
数仓学习---数仓开发之ADS层
1077 1
|
SQL Unix
踏踏实实练习HSQ--day04
踏踏实实练习HSQ--day04
161 5
|
存储 分布式计算 算法
RoaringBitmap的原理与应用
RoaringBitmap的原理与应用
669 2
|
网络协议 数据安全/隐私保护 网络架构
软路由R4S+iStoreOS实现公网远程桌面局域网内电脑
软路由R4S+iStoreOS实现公网远程桌面局域网内电脑
858 0
|
存储 BI 关系型数据库
数仓学习---数仓开发之DWS层
数仓学习---数仓开发之DWS层
1389 4
|
存储 JSON 数据处理
数仓学习---数仓开发之DWD层
数仓学习---数仓开发之DWD
1389 6
数仓学习---数仓开发之DWD层
|
存储 数据挖掘 关系型数据库
数仓学习---6、数据仓库概述、 数据仓库建模概述、维度建模理论之事实表、维度建模理论之维度表
数仓学习---6、数据仓库概述、 数据仓库建模概述、维度建模理论之事实表、维度建模理论之维度表