【重新发现PostgreSQL之美】- 23 彭祖的长寿秘诀

本文涉及的产品
云原生数据库 PolarDB PostgreSQL 版,标准版 2核4GB 50GB
云原生数据库 PolarDB MySQL 版,通用型 2核4GB 50GB
简介: 大家好,这里是重新发现PostgreSQL之美 - 23 彭祖的长寿秘诀

背景


场景:

  • 论坛、短视频、社交、电商等推荐业务.
  • 在推荐的过程中要过滤已读列表.

挑战:

  • 已读列表巨大, 普通数据库没有多值列类型, 每个已读ID需要存储1条.
  • 使用not in过滤, 性能极差.

PG 解决方案:

  • 1、多值列, 一个用户已读只需要存储1
  • 2、datasketch 近似类型, 几KB空间可以存储上亿唯一值.
  • 3、压缩存储多值列 roaringbitmap类型, 效率和空间的平衡.
  • 4、使用partial index, 拆散过滤列表, 降低每次请求的无效数据.

例子


本文不谈推荐算法, 有兴趣的读者可以参考末尾文档或我的github, 为了观察方便, 本文尽量简化过程, 就用一个排序维度进行过滤.

传统方

1、已读列表, 存储已读视频ID


         
uid int,  -- UID
vid int   -- 已读VID
);

         
create index idx_u_readlist on u_readlist (uid);

2、视频ID, 存储推荐分数


         
vid int primary key,  -- 视频ID
score float4,   -- 推荐分数
ts timestamp
);

         
create index idx_vscore on vscore (score desc);

3、插入1000万条视频


         

4、插入1000个用户, 每个用户已读5万条视频.


         
insert into u_readlist select generate_series(1,1000),vid from a;

5、传统数据库没有多值列数据类型, 需要5000万条记录存储已读列表.


         

6、为1号用户推荐视频并过滤已读, 取出100vid.


         
where vid not in (select vid from u_readlist where uid=1)
order by score desc
limit 100;

         

         
QUERY PLAN
-------------------------------------------------------------------------------------------------
Limit  (cost=48446.32..48451.80 rows=100 width=8) (actual time=103.616..103.704 rows=100 loops=1)
->  Index Scan using idx_vscore on vscore  (cost=48446.32..322781.27 rows=4999930 width=8) (actual time=103.615..103.693 rows=100 loops=1)
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 50000
SubPlan 1
->  Bitmap Heap Scan on u_readlist  (cost=434.17..48321.27 rows=49846 width=4) (actual time=13.819..53.265 rows=50000 loops=1)
Recheck Cond: (uid = 1)
Heap Blocks: exact=50000
->  Bitmap Index Scan on idx_u_readlist  (cost=0.00..421.71 rows=49846 width=0) (actual time=4.610..4.610 rows=50000 loops=1)
Index Cond: (uid = 1)
Planning Time: 0.086 ms
Execution Time: 104.279 ms
(12 rows)

PG 解决方案1

1PG可以使用数组存储多个已读VID,记录数节省5万倍.


         
uid int,  -- UID
vid int[]   -- 已读VID
);

         
create index idx_p_u_readlist on p_u_readlist (uid);

25000万条压缩到1000


         

3、空间约仅占1/10


         

4、为1号用户推荐视频并过滤已读, 取出100vid.


         
where vid not in (select unnest(vid) from p_u_readlist where uid=1)
order by score desc
limit 100;

         

         

         
Limit  (cost=3.01..8.50 rows=100 width=8) (actual time=56.343..56.438 rows=100 loops=1)
->  Index Scan using idx_vscore on vscore  (cost=3.01..274337.96 rows=4999930 width=8) (actual time=56.342..56.427 rows=100 loops=1)
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 50000
SubPlan 1
->  ProjectSet  (cost=0.28..2.55 rows=10 width=4) (actual time=0.113..3.904 rows=50000 loops=1)
->  Index Scan using idx_p_u_readlist on p_u_readlist  (cost=0.28..2.49 rows=1 width=18) (actual time=0.018..0.022 rows=1 loops=1)
Index Cond: (uid = 1)
Planning Time: 0.112 ms
Execution Time: 56.814 ms
(10 rows)

PG 解决方案2

VID表存储已读UID, 而不是在UID中存储已读VID


         
vid int primary key,  -- 视频ID
score float4,   -- 推荐分数
uid int[],   -- 已读UID
ts timestamp
);

         
create index idx_p_vscore on p_vscore (score desc);

插入1000万条视频, score排名前5万个视频全部被1000个用户已读.


         

         

         
update p_vscore set
uid=array(select generate_series(1,1000))
where vid in (
select vid from p_vscore order by score desc limit 50000
);

         
public | p_vscore      | table             | postgres | unlogged    | heap          | 687 MB  |

1号用户推荐视频并过滤已读, 取出100vid


         
where uid is null or not uid @> array[1]
order by score desc
limit 100;

         

         
Limit  (cost=0.43..3.08 rows=100 width=8) (actual time=384.599..384.696 rows=100 loops=1)
->  Index Scan using idx_p_vscore on p_vscore  (cost=0.43..264791.46 rows=9999712 width=8) (actual time=384.598..384.685 rows=100 loops=1)
Filter: ((uid IS NULL) OR (NOT (uid @> '{1}'::integer[])))
Rows Removed by Filter: 50000
Planning Time: 0.062 ms
Execution Time: 384.713 ms
(6 rows)

和之前视频讲的一样, 为什么这种方式更慢了? 具有olap属性, 如下:
《重新发现PostgreSQL之美- 21 探访宇航员的食物》

PG 解决方案3

1、使用HLLroaringbitmap存储过滤列表, 可以节省列表的存储空间, 有些用户可能已读VID达到上百万, 需要大量的存储空间.


         

         
create unlogged table p1_u_readlist (
uid int,  -- UID
vid hll   -- 已读VID
);

         
create index idx_p1_u_readlist on p1_u_readlist (uid);

5000万条压缩到1000


         

         

         
public | p1_u_readlist | table             | postgres | unlogged    | heap          | 1376 kB |

1号用户推荐视频并过滤已读, 取出100vid.


         
where not exists (select 1 from p1_u_readlist where uid=1 and p1_u_readlist.vid = hll_add(p1_u_readlist.vid,hll_hash_integer(vscore.vid)))
order by score desc
limit 100;

         
Limit  (cost=0.71..5.23 rows=100 width=8) (actual time=602.038..632.683 rows=100 loops=1)
->  Nested Loop Anti Join  (cost=0.71..449335.43 rows=9949861 width=8) (actual time=602.037..632.672 rows=100 loops=1)
Join Filter: (p1_u_readlist.vid = hll_add(p1_u_readlist.vid, hll_hash_integer(vscore.vid, 0)))
Rows Removed by Join Filter: 100
->  Index Scan using idx_vscore on vscore  (cost=0.43..249335.74 rows=9999860 width=8) (actual time=0.010..37.183 rows=52695 loops=1)
->  Materialize  (cost=0.28..2.50 rows=1 width=1287) (actual time=0.000..0.000 rows=1 loops=52695)
->  Index Scan using idx_p1_u_readlist on p1_u_readlist  (cost=0.28..2.49 rows=1 width=1287) (actual time=0.019..0.021 rows=1 loops=1)
Index Cond: (uid = 1)
Planning Time: 0.146 ms
Execution Time: 632.714 ms
(10 rows)

和之前视频讲的一样, 为什么这种方式更慢了? 具有olap属性, 如下:
《重新发现PostgreSQL之美- 21 探访宇航员的食物》

PG 解决方案4

VID表存储已读UID, 而不是在UID中存储已读VID

使用HLLroaringbitmap存储过滤列表

VID表存储已读UID, 而不是在UID中存储已读VID


         
vid int primary key,  -- 视频ID
score float4,   -- 推荐分数
uid hll,   -- 已读UID
ts timestamp
);

         
create index idx_p1_vscore on p1_vscore (score desc);

插入1000万条视频, score排名前5万个视频全部被1000个用户已读.


         

         

         
update p1_vscore set
uid= (select hll_add_agg(hll_hash_integer(i)) from generate_series(1,1000) i)
where vid in (
select vid from p1_vscore order by score desc limit 50000
);

         
public | p_vscore      | table             | postgres | unlogged    | heap          | 687 MB  |
public | p1_vscore | table | postgres | unlogged    | heap          | 488 MB |

1号用户推荐视频并过滤已读, 取出100vid.


         
where uid is null or uid <> hll_add(uid,hll_hash_integer(1))
order by score desc
limit 100;

         

         
Limit  (cost=0.43..3.38 rows=100 width=8) (actual time=565.654..565.773 rows=100 loops=1)
->  Index Scan using idx_p1_vscore on p1_vscore  (cost=0.43..339509.68 rows=11541416 width=8) (actual time=565.653..565.763 rows=100 loops=1)
Filter: ((uid IS NULL) OR (uid <> hll_add(uid, '-8604791237420463362'::hll_hashval)))
Rows Removed by Filter: 50000
Planning Time: 0.090 ms
Execution Time: 565.795 ms
(6 rows)

PG 解决方案5

采用partial index, 注意看清楚, 这个方法直接将过滤已读从5万缩小到1000


         
uid int,  -- UID
hashid int,  -- partial index hashid
vid int[]   -- 已读VID
);

         
create index idx_pp_u_readlist on pp_u_readlist (uid,hashid);

假设hash 50, 5000万条压缩到50000,


         

         

         
public | pp_u_readlist | table | postgres | unlogged    | heap          | 229 MB |

视频表创建partial index


         
declare
begin
for i in 0..49 loop
execute format( 'create index p_vscore_p%s on vscore (score desc) where abs(mod(hashint4(vid),50))=%s', i,i);
end loop;
end;
$$;

         
List of relations
Schema |     Name      | Type  |  Owner   |  Table   | Persistence | Access method |  Size   | Description
--------+---------------+-------+----------+----------+-------------+---------------+---------+-------------
public | p_vscore_p0   | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p1   | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p10  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p11  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p12  | index | postgres | vscore   | unlogged    | btree         | 4432 kB |
public | p_vscore_p13  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p14  | index | postgres | vscore   | unlogged    | btree         | 4400 kB |
public | p_vscore_p15  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p16  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p17  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p18  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p19  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p2   | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p20  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p21  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p22  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p23  | index | postgres | vscore   | unlogged    | btree         | 4432 kB |
public | p_vscore_p24  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p25  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p26  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p27  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p28  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p29  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p3   | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p30  | index | postgres | vscore   | unlogged    | btree         | 4432 kB |
public | p_vscore_p31  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p32  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p33  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p34  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p35  | index | postgres | vscore   | unlogged    | btree         | 4400 kB |
public | p_vscore_p36  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p37  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p38  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p39  | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p4   | index | postgres | vscore   | unlogged    | btree         | 4432 kB |
public | p_vscore_p40  | index | postgres | vscore   | unlogged    | btree         | 4432 kB |
public | p_vscore_p41  | index | postgres | vscore   | unlogged    | btree         | 4400 kB |
public | p_vscore_p42  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p43  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p44  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p45  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p46  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p47  | index | postgres | vscore   | unlogged    | btree         | 4400 kB |
public | p_vscore_p48  | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p49  | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p5   | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p6   | index | postgres | vscore   | unlogged    | btree         | 4408 kB |
public | p_vscore_p7   | index | postgres | vscore   | unlogged    | btree         | 4416 kB |
public | p_vscore_p8   | index | postgres | vscore   | unlogged    | btree         | 4424 kB |
public | p_vscore_p9   | index | postgres | vscore   | unlogged    | btree         | 4408 kB |

1号用户推荐视频并过滤已读, 取出100vid.
每次获取时业务随机或采用round robin负载均衡方式从0-49个分片中获取100.


         
where vid not in (select unnest(vid) from pp_u_readlist where uid=1 and hashid=0)
and abs(mod(hashint4(vid),50))=0
order by score desc
limit 100;

         

         
Limit  (cost=3.01..157.54 rows=100 width=8) (actual time=1.223..1.331 rows=100 loops=1)
->  Index Scan using p_vscore_p0 on vscore  (cost=3.01..38635.65 rows=25000 width=8) (actual time=1.222..1.317 rows=100 loops=1)
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 973
SubPlan 1
->  ProjectSet  (cost=0.29..2.57 rows=10 width=4) (actual time=0.021..0.115 rows=973 loops=1)
->  Index Scan using idx_pp_u_readlist on pp_u_readlist  (cost=0.29..2.51 rows=1 width=18) (actual time=0.007..0.008 rows=1 loops=1)
Index Cond: ((uid = 1) AND (hashid = 0))
Planning Time: 0.515 ms
Execution Time: 1.366 ms
(10 rows)

其他方案

datasketch
roaringbitmap

总结

推荐查询性能提升:


         

已读列表存储空间节约:


         

1、多值列, 一个用户已读只需要存储1
2
datasketch 近似类型, KB空间可以存储上亿唯一值.
3
、压缩存储多值列roaringbitmap类型, 效率和空间的平衡.
4
、使用partial index, 拆散过滤列表, 降低每次请求的无效数据.

参考

PostgreSQL 大量IO扫描、计算浪费的优化- 推荐模块, 过滤已推荐. (热点用户、已推荐列表超大)

《推荐系统, 已阅读过滤, 大量CPUIO浪费的优化思路2 - partial index - hash 分片,降低过滤量》

PostgreSQL 推荐系统优化总计- 空间、时间、标量等混合多模查询场景, 大量已读过滤导致CPU IO剧增(类挖矿概率下降优化)

《近似查询处理(Approximate Query Processing) - DataSketches - 压缩、distinct、分位、rank、高
频柱状图(count distinct, quantiles, most-frequent items, joins, matrix computations, and graph analysis) 实时大数据近似分析》

PostgreSQL pg_roaringbitmap - 用户画像、标签、高效检索》

除了本文的应用, hll, roaringbitmap还经常被用在实时分析系统, 画像系统中, 支持高速的PV UV统计, 滑窗分析, 人群圈选等.

 

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
关系型数据库 分布式数据库 PolarDB
|
关系型数据库 分布式数据库 定位技术
PolarDB for PostgreSQL 开源必读手册-VACUUM处理(中)
PolarDB for PostgreSQL 开源必读手册-VACUUM处理
167 0
|
关系型数据库 分布式数据库 PolarDB
《阿里云产品手册2022-2023 版》——PolarDB for PostgreSQL
《阿里云产品手册2022-2023 版》——PolarDB for PostgreSQL
363 0
|
存储 缓存 关系型数据库
|
存储 SQL 并行计算
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍(中)
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍
419 0
|
存储 算法 安全
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍(下)
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍
379 0
|
关系型数据库 分布式数据库 开发工具
|
存储 关系型数据库 Linux
PolarDB for PostgreSQL 开源必读手册-PolarDB安装与配置(下)
PolarDB for PostgreSQL 开源必读手册-PolarDB安装与配置
695 0
|
存储 SQL 关系型数据库
|
关系型数据库 分布式数据库 定位技术