PostgreSQL 社交类好友关系系统实践 - 正反向关系查询加速

本文涉及的产品
RDS SQL Server Serverless,2-4RCU 50GB 3个月
推荐场景:
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
简介:

标签

PostgreSQL , 好友关系 , 正向关系 , 反向关系


背景

关系系统在很多设计类应用中非常常见,例如查询你有哪些好友,你是哪些人的好友。

又比如我LIKE了哪些人,哪些人LIKE了我。

比如在之前的"PCC大赛 - facebook\微博 like场景",也是类似的场景。

《三体高可用PCC大赛 - facebook\微博 like场景 - 数据库设计与性能压测》

那么如何加快正向和反向关系的查询呢?

在数据库中多行查询和单行查询的效率差别是非常大的,原因是多行查询需要运行更多的代码(例如DEFORM,这里有解释 《PostgreSQL 向量化执行插件(瓦片式实现-vops) 10x提速OLAP》 ),当然多行查询有多行查询的好处,比如返回少量记录时,多行比单行号。

为了提高返回与某个ID相关所有关系,我们可以采用将多行压缩为一行的方法。

1. 实时流式正反向关系生成方案

使用规则,将明细数据自动转换为多行数据。

好处,实时。

缺陷,每次都要更新合成表的大行,在多版本实现中,产生垃圾较多,并且在热点用户上容易出现瓶颈。(比如某些用户有很多人关注,那么它的关系会很大。)

实施举例:

1、创建原始关系表

create table t(uid1 int, uid2 int, primary key(uid1,uid2));  

2、创建正向和反向关系表,使用数组表示一对多的关系

create table t1(uid int primary key, fid int[]);  
  
create table t2(uid int primary key, fid int[]);  

3、创建一个intarray插件,用于对数组去重。

create extension intarray;  

4、创建规则,当数据写入原始关系表时,自动将关系合并到正向和反向关系表的记录中。

create rule r1 as on insert to t do also insert into t1 values (NEW.uid1, array[NEW.uid2]) on conflict (uid) do update set fid=uniq(sort(t1.fid||NEW.uid2));  
  
create rule r2 as on insert to t do also insert into t2 values (NEW.uid2, array[NEW.uid1]) on conflict (uid) do update set fid=uniq(sort(t2.fid||NEW.uid1));  

5、压测函数,即写入用户之间的关系。

create or replace function ins_rel( int, int ) returns void as $$  
declare  
begin  
  insert into t values ($1,$2);  
  return;  
  exception when others then  
    return;  
end;  
$$ language plpgsql strict;  

6、压测

vi test.sql  
\set uid1 random(1,100000)  
\set uid2 random(1,2000)  
select ins_rel(:uid1,:uid2);  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 120  

7、查看正向和反向关系记录例子

postgres=# select * from t1 limit 3;  
  uid  |                                                              fid                                                                 
-------+--------------------------------------------------------------------------------------------------------------------------------  
 27627 | {60,77,258,301,412,574,649,651,709,780,805,811,998,1031,1066,1151,1173,1251,1284,1392,1586,1602,1616,1659,1708,1879,1932,1982}  
 78532 | {106,288,416,474,566,765,853,1003,1154,1166,1352,1657,1689,1707,1722}  
 81317 | {14,109,258,275,293,499,586,593,633,716,720,732,766,807,927,1052,1508,1518,1633,1662,1791,1848,1885,1946,1951}  
(3 rows)  
  
postgres=# select * from t2 limit 1;  
uid | 19  
fid | {25,119,224,291,317,421,470,574,637,683,857,905,1131,1451,1474,1573,1606,1950,2039,2077,2297,2464,2693,2716,2722,2832,2871,3062,3158,3167,3217,3265,3288,3428,3602,3644,3660,3799,3850,3899,3906,3966,4134,4177,4268,4308,4411,4477,4509,4517,4522,4709,4736,4781,4892,5142,5214,5235,5363,5429,5473,5485,5491,5535,5752,5778,5828,5851,5950,5965,5988,6042,6071,6190,6238,6254,6343,6483,6555,6743,6820,6917,6980,7006,7032,7075,7205,7289,7410,7434,7618,7936,7987,7995,8102,8281,8370,8439,8463,8509,8622,8731,8786,8866,8899,9020,9033,9082,9142,9308,9406,9413,9829,9876,9976,10116,10360,10396,10428,10680,10710,10950,11170,11187,11197,11200,11243,11370,11395,11500,11506,11594,11695,11878,11963,12055,12077,12196,12211,12261,12332,12333,12341,12486,12627,12843,12864,12976,13037,13206,13267,13353,13354,13359,13383,13411,13633,13637,13778,13837,13860,13928,13968,13976,14036,14058,14150,14196,14284,14435,14724,14743,14774,14831,15086,15138,15367,15376,15407,15692,15755,15820,15870,15925,16057,16129,16288,16292,16388,16397,16426,16482,16812,17006,17013,17025,17159,17165,17402,17489,17494,17539,17883,17977,17991,18173,18180,18249,18253,18306,18487,18557,18730,18777,18881,18914,19102,19125,19245,19273,19448,19477,19534,19578,19751,19951,19973,20037,20043,20182,20244,20248,20388,20423,20588,20625,20727,20851,20884,20911,21300,21345,21401,21416,21442,21456,21465,21564,21747,21826,21841,21844,22051,22154,22168,22208,22215,22315,22322,22374,22400,22476,22617,22621,22626,22666,22786,22853,23014,23142,23539,23565,23580,23586,23620,23658,23976,24013,24014,24046,24059,24377,24546,24688,24723,24905,24951,25129,25192,25255,25445,25592,25723,25779,25814,25863,25880,25886,25963,26006,26020,26117,26273,26298,26318,26344,26347,26378,26399,26494,26561,26600,26607,26700,26721,26754,26755,26841,26849,26863,26884,26954,27071,27073,27140,27176,27237,27246,27256,27270,27315,27323,27434,27469,27510,27547,27565,27799,27918,27921,28045,28055,28168,28181,28310,28425,28531,28628,28769,28796,28924,28993,29009,29098,29145,29180,29238,29301,29542,29581,29625,29879,29953,29958,29994,30005,30029,30203,30276,30358,30423,30445,30472,30518,30581,30587,30593,30800,30855,30860,30885,30927,30946,31033,31256,31329,31416,31445,31554,31663,31762,31768,31807,31831,31845,31890,31891,32257,32329,32384,32864,32866,32942,33161,33299,33395,33437,33492,33828,33870,33876,33936,33989,34114,34236,34386,34752,34835,34847,35125,35402,35479,35492,35503,35543,35575,35756,35924,35987,36044,36191,36210,36212,36235,36313,36315,36341,36508,36568,36640,36676,36705,36800,36911,37002,37074,37075,37080,37084,37127,37149,37237,37248,37424,37445,37486,37547,37573,37625,37699,38083,38150,38254,38255,38297,38339,38436,38486,38500,38507,38563,38616,38750,38768,38790,38931,38946,38948,39079,39117,39141,39271,39400,39788,39802,39813,39853,40096,40122,40162,40225,40245,40256,40298,40329,40430,40452,40496,40556,40604,40631,40770,40794,40913,41017,41051,41435,41490,41523,41596,41699,41841,41844,41870,41969,42020,42056,42152,42190,42214,42279,42356,42376,42432,42623,42731,42810,43055,43107,43167,43221,43276,43389,43498,43505,43542,43549,43651,43802,43979,44033,44057,44167,44326,44378,44536,44611,44646,44816,44895,44901,44996,45076,45114,45148,45208,45227,45470,45482,45521,45565,45609,45728,45811,45894,45955,45989,46106,46424,46468,46500,46564,46577,46745,46792,46840,46945,46963,46965,47057,47172,47236,47503,47510,47541,47726,47752,47792,47856,47864,47966,48003,48169,48267,48282,48409,48436,48455,48505,48507,48964,49111,49221,49235,49364,49577,49701,50183,50270,50591,50727,50949,51014,51174,51284,51287,51350,51358,51426,51435,51533,51534,51606,51647,51757,51823,51830,51939,51949,51970,52262,52668,52684,52766,52926,52968,52982,53009,53054,53262,53370,53420,53526,53530,53573,53612,53687,53757,53854,53993,54047,54079,54093,54234,54236,54255,54259,54379,54397,54434,54545,54583,54595,54612,54636,54643,54678,54887,55034,55345,55378,55381,55428,55442,55510,55547,55750,55937,55947,56030,56067,56098,56172,56174,56330,56337,56438,56503,56610,56622,56825,56844,56881,56883,56976,57098,57292,57353,57551,57587,57588,57636,57662,57672,57685,57693,57787,57794,58064,58069,58122,58314,58318,58588,58663,58912,58990,59090,59095,59162,59311,59342,59431,59534,59536,59579,59760,59819,59865,59868,60099,60229,60242,60338,60342,60354,60373,60374,60415,60512,60513,60518,60531,60589,60757,60803,61104,61197,61201,61215,61253,61439,61446,61513,61553,61686,61696,61874,62062,62274,62348,62449,62471,62490,62584,62654,62671,62763,62766,62808,63018,63044,63142,63282,63295,63363,63375,63419,63427,63431,63577,63590,63636,63642,63646,63678,63706,63816,63862,63915,63978,64039,64104,64131,64143,64323,64365,64394,64397,64449,64502,64535,64671,64849,64855,64971,65004,65039,65165,65228,65239,65330,65407,65446,65567,65691,65709,65741,65768,65874,65876,65882,65955,66068,66178,66410,66423,66425,66488,66493,66540,66770,67017,67284,67360,67509,67537,67579,67663,67691,67750,67821,67855,67932,67964,68023,68031,68076,68195,68301,68525,68618,68684,68779,68789,68815,68833,68876,68955,69075,69145,69160,69187,69244,69500,69541,69585,69599,69691,69698,69747,69766,69782,69864,70039,70105,70151,70225,70238,70248,70256,70325,70378,70392,70495,70540,70699,70913,70919,70957,71047,71093,71149,71349,71489,71583,71639,71702,71841,71893,71922,71970,71971,71983,72071,72074,72184,72229,72343,72404,72429,72446,72536,72660,72662,72675,72714,72718,72858,72903,72986,73042,73069,73097,73102,73160,73174,73313,73417,73426,73464,73601,73778,73867,73966,73978,74057,74189,74253,74257,74270,74357,74441,74450,74484,74574,74590,74863,74896,74903,74970,75024,75048,75076,75161,75251,75285,75287,75427,75488,75638,75704,75999,76077,76117,76169,76252,76456,76745,76810,77058,77109,77110,77194,77401,77411,77427,77530,77565,77610,77634,77663,77765,77776,77852,77861,77929,77954,78007,78320,78326,78348,78365,78419,78441,78588,78663,78733,78745,78838,78887,78896,78964,79140,79142,79569,79772,79834,79846,79855,80089,80169,80297,80521,80535,80655,80753,80793,80890,81039,81065,81114,81225,81397,81408,81461,81463,81560,81575,81607,81644,81673,81686,81702,81756,81762,81772,81866,82090,82200,82247,82273,82320,82432,82589,82615,82659,82680,83078,83100,83127,83249,83578,83607,83609,83714,83792,83817,83894,83952,84016,84086,84088,84182,84299,84469,84487,84529,84544,84574,84607,84701,84721,84764,84775,84832,85044,85098,85334,85547,85574,85740,85809,85824,85875,86070,86078,86277,86342,86445,86536,86628,86651,86775,86801,86961,86967,87021,87242,87316,87318,87480,87498,87543,87683,87700,87759,87780,87839,87883,88053,88187,88198,88322,88533,88535,88752,88765,89034,89053,89115,89168,89191,89401,89624,89692,89771,89851,89896,90051,90056,90072,90077,90079,90180,90187,90499,90500,90616,90627,90717,91002,91075,91279,91295,91314,91343,91365,91498,91502,91550,91561,91587,91671,91857,91912,92085,92106,92342,92372,92501,92717,92967,93063,93084,93092,93113,93244,93320,93433,93541,93680,93684,93719,93809,94049,94077,94139,94221,94277,94309,94530,94555,94686,94690,94920,94936,94996,95219,95258,95333,95465,95614,95640,95753,95755,95779,95906,96158,96183,96225,96339,96347,96577,96648,96870,96894,97030,97121,97152,97162,97174,97237,97542,97599,97607,97686,97705,97708,97718,97728,97817,97990,97997,98172,98246,98313,98452,98521,98581,98636,98831,98870,98926,98962,98976,99055,99112,99260,99419,99462,99470,99546,99573,99720,99777,99906}  

8、对比正反向关系是否与原始表一致。对比HASH即可,表示完全一致。

postgres=# select sum(hashtext(t.*::text)) from t;  
      sum        
---------------  
 -137085631634  
(1 row)  
  
postgres=# select sum(hashtext) from (select hashtext((t1.uid, unnest(t1.fid))::text) from t1) t;  
      sum        
---------------  
 -137085631634  
(1 row)  
  
postgres=# select sum(hashtext) from (select hashtext((unnest(t2.fid), t2.uid)::text) from t2) t;  
      sum        
---------------  
 -137085631634  
(1 row)  

2. 批量异步流式正反向关系生成方案

由于第一种方法的缺陷,如容易有热点用户,容易导致IO放大,容易导致关系写入RT升高。

我们这里可以使用批量合并的方法,类似我在之前讲的某个手机厂商的手机用户画像系统,贴、删、改标签,使用日志的形式生成数据,然后对FEED数据阅后即焚,合并到标签表。

pic

《PostgreSQL手机行业经营分析、决策系统设计 - 实时圈选、透视、估算》

本例一样,可以使用同样的方法,将like, unlike的动作,写成日志,然后再批量的消费这个日志,并将关系更新到正向、反向关系表。

批量阅后即焚,还有其他的一些例子和性能参考指标。

《阿里云RDS PostgreSQL varbitx实践 - 流式标签 (阅后即焚流式批量计算) - 万亿级,任意标签圈人,毫秒响应》

《HTAP数据库 PostgreSQL 场景与性能测试之 32 - (OLTP) 高吞吐数据进出(堆存、行扫、无需索引) - 阅后即焚(JSON + 函数流式计算)》

《HTAP数据库 PostgreSQL 场景与性能测试之 31 - (OLTP) 高吞吐数据进出(堆存、行扫、无需索引) - 阅后即焚(读写大吞吐并测)》

《HTAP数据库 PostgreSQL 场景与性能测试之 27 - (OLTP) 物联网 - FEED日志, 流式处理 与 阅后即焚 (CTE)》

《在PostgreSQL中实现update | delete limit - CTID扫描实践 (高效阅后即焚)》

正向、反向关系查询性能对比

对比原始表、正向、反向关系表查询性能对比。

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 56 -j 56 -T 120  

1、原始表,增加单字段索引。

postgres=# create index idx_t_1 on t(uid1);  
CREATE INDEX  
postgres=# create index idx_t_2 on t(uid2);  
CREATE INDEX  

2、原始表,正向关系查询性能

\set uid1 random(1,100000)  
\set uid2 random(1,2000)  
select * from t where uid1=:uid1;  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 73916179  
latency average = 0.091 ms  
latency stddev = 0.014 ms  
tps = 615905.000776 (including connections establishing)  
tps = 615979.071532 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set uid1 random(1,100000)  
         0.001  \set uid2 random(1,2000)  
         0.089  select * from t where uid1=:uid1;  

3、原始表,反向关系查询性能

\set uid1 random(1,100000)  
\set uid2 random(1,2000)  
select * from t where uid2=:uid2;  
  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 2784841  
latency average = 2.413 ms  
latency stddev = 0.116 ms  
tps = 23204.361967 (including connections establishing)  
tps = 23207.284520 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set uid1 random(1,100000)  
         0.001  \set uid2 random(1,2000)  
         2.411  select * from t where uid2=:uid2;  

4、正向关系表,查询性能

\set uid1 random(1,100000)  
\set uid2 random(1,2000)  
select * from t1 where uid=:uid1;  
  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 85533010  
latency average = 0.079 ms  
latency stddev = 0.012 ms  
tps = 712676.961185 (including connections establishing)  
tps = 712758.158291 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set uid1 random(1,100000)  
         0.001  \set uid2 random(1,2000)  
         0.077  select * from t1 where uid=:uid1;  

5、反向关系表,查询性能

\set uid1 random(1,100000)  
\set uid2 random(1,2000)  
select * from t2 where uid=:uid2;  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 56  
number of threads: 56  
duration: 120 s  
number of transactions actually processed: 25649777  
latency average = 0.262 ms  
latency stddev = 0.037 ms  
tps = 213695.286456 (including connections establishing)  
tps = 213719.115995 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set uid1 random(1,100000)  
         0.001  \set uid2 random(1,2000)  
         0.260  select * from t2 where uid=:uid2;  
对比测试 正向关系TPS 反向关系TPS
原始方法 61.6万 2.3万
正反向表数组查询 71.3万 21.4万

关系记录数越多,使用数组存储正反向关系的性能提升越明显。

小结

关系系统在很多设计类应用中非常常见,例如查询你有哪些好友,你是哪些人的好友。

使用原始关系进行查询,如果要批量取出所有相关关系,由于记录数比较多,可能导致整体的RT比较高,TPS吞吐较低。

我们使用实时流计算、批量流量计算的方法,可以将关系整合成正、反向的关系,一个用户一条记录,使用数组存储一对多的关系。

整合后,性能有非常明显的提升。

参考

《三体高可用PCC大赛 - facebook\微博 like场景 - 数据库设计与性能压测》

《阿里云RDS PostgreSQL varbitx实践 - 流式标签 (阅后即焚流式批量计算) - 万亿级,任意标签圈人,毫秒响应》

《HTAP数据库 PostgreSQL 场景与性能测试之 32 - (OLTP) 高吞吐数据进出(堆存、行扫、无需索引) - 阅后即焚(JSON + 函数流式计算)》

《HTAP数据库 PostgreSQL 场景与性能测试之 31 - (OLTP) 高吞吐数据进出(堆存、行扫、无需索引) - 阅后即焚(读写大吞吐并测)》

《HTAP数据库 PostgreSQL 场景与性能测试之 27 - (OLTP) 物联网 - FEED日志, 流式处理 与 阅后即焚 (CTE)》

《在PostgreSQL中实现update | delete limit - CTID扫描实践 (高效阅后即焚)》

《PostgreSQL手机行业经营分析、决策系统设计 - 实时圈选、透视、估算》

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
1月前
|
SQL 关系型数据库 数据库
PostgreSQL性能飙升的秘密:这几个调优技巧让你的数据库查询速度翻倍!
【10月更文挑战第25天】本文介绍了几种有效提升 PostgreSQL 数据库查询效率的方法,包括索引优化、查询优化、配置优化和硬件优化。通过合理设计索引、编写高效 SQL 查询、调整配置参数和选择合适硬件,可以显著提高数据库性能。
184 1
|
4月前
|
缓存 关系型数据库 数据库
PostgreSQL 查询性能
【8月更文挑战第5天】PostgreSQL 查询性能
82 8
|
6月前
|
SQL 关系型数据库 数据库
Python执行PostgreSQL数据库查询语句,并打印查询结果
本文介绍了如何使用Python连接和查询PostgreSQL数据库。首先,确保安装了`psycopg2`库,然后创建数据库连接函数。接着,展示如何编写SQL查询并执行,例如从`employees`表中选取所有记录。此外,还讨论了处理查询结果、格式化输出和异常处理的方法。最后,提到了参数化查询和事务处理以增强安全性及确保数据一致性。
Python执行PostgreSQL数据库查询语句,并打印查询结果
|
5月前
|
Java 关系型数据库 API
使用Spring Boot和PostgreSQL构建高级查询
使用Spring Boot和PostgreSQL构建高级查询
|
6月前
|
SQL 关系型数据库 数据库
Python查询PostgreSQL数据库
木头左教你如何用Python连接PostgreSQL数据库:安装`psycopg2`库,建立连接,执行SQL脚本如创建表、插入数据,同时掌握错误处理和事务管理。别忘了性能优化,利用索引、批量操作提升效率。下期更精彩!💡 csvfile
Python查询PostgreSQL数据库
|
6月前
|
自然语言处理 关系型数据库 数据库
技术经验解读:【转】PostgreSQL的FTI(TSearch)与中文全文索引的实践
技术经验解读:【转】PostgreSQL的FTI(TSearch)与中文全文索引的实践
71 0
|
7月前
|
SQL 人工智能 Oracle
PostgreSQL 递归查询(含层级和结构)
PostgreSQL 递归查询(含层级和结构)
|
7月前
|
SQL 关系型数据库 数据库
SQL 42501: Postgresql查询中的权限不足错误
SQL 42501: Postgresql查询中的权限不足错误
464 0
|
7月前
|
SQL 关系型数据库 分布式数据库
在PolarDB for PostgreSQL中,你可以使用LIKE运算符来实现类似的查询功能,而不是使用IF函数
在PolarDB for PostgreSQL中,你可以使用LIKE运算符来实现类似的查询功能,而不是使用IF函数
92 7
|
7月前
|
存储 关系型数据库 分布式数据库
PolarDB for PostgreSQL查询问题之条件查询失败如何解决
PolarDB for PostgreSQL是基于PostgreSQL开发的一款云原生关系型数据库服务,它提供了高性能、高可用性和弹性扩展的特性;本合集将围绕PolarDB(pg)的部署、管理和优化提供指导,以及常见问题的排查和解决办法。

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版