PostgreSQL的hash索引是否有用?

本文涉及的产品
云原生数据库 PolarDB MySQL 版,通用型 2核4GB 50GB
云原生数据库 PolarDB PostgreSQL 版,标准版 2核4GB 50GB
简介: 根据PostgreSQL的手册,PostgreSQL中hash索引有很大的缺陷,不推荐使用。 http://58.58.27.50:8079/doc/html/9.3.1_zh/indexes-types.html -----------------------------------------------------------------------------Hash 索引操作目前没有记录 WAL 日志,因此如果数据库崩溃有未写入的改变, 我们可能需要用REINDEX重建 Hash 索引。
根据PostgreSQL的手册,PostgreSQL中hash索引有很大的缺陷,不推荐使用。

http://58.58.27.50:8079/doc/html/9.3.1_zh/indexes-types.html
-----------------------------------------------------------------------------
Hash 索引操作目前没有记录 WAL 日志,因此如果数据库崩溃有未写入的改变, 我们可能需要用REINDEX重建 Hash 索引。另外, 对hash索引的改变在初始的基础备份后不是基于流复制或者基于文件复制的, 所以对于随后使用它们的查询会给出错误的回复。因为这些原因, 我们并不鼓励使用 Hash 索引。
-----------------------------------------------------------------------------

并且hash索引只支持"等于"查询,而PostgreSQL中作为默认索引类型的btree能支持包括“等于”在内的各种比较操作,功能上覆盖了hash索引。因此,在PostgreSQL中,对常规数据类型,btree索引是不二之选。(相比Oracle中有过于丰富的索引类型可选择,在PostgreSQL中建索引是不是更轻松一点?)

那么,既然不推荐使用hash索引,PostgreSQL为什么还要实现hash索引呢?
PostgreSQL的hash索引使用的线性hash算法,可以动态扩展桶数,理论上的查询复杂度接近O(1),这是btree无法比拟的。下面做个实测,也证明了在大数量的情况下,hash索引的性能确实明显优于btree。所以在某些极少的特殊场景下,hash索引也还是可以考虑的。

环境

测试环境在Linux虚拟机上
宿主机
OS:Win7 64bit
CPU:AMD Athlon II X4 640 3.0G
内存:6G
硬盘:Apacer A S510S 128G SATA

虚拟机
OS:CentOS 6.5
CPU:x4核
MEM:2G
PostgreSQL:9.3( shared_buffers=128M )

测试方法

使用pgbench的TCP-B和Select Only测试对比性能。
分别建2个测试数据库db1和db2。db1是原始的pgbench测试数据库,即btree索引。db2则通过下面的方法修改为hash索引。

  1. [chenhj@hanode1 ~]$ psql db2
  2. psql (9.3.4)
  3. Type "help" for help.

  4. db2=# alter table pgbench_accounts DROP CONSTRAINT pgbench_accounts_pkey;
  5. ALTER TABLE
  6. db2=# \d pgbench_accounts
  7.    Table "public.pgbench_accounts"
  8.   Column | Type | Modifiers
  9. ----------+---------------+-----------
  10.  aid | integer | not null
  11.  bid | integer |
  12.  abalance | integer |
  13.  filler | character(84) |

  14. db2=# create index on pgbench_accounts using hash(aid);
  15. CREATE INDEX

测试结果

sclae=1时,差异不大;sclae=100时,hash几乎是btree的两倍性能。可见数据量越大,hash的性能优势越明显。


sclae=1时,无太大差异
数据略


sclae=10时,大概快了20%

点击(此处)折叠或打开

  1. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db1
  2. transaction type: TPC-B (sort of)
  3. scaling factor: 10
  4. query mode: simple
  5. number of clients: 1
  6. number of threads: 1
  7. duration: 3 s
  8. number of transactions actually processed: 641
  9. tps = 213.418248 (including connections establishing)
  10. tps = 213.716676 (excluding connections establishing)
  11. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 db2
  12. transaction type: TPC-B (sort of)
  13. scaling factor: 10
  14. query mode: simple
  15. number of clients: 1
  16. number of threads: 1
  17. duration: 3 s
  18. number of transactions actually processed: 653
  19. tps = 217.440022 (including connections establishing)
  20. tps = 217.770471 (excluding connections establishing)

  21. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1
  22. transaction type: TPC-B (sort of)
  23. scaling factor: 10
  24. query mode: simple
  25. number of clients: 16
  26. number of threads: 1
  27. duration: 3 s
  28. number of transactions actually processed: 2064
  29. tps = 683.721499 (including connections establishing)
  30. tps = 701.078622 (excluding connections establishing)
  31. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2
  32. transaction type: TPC-B (sort of)
  33. scaling factor: 10
  34. query mode: simple
  35. number of clients: 16
  36. number of threads: 1
  37. duration: 3 s
  38. number of transactions actually processed: 3278
  39. tps = 1085.589348 (including connections establishing)
  40. tps = 1110.521633 (excluding connections establishing)

  41. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
  42. transaction type: SELECT only
  43. scaling factor: 10
  44. query mode: simple
  45. number of clients: 1
  46. number of threads: 1
  47. duration: 3 s
  48. number of transactions actually processed: 4922
  49. tps = 1640.242391 (including connections establishing)
  50. tps = 1642.450916 (excluding connections establishing)
  51. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
  52. transaction type: SELECT only
  53. scaling factor: 10
  54. query mode: simple
  55. number of clients: 1
  56. number of threads: 1
  57. duration: 3 s
  58. number of transactions actually processed: 4771
  59. tps = 1590.028578 (including connections establishing)
  60. tps = 1592.541658 (excluding connections establishing)

  61. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db1
  62. transaction type: SELECT only
  63. scaling factor: 10
  64. query mode: simple
  65. number of clients: 16
  66. number of threads: 1
  67. duration: 3 s
  68. number of transactions actually processed: 32188
  69. tps = 10724.978992 (including connections establishing)
  70. tps = 10969.036021 (excluding connections establishing)

  71. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db2
  72. transaction type: SELECT only
  73. scaling factor: 10
  74. query mode: simple
  75. number of clients: 16
  76. number of threads: 1
  77. duration: 3 s
  78. number of transactions actually processed: 33367
  79. tps = 11092.571963 (including connections establishing)
  80. tps = 11339.707975 (excluding connections establishing)

sclae=100时,大概快了1倍

点击(此处)折叠或打开

  1. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
  2. transaction type: SELECT only
  3. scaling factor: 100
  4. query mode: simple
  5. number of clients: 1
  6. number of threads: 1
  7. duration: 3 s
  8. number of transactions actually processed: 1324
  9. tps = 441.307443 (including connections establishing)
  10. tps = 442.905056 (excluding connections establishing)
  11. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
  12. transaction type: SELECT only
  13. scaling factor: 100
  14. query mode: simple
  15. number of clients: 1
  16. number of threads: 1
  17. duration: 3 s
  18. number of transactions actually processed: 3226
  19. tps = 1075.038414 (including connections establishing)
  20. tps = 1076.599774 (excluding connections establishing)

  21. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db1
  22. transaction type: TPC-B (sort of)
  23. scaling factor: 100
  24. query mode: simple
  25. number of clients: 16
  26. number of threads: 1
  27. duration: 3 s
  28. number of transactions actually processed: 2194
  29. tps = 725.580208 (including connections establishing)
  30. tps = 741.318899 (excluding connections establishing)
  31. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 db2
  32. transaction type: TPC-B (sort of)
  33. scaling factor: 100
  34. query mode: simple
  35. number of clients: 16
  36. number of threads: 1
  37. duration: 3 s
  38. number of transactions actually processed: 3174
  39. tps = 1049.927110 (including connections establishing)
  40. tps = 1073.139131 (excluding connections establishing)

  41. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db1
  42. transaction type: SELECT only
  43. scaling factor: 100
  44. query mode: simple
  45. number of clients: 1
  46. number of threads: 1
  47. duration: 3 s
  48. number of transactions actually processed: 1385
  49. tps = 461.481151 (including connections establishing)
  50. tps = 462.109984 (excluding connections establishing)
  51. [chenhj@hanode1 ~]$ pgbench -n -l -c 1 -T 3 -S db2
  52. transaction type: SELECT only
  53. scaling factor: 100
  54. query mode: simple
  55. number of clients: 1
  56. number of threads: 1
  57. duration: 3 s
  58. number of transactions actually processed: 3527
  59. tps = 1175.124543 (including connections establishing)
  60. tps = 1177.144768 (excluding connections establishing)

  61. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db1
  62. transaction type: SELECT only
  63. scaling factor: 100
  64. query mode: simple
  65. number of clients: 16
  66. number of threads: 1
  67. duration: 3 s
  68. number of transactions actually processed: 6313
  69. tps = 2092.075168 (including connections establishing)
  70. tps = 2137.468368 (excluding connections establishing)
  71. [chenhj@hanode1 ~]$ pgbench -n -l -c 16 -T 3 -S db2
  72. transaction type: SELECT only
  73. scaling factor: 100
  74. query mode: simple
  75. number of clients: 16
  76. number of threads: 1
  77. duration: 3 s
  78. number of transactions actually processed: 14414
  79. tps = 4775.127727 (including connections establishing)
  80. tps = 4886.910106 (excluding connections establishing)

2种索引的大小差不多

  1. db1=# select pg_table_size('pgbench_accounts_pkey'::regclass);
  2.  pg_table_size
  3. ---------------
  4.      224641024
  5. (1 row)

  6. db2=# select pg_table_size('pgbench_accounts_aid_idx'::regclass);
  7.  pg_table_size
  8. ---------------
  9.      268451840
  10. (1 row)


相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
5月前
|
监控 关系型数据库 数据库
PostgreSQL的索引优化策略?
【8月更文挑战第26天】PostgreSQL的索引优化策略?
118 1
|
5月前
|
SQL 关系型数据库 MySQL
SQL Server、MySQL、PostgreSQL:主流数据库SQL语法异同比较——深入探讨数据类型、分页查询、表创建与数据插入、函数和索引等关键语法差异,为跨数据库开发提供实用指导
【8月更文挑战第31天】SQL Server、MySQL和PostgreSQL是当今最流行的关系型数据库管理系统,均使用SQL作为查询语言,但在语法和功能实现上存在差异。本文将比较它们在数据类型、分页查询、创建和插入数据以及函数和索引等方面的异同,帮助开发者更好地理解和使用这些数据库。尽管它们共用SQL语言,但每个系统都有独特的语法规则,了解这些差异有助于提升开发效率和项目成功率。
553 0
|
5月前
|
关系型数据库 数据库 PostgreSQL
PostgreSQL索引维护看完这篇就够了
PostgreSQL索引维护看完这篇就够了
370 0
|
存储 关系型数据库 数据库
PostgreSQL技术大讲堂 - 第28讲:索引内部结构
从零开始学PostgreSQL技术大讲堂 - 第28讲:索引内部结构
795 2
|
关系型数据库 Go 数据库
《提高查询速度:PostgreSQL索引实用指南》
《提高查询速度:PostgreSQL索引实用指南》
600 0
|
关系型数据库 分布式数据库 数据库
PolarDB for PostgreSQL 14:全局索引
PolarDB for PostgreSQL 14 相较于 PostgreSQL 14,提供了更多企业级数据库的特性。本实验将体验其中的全局索引功能。
|
弹性计算 关系型数据库 OLAP
AnalyticDB PostgreSQL版向量索引查询
本案例对比了传统查询和使用向量索引执行查询的执行时间,助您体验使用向量索引查询带来的高效和快捷。
|
存储 SQL 关系型数据库
PostgreSQL插件HypoPG:支持虚拟索引
PostgreSQL插件HypoPG:支持虚拟索引
422 0
|
存储 缓存 关系型数据库
PostgreSQL 14新特性--减少索引膨胀
PostgreSQL 14新特性--减少索引膨胀
499 0
|
关系型数据库 PostgreSQL 索引
PostgreSQL通过索引获取heap tuple解析
PostgreSQL通过索引获取heap tuple解析
184 0