Partitioned Index - Alibaba Cloud RDS PostgreSQL Best Practices

2018-01-10 2118

Introduction: When should you partition a table in your database? Learn how to split an index into partitions with PostgreSQL partial indexes.

Background

When you have a very large table, you may want to partition it. For example, a user table can be split into many tables by user ID (hash) or by range.

Similarly, a behavior data table can be partitioned by time and split into multiple tables.

Advantages of table partitioning:

Partitions can be stored in different tablespaces that map to different block devices. For example, historical data, which is usually large but rarely accessed, can be stored in a tablespace on HDD, while active data is stored in a tablespace on SSD.
Tables are easier to maintain after partitioning. For example, you can remove historical data by dropping a partition with DROP TABLE, which avoids the large volume of REDO (WAL) that deleting the same rows one by one would generate.
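
The two advantages above can be sketched with declarative partitioning. This is a minimal sketch, assuming PostgreSQL 10 or later; the table name behavior_log and the tablespaces tbs_hdd and tbs_ssd are hypothetical and are not part of the original article. The tablespaces would have to be created beforehand with CREATE TABLESPACE.

```sql
-- Assumes PostgreSQL 10+ (declarative partitioning).
-- behavior_log, tbs_hdd and tbs_ssd are hypothetical names; the tablespaces
-- must already exist before these statements are run.
create table behavior_log (uid int, crt_time timestamp, info text)
  partition by range (crt_time);

-- Historical, rarely read data goes to the HDD-backed tablespace.
create table behavior_log_2017_01 partition of behavior_log
  for values from ('2017-01-01') to ('2017-02-01') tablespace tbs_hdd;

-- Active data goes to the SSD-backed tablespace.
create table behavior_log_2017_12 partition of behavior_log
  for values from ('2017-12-01') to ('2018-01-01') tablespace tbs_ssd;

-- Retire a whole month at once instead of deleting its rows one by one.
drop table behavior_log_2017_01;
```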

In fact, indexes can also be partitioned, e.g. partitioning by user ID hash or by time. Aside from having the same advantages as table partitions, index partitions also feature the following advantages:

1. You do not need to create indexes for data that you do not search for. Taking a user table as an example, if we only search for active users and never for inactive users, we can create indexes only for active users.
2. For data with different distributions, you can use different index types. For example, when data distribution in a table is uneven, some values appear frequently, while other values appear rarely. We can use GIN indexes (or bitmap indexes, in databases that provide them) for the values that appear frequently, and B-tree indexes for the values that appear rarely.
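
The first point might look like the following. This is only a sketch: the table users and the columns is_active and last_login are illustrative names that do not come from the article.

```sql
-- users, is_active and last_login are illustrative names (assumptions).
create table users (uid int, is_active boolean, last_login timestamp, info text);

-- Only rows where is_active is true are indexed; inactive users add nothing
-- to the index.
create index idx_users_active_login on users (last_login) where is_active;

-- The query repeats the index condition, so the planner can consider the
-- partial index.
select * from users where is_active and last_login >= '2017-12-01';
```

A query that omits the is_active condition cannot use this index, because the index does not contain the inactive rows.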

Let's look at how to implement index partitioning in PostgreSQL. The building block is the partial index, that is, an index created with a WHERE clause.

Global Index

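The usual approach is a single global index that covers every row in the table. It is the easiest to create, but on a very large table the index itself becomes large, which can make it less efficient to search and maintain than a set of partitioned indexes.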

create table test(id int, crt_time timestamp, info text);  
  
create index idx_test_id on test(id);

Primary Partition Index

We can split the index, rather than the table, into multiple parts by creating one partial index per range. In this example, the indexes on id are partitioned by crt_time, one per month.

create table test(id int, crt_time timestamp, info text);

-- Partitioned indexes are as follows.
-- Half-open ranges (>= and <) are used because BETWEEN is inclusive at both
-- ends and would put boundary rows such as '2017-02-01' into two indexes.

create index idx_test_id_1 on test(id) where crt_time >= '2017-01-01' and crt_time < '2017-02-01';
create index idx_test_id_2 on test(id) where crt_time >= '2017-02-01' and crt_time < '2017-03-01';
...
create index idx_test_id_12 on test(id) where crt_time >= '2017-12-01' and crt_time < '2018-01-01';
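
A partial index is only considered when the planner can prove that the query's WHERE clause implies the index's WHERE clause. The queries below are a sketch of that rule against the test table above; the constants are made up for illustration, and the plan actually chosen depends on table statistics.

```sql
-- Candidate for idx_test_id_1: the requested time range lies entirely
-- inside January 2017, so the index condition is implied.
explain select * from test
 where id = 42 and crt_time >= '2017-01-10' and crt_time < '2017-01-20';

-- Not a candidate for any of the partial indexes: without a crt_time
-- condition the planner cannot prove that all matching rows are in one index.
explain select * from test where id = 42;
```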

Multilayer Partition Index

We can divide the index further by adding a second condition to each partial index. In this example, we combine a province_code condition with the crt_time range to create a multilayer partition index. The listing below shows six of these indexes; the full set has one index per month for each province.

create table test(id int, crt_time timestamp, province_code int, info text);

-- Partitioned indexes are as follows.
-- Half-open ranges (>= and <) keep boundary timestamps out of two indexes.

create index idx_test_id_1_1 on test(id) where crt_time >= '2017-01-01' and crt_time < '2017-02-01' and province_code=1;
create index idx_test_id_1_2 on test(id) where crt_time >= '2017-02-01' and crt_time < '2017-03-01' and province_code=1;
...
create index idx_test_id_1_12 on test(id) where crt_time >= '2017-12-01' and crt_time < '2018-01-01' and province_code=1;

....

create index idx_test_id_2_1 on test(id) where crt_time >= '2017-01-01' and crt_time < '2017-02-01' and province_code=2;
create index idx_test_id_2_2 on test(id) where crt_time >= '2017-02-01' and crt_time < '2017-03-01' and province_code=2;
...
create index idx_test_id_2_12 on test(id) where crt_time >= '2017-12-01' and crt_time < '2018-01-01' and province_code=2;
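
Writing one statement per month and province by hand gets tedious. As an alternative to the hand-written statements, a PL/pgSQL block along these lines could generate them. This is a sketch under assumptions: the province codes 1 and 2 and the year 2017 are taken from the listing above, the loop itself is not from the article, and it would fail if indexes with the same names already exist.

```sql
-- Assumption: province codes 1 and 2 only; extend the array as needed.
do $$
declare
  p int;   -- province code
  m date;  -- first day of the month being indexed
begin
  foreach p in array array[1, 2] loop
    for i in 0..11 loop
      m := (date '2017-01-01' + i * interval '1 month')::date;
      -- Builds e.g. idx_test_id_1_1 for province 1, January 2017.
      execute format(
        'create index %I on test (id) where crt_time >= %L and crt_time < %L and province_code = %s',
        format('idx_test_id_%s_%s', p, i + 1),
        m,
        (m + interval '1 month')::date,
        p);
    end loop;
  end loop;
end
$$;
```

The half-open range (>= start, < next month) keeps each timestamp in exactly one index per province.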

Example of Partitioning Unevenly Distributed Data

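We can also use different index types for different parts of the same column: a GIN index for the range of values that repeat heavily, and a B-tree index for the range of values that rarely repeat.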

create table test(uid int, crt_time timestamp, province_code int, info text);

-- GIN has no built-in operator class for int; the btree_gin extension provides one
create extension if not exists btree_gin;

create index idx_test_1 on test using gin(uid) where uid<1000;     -- This section contains a large number of repeated values (high-frequency values), so we can use a gin index to accelerate the operation
create index idx_test_2 on test using btree(uid) where uid>=1000;  -- This section contains low-frequency values, so we can use a btree index to accelerate the operation
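
To decide where the cutoff between high-frequency and low-frequency values belongs, and to check which index a lookup can use, queries like the following may help. They are a sketch: the uid constants are arbitrary, and the plan that EXPLAIN reports depends on the data and statistics.

```sql
-- Inspect how often each uid occurs, to judge where the cutoff should be.
select uid, count(*) as cnt from test group by uid order by cnt desc limit 20;

-- A value in the high-frequency range: the constant satisfies uid < 1000,
-- so the partial GIN index is a candidate.
explain select * from test where uid = 5;

-- A value in the low-frequency range: the constant satisfies uid >= 1000,
-- so the partial B-tree index is a candidate.
explain select * from test where uid = 123456;
```

If the value is passed as a parameter rather than a constant, the planner may not be able to prove which index applies, so adding the range condition explicitly (for example, and uid < 1000) is the safer choice.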

Summary

1. To search with a partitioned index, the query must include the index's partitioning condition together with the indexed fields and operators that the index supports.

2. Partitioned indexes are generally used in searches with multiple conditions, where the partitioning condition is one of the search conditions. They can also be used when searching on a single column.

3. PostgreSQL supports not only partitioned indexes, but also expression indexes and functional indexes.
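
As a brief illustration of the third point, an expression index can also carry a partitioning condition. The index names and the lower(info) expression below are illustrative choices, not from the article.

```sql
-- Expression index: indexes the result of lower(info), not the raw column.
create index idx_test_lower_info on test (lower(info));

-- The query must use the same expression for the index to match.
select * from test where lower(info) = 'abc';

-- Expression index combined with a partitioning condition.
create index idx_test_lower_info_2017_12 on test (lower(info))
 where crt_time >= '2017-12-01' and crt_time < '2018-01-01';
```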

Visit Alibaba Cloud RDS PostgreSQL to learn more.
