背景
凡是支持HASH函数,以及相等operator的类型,都可以使用bloom filter index .
扩展方法见本文。
原文
https://obartunov.livejournal.com/201027.html
Bloom index by default works for int4 and text, but other types with hash function and equality operator could be supported.
Just use opclass interface, for example, for type bigint
create extension bloom;
postgres=# select * from pg_opclass where opcname='bigint_ops';
(0 rows)
创建bigint的bloom filter index支持。
CREATE OPERATOR CLASS bigint_ops
DEFAULT FOR TYPE bigint USING bloom AS
OPERATOR 1 = (bigint, bigint),
FUNCTION 1 hashint8(bigint);
postgres=# select * from pg_opclass where opcname='bigint_ops';
(1 row)
Now, you can build bloom index for bigint data type.
Data types, which could be supported by bloom index.
查询可以支持bloom filter的类型,HASH函数
凡是支持HASH函数,以及相等operator操作符的类型,都可以使用bloom filter index .
postgres=# select oid,* from pg_am;
oid | amname | amhandler | amtype |
---|---|---|---|
403 | btree | bthandler | i |
405 | hash | hashhandler | i |
783 | gist | gisthandler | i |
2742 | gin | ginhandler | i |
4000 | spgist | spghandler | i |
3580 | brin | brinhandler | i |
18204 | rum | rumhandler | i |
136050 | bloom | blhandler | i |
(8 rows)
查询可以支持bloom filter的类型,HASH函数
SELECT oc.opcintype::regtype, p.amproc FROM pg_opclass oc
JOIN pg_amproc p ON p.amprocfamily = oc.opcfamily
WHERE oc.opcmethod = 405 -- hash am
AND oc.opcdefault -- 默认proc for this am
-- https://www.postgresql.org/docs/devel/static/xindex.html Strategies number
AND p.amprocnum = 1
AND p.amproclefttype = oc.opcintype
AND p.amprocrighttype = oc.opcintype;
opcintype | amproc
character | hashbpchar |
"char" | hashchar |
date | hashint4 |
anyarray | hash_array |
real | hashfloat4 |
double precision | hashfloat8 |
inet | hashinet |
smallint | hashint2 |
integer | hashint4 |
bigint | hashint8 |
interval | interval_hash |
macaddr | hashmacaddr |
name | hashname |
oid | hashoid |
oidvector | hashoidvector |
text | hashtext |
time without time zone | time_hash |
numeric | hash_numeric |
timestamp with time zone | timestamp_hash |
time with time zone | timetz_hash |
timestamp without time zone | timestamp_hash |
boolean | hashchar |
bytea | hashvarlena |
xid | hashint4 |
cid | hashint4 |
abstime | hashint4 |
reltime | hashint4 |
aclitem | hash_aclitem |
uuid | uuid_hash |
pg_lsn | pg_lsn_hash |
macaddr8 | hashmacaddr8 |
anyenum | hashenum |
anyrange | hash_range |
jsonb | jsonb_hash |
(34 rows)
创建索引,例子
postgres=# create table test(id int, c1 int8, c2 int8, c3 int8);
CREATE TABLE
postgres=# create index idx_test_1 on test using bloom (c1,c2,c3);
CREATE INDEX
postgres=# set enable_seqscan=off;
SET
postgres=# explain select * from test where c1=1::int8 and c2=1::int8 and c3=1::int8;
QUERY PLAN
Bitmap Heap Scan on test (cost=15.73..15.75 rows=1 width=28)
Recheck Cond: ((c1 = '1'::bigint) AND (c2 = '1'::bigint) AND (c3 = '1'::bigint))
-> Bitmap Index Scan on idx_test_1 (cost=0.00..15.73 rows=1 width=0)
Index Cond: ((c1 = '1'::bigint) AND (c2 = '1'::bigint) AND (c3 = '1'::bigint))
(4 rows)
postgres=# set enable_seqscan =on;
SET
postgres=# explain select * from test where c1=1::int8 and c2=1::int8 and c3=1::int8;
QUERY PLAN
Seq Scan on test (cost=0.00..9.85 rows=1 width=28)
Filter: ((c1 = '1'::bigint) AND (c2 = '1'::bigint) AND (c3 = '1'::bigint))
(2 rows)
参考
https://obartunov.livejournal.com/201027.html
https://www.postgresql.org/docs/devel/static/xindex.html
转自阿里云德哥