问题:
11-30,压缩程度不同,压缩后的空间大小是否空间空间不同
11-30,压缩后,空间与压缩前的空间大小是否不同
不同的压缩率,数据空间一样
tutorial=> select pg_size_pretty(pg_total_relation_size('test_appendonly'));
pg_size_pretty
----------------
324 kB
(1 row)
tutorial=> select pg_size_pretty(pg_total_relation_size('test_appendonly_1'));
pg_size_pretty
----------------
324 kB
(1 row)
问题:为什么压缩了,空间反而变大了
tutorial=> create table t_ab as select generate_series(0,1000) a,'helloworld'::varchar(50) b distri
buted by (a);
tutorial=> select pg_size_pretty(pg_total_relation_size('t_ab'));
pg_size_pretty
----------------
160 kB
(1 row)
正题
对于数据仓库应用来说,由于数据量大,而且每次分析都是全表扫描,引入压缩表
压缩表 Appendonly表
1)压缩表的应用场景
业务上不需要对表进行更新和删除操作,用truncate+insert就可以实现业务逻辑
访问表基本上是全表扫描,不需要再表上建立索引
不能经常对表进行加字段或修改字段类型,对appendonly表添加字段比普通表慢
2)语法介绍
建表加上 with (appendonly=true)就可以指定为appendonly表,如果需要建立压缩表,则要加上(appendonly = true,compresslevel=5),其中compresslevel是压缩率,取值为1~9,一般选择5就足够了
appendonly 表特性
tutorial=> create table test_appendonly with (appendonly = true,compresslevel =5) as
select generate_series(0,1000) a,'helloworld'::varchar(50) b distributed by (a);SELECT 1001
tutorial=> create table test_appendonly_1 with (appendonly = true,compresslevel =1) as
select generate_series(0,1000) a,'helloworld'::varchar(50) b distributed by (a);SELECT 1001
tutorial=> select oid,oid::regclass from pg_class where relname ='test_appendonly' or relname like
'%87096%'; oid | oid
-------+-------------------------------
87098 | pg_toast.pg_toast_87096
87099 | pg_toast.pg_toast_87096_index
87096 | test_appendonly
87101 | pg_aoseg.pg_aoseg_87096
87102 | pg_aoseg.pg_aoseg_87096_index
(5 rows)
tutorial=> \d pg_toast.pg_toast_87096;
TOAST table "pg_toast.pg_toast_87096"
Column | Type
------------+---------
chunk_id | oid
chunk_seq | integer
chunk_data | bytea
tutorial=> \d pg_toast.pg_toast_87096_index;
Index "pg_toast.pg_toast_87096_index"
Column | Type | Definition
-----------+---------+------------
chunk_id | oid | chunk_id
chunk_seq | integer | chunk_seq
primary key, btree, for table "pg_toast.pg_toast_87096"
Tablespace: "ts_gh"
tutorial=> \d pg_toast.pg_aoseg.pg_aoseg_87096;
?o? "pg_aoseg.pg_aoseg_87096"
Column | Type
-----------------+------------------
segno | integer
eof | double precision
tupcount | double precision
varblockcount | double precision
eofuncompressed | double precision
tutorial=> \d pg_toast.pg_aoseg.pg_aoseg_87096_index;
Index "pg_aoseg.pg_aoseg_87096_index"
Column | Type | Definition
--------+---------+------------
segno | integer | segno
primary key, btree, for table "pg_aoseg.pg_aoseg_87096"
Tablespace: "ts_gh"
gp_dist_random查询子节点信息
get_ao_compression_ration:查询表的压缩率与gp_dist_random查询底层pg_aoseg前缀表一样
tutorial=# select get_ao_compression_ratio('test_appendonly');
get_ao_compression_ratio
--------------------------
6.28
(1 row)
tutorial=# select sum(eofuncompressed)/sum(eof) from gp_dist_random('pg_aoseg.pg_aoseg_87096')
; ?column?
------------------
6.28131021194605
(1 row)
get_ao_distribution:查询每个节点的数据量
tutorial=# select * from get_ao_distribution('test_appendonly') order by segmentid;
segmentid | tupcount
-----------+----------
0 | 501
1 | 500
(2 rows)