A first look at cleaning up bloat with pg_repack

Introduction:

I. Software installation


1. Software requirements:

postgresql-9.5.2.tar.gz

pg_repack-1.3.4.zip


2. Installing pg_repack


[root@localhost pg_repack-1.3.4]# export PATH=/opt/pgsql/9.5.2/bin:$PATH

[root@localhost pg_repack-1.3.4]# export LD_LIBRARY_PATH=/opt/pgsql/9.5.2/lib

[root@localhost pg_repack-1.3.4]# export MANPATH=/opt/pgsql/9.5.2/share/man:$MANPATH

[root@localhost pg_repack-1.3.4]# make

make[1]: Entering directory `/home/soft/pg_repack-1.3.4/bin'

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -I/opt/pgsql/9.5.2/include -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE   -c -o pg_repack.o pg_repack.c

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -I/opt/pgsql/9.5.2/include -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE   -c -o pgut/pgut.o pgut/pgut.c

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -I/opt/pgsql/9.5.2/include -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE   -c -o pgut/pgut-fe.o pgut/pgut-fe.c

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 pg_repack.o pgut/pgut.o pgut/pgut-fe.o -L/opt/pgsql/9.5.2/lib -lpq -L/opt/pgsql/9.5.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/pgsql/9.5.2/lib',--enable-new-dtags  -lpgcommon -lpgport -lz -lreadline -lrt -lcrypt -ldl -lm -o pg_repack

make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/bin'

make[1]: Entering directory `/home/soft/pg_repack-1.3.4/lib'

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE   -c -o repack.o repack.c

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE   -c -o pgut/pgut-be.o pgut/pgut-be.c

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE   -c -o pgut/pgut-spi.o pgut/pgut-spi.c

( echo '{ global:'; gawk '/^[^#]/ {printf "%s;\n",$1}' exports.txt; echo ' local: *; };' ) >exports.list

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -shared -Wl,--version-script=exports.list -o pg_repack.so repack.o pgut/pgut-be.o pgut/pgut-spi.o -L/opt/pgsql/9.5.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/pgsql/9.5.2/lib',--enable-new-dtags  

sed 's,REPACK_VERSION,1.3.4,g' pg_repack.sql.in > pg_repack--1.3.4.sql;

sed 's,REPACK_VERSION,1.3.4,g' pg_repack.control.in > pg_repack.control

make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/lib'

make[1]: Entering directory `/home/soft/pg_repack-1.3.4/regress'

make[1]: Nothing to be done for `all'.

make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/regress'

[root@localhost pg_repack-1.3.4]# make install

make[1]: Entering directory `/home/soft/pg_repack-1.3.4/bin'

/bin/mkdir -p '/opt/pgsql/9.5.2/bin'

/usr/bin/install -c  pg_repack '/opt/pgsql/9.5.2/bin'

make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/bin'

make[1]: Entering directory `/home/soft/pg_repack-1.3.4/lib'

/bin/mkdir -p '/opt/pgsql/9.5.2/lib'

/bin/mkdir -p '/opt/pgsql/9.5.2/share/extension'

/bin/mkdir -p '/opt/pgsql/9.5.2/share/extension'

/usr/bin/install -c -m 755  pg_repack.so '/opt/pgsql/9.5.2/lib/pg_repack.so'

/usr/bin/install -c -m 644 .//pg_repack.control '/opt/pgsql/9.5.2/share/extension/'

/usr/bin/install -c -m 644  pg_repack--1.3.4.sql pg_repack.control '/opt/pgsql/9.5.2/share/extension/'

make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/lib'

make[1]: Entering directory `/home/soft/pg_repack-1.3.4/regress'

make[1]: Nothing to be done for `install'.

make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/regress'

[root@localhost pg_repack-1.3.4]# 
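A quick way to confirm the install is to check that the files copied by `make install` are where PostgreSQL will look for them. The sketch below is a hypothetical helper (the name `check_repack_install` is not part of pg_repack). It only tests for three of the paths that appear in the output above and assumes the same prefix, /opt/pgsql/9.5.2.

```shell
#!/bin/sh
# Hypothetical helper: report whether the files that `make install` copied
# in the transcript above exist under a given PostgreSQL prefix.
check_repack_install() {
    prefix=$1
    missing=0
    for f in bin/pg_repack lib/pg_repack.so share/extension/pg_repack.control; do
        if [ -e "$prefix/$f" ]; then
            echo "ok       $prefix/$f"
        else
            echo "missing  $prefix/$f"
            missing=1
        fi
    done
    # Exit status 0 only when all three files were found.
    return "$missing"
}

# Example call; the prefix is the one used in this test environment.
check_repack_install /opt/pgsql/9.5.2 || echo "pg_repack is not fully installed under that prefix"
```

If any file is reported missing, CREATE EXTENSION pg_repack in the next step is likely to fail, so it is worth re-running `make install` with the right PATH first.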


3. Creating the initial environment


[postgres@localhost ~]$ createdb bloatdb

[postgres@localhost ~]$ psql -d bloatdb -c "create extension pgstattuple;"

CREATE EXTENSION

[postgres@localhost ~]$ psql -d bloatdb -c "CREATE EXTENSION pg_repack;"

CREATE EXTENSION

[postgres@localhost ~]$ 

$ psql bloatdb

psql (9.5.2)

Type "help" for help.


bloatdb=# \dx

                                   List of installed extensions

    Name     | Version |   Schema   |                         Description                          

-------------+---------+------------+--------------------------------------------------------------

 pg_repack   | 1.3.4   | public     | Reorganize tables in PostgreSQL databases with minimal locks

 pgstattuple | 1.3     | public     | show tuple-level statistics

 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language

(3 rows)



II. Static bloat cleanup tests (no active transactions)


1. Repacking one specific index of table tbl

1). Preparing the environment

bloatdb=# create table tbl(id int primary key, first varchar(20),second varchar(20));

CREATE TABLE

bloatdb=# create index idx_tbl_first on tbl (first);

CREATE INDEX

bloatdb=# create index idx_tbl_second on tbl (second);

CREATE INDEX

bloatdb=# SELECT count(*) FROM tbl;

 count 

-------

     0

(1 row)


bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));

 pg_size_pretty 

----------------

 24 kB

(1 row)


bloatdb=# INSERT INTO tbl VALUES(generate_series(1,10000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 10000

bloatdb=# SELECT count(*) FROM tbl;

 count 

-------

 10000

(1 row)


bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));

 pg_size_pretty 

----------------

 1584 kB

(1 row)


bloatdb=# 


Update one column in every row

bloatdb=# UPDATE tbl SET first= 'updated-001';

UPDATE 10000

bloatdb=# SELECT count(*) FROM tbl;

 count 

-------

 10000

(1 row)


bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));

 pg_size_pretty 

----------------

 3376 kB

(1 row)


bloatdb=# 
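The relation roughly doubles because PostgreSQL writes a new row version for every updated row and keeps the old version until it is vacuumed. As a small worked example, the growth between the two pg_total_relation_size readings above (1584 kB and 3376 kB) can be computed as follows; `growth_pct` is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Percent growth between two sizes given in kB.
growth_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) * 100 / before }'
}

# Sizes of tbl (table plus indexes) before and after the UPDATE in this test.
growth_pct 1584 3376   # prints 113.1
```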


2). Checking the bloat ratio

Create the bloat statistics table

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" --create_stats_table


Collect the bloat statistics

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.idx_tbl_second.......................................................(52.69%) 417 kB wasted

2. public.idx_tbl_first........................................................(52.64%) 413 kB wasted

3. public.tbl_pkey.............................................................(57.79%) 388 kB wasted

[postgres@localhost ~]$ 
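The report lines are regular enough to be filtered by a script, which helps when deciding which objects are worth repacking. The following is a minimal sketch, assuming the line layout is exactly the one shown in this article (rank, object name, a run of dots, the percentage in parentheses, then the wasted size). The function name `bloated_over` and the 10% threshold are illustrative choices, not part of pg_bloat_check.

```shell
#!/bin/sh
# Print "name percent" for every object whose bloat exceeds a threshold.
# Reads pg_bloat_check.py report lines on standard input.
bloated_over() {
    threshold=$1
    # sed extracts the object name and the percentage; awk applies the threshold.
    sed -n 's/^[0-9][0-9]*\. \(.*[^.]\)\.\.*(\([0-9.]*\)%) .* wasted *$/\1 \2/p' |
        awk -v t="$threshold" '$2 + 0 > t + 0 { print $1, $2 }'
}

# Sample input copied from a report in this article.
bloated_over 10 <<'EOF'
1. public.idx_tbl_second.......................................................(52.69%) 417 kB wasted
2. public.tbl_pkey.............................................................(57.79%) 388 kB wasted
3. public.idx_tbl_first.....................................................(0.93%) 3121 bytes wasted
EOF
# prints:
# public.idx_tbl_second 52.69
# public.tbl_pkey 57.79
```

In practice the function would read the output of pg_bloat_check.py through a pipe instead of a here-document.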


3). Removing the bloat

Repack one specific index in the database

[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_first

INFO: repacking index "public"."idx_tbl_first"

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl

1. public.idx_tbl_second.......................................................(52.69%) 417 kB wasted

2. public.tbl_pkey.............................................................(57.79%) 388 kB wasted

3. public.idx_tbl_first.....................................................(0.93%) 3121 bytes wasted

[postgres@localhost ~]$ 


2. Repacking all indexes of table tbl

1). Preparing the environment

bloatdb=# update tbl set second='chris';

UPDATE 10000

bloatdb=# SELECT count(*) FROM tbl;

 count 

-------

 10000

(1 row)


bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));

 pg_size_pretty 

----------------

 3600 kB

(1 row)


bloatdb=#

bloatdb=# update tbl set first='chris';

UPDATE 10000

bloatdb=# SELECT count(*) FROM tbl;

 count 

-------

 10000

(1 row)


bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));

 pg_size_pretty 

----------------

 4176 kB

(1 row)


bloatdb=# 

2). Checking the bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.idx_tbl_second.......................................................(59.94%) 820 kB wasted

2. public.idx_tbl_first........................................................(40.94%) 409 kB wasted

3. public.tbl_pkey.............................................................(28.73%) 193 kB wasted

[postgres@localhost ~]$ 


3). Repacking all indexes of table tbl

[postgres@localhost ~]$ pg_repack -d bloatdb --table tbl --only-indexes

INFO: repacking indexes of "tbl"

INFO: repacking index "public"."idx_tbl_first"

INFO: repacking index "public"."idx_tbl_second"

INFO: repacking index "public"."tbl_pkey"

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.idx_tbl_first.....................................................(1.23%) 3028 bytes wasted

2. public.idx_tbl_second....................................................(1.23%) 3028 bytes wasted

3. public.tbl_pkey..........................................................(1.23%) 3028 bytes wasted

[postgres@localhost ~]$ 


3. Repacking both the data and the indexes of tbl

1). Index bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl

1. public.idx_tbl_first.........................................................(57.87%) 49 MB wasted

2. public.idx_tbl_second........................................................(39.29%) 34 MB wasted

3. public.tbl_pkey..............................................................(51.22%) 26 MB wasted

 

2). Removing the bloat: an online VACUUM FULL of table tbl in database bloatdb (data and indexes)

[postgres@localhost ~]$ pg_repack --no-order --table tbl -d bloatdb

INFO: repacking table "tbl"

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl

1. public.tbl_pkey..............................................................(0.0%) 0 bytes wasted

2. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted

3. public.idx_tbl_first.........................................................(0.0%) 0 bytes wasted

[postgres@localhost ~]$ 


III. Bloat cleanup under load (with transactions running)


1. Repacking the whole table

1). Initial conditions

-- clear table data

bloatdb=# select * from tbl;

 id | first | second 

----+-------+--------

(0 rows)


bloatdb=# 

bloatdb=# INSERT INTO tbl VALUES(generate_series(1,100000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 100000

bloatdb=# UPDATE tbl SET first= 'updated-001';

UPDATE 100000

bloatdb=# 

-- check bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.idx_tbl_second........................................................(67.26%) 17 MB wasted

2. public.idx_tbl_first.........................................................(67.46%) 17 MB wasted

3. public.tbl_pkey............................................................(63.91%) 9832 kB wasted

[postgres@localhost ~]$ 


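2). Repacking during a bulk insert

Set statement_timeout=0 and, depending on the situation, adjust maintenance_work_mem and wal_keep_segments (streaming, SSD<2000>).

Start the insert first, then repack while it is still running, passing -T (--wait-timeout) with a value of 3600.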

-- session 1: insert data

bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

(cursor blinking: the statement is still running)

-- session 2: repack during insert

$ pg_repack -d bloatdb --no-order --table tbl --wait-timeout=3600

INFO: repacking table "tbl"

(cursor blinking: the command is still running)

############################## args: -j ###########################################

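When several tables are given with --table, they are processed one after another. With the -j option, pg_repack starts several background workers to build the indexes of the temporary table in parallel, normally one worker per index, until min(j, tbl_idx_number) workers have been created, where tbl_idx_number is the total number of indexes on the table. If j is smaller than the number of indexes, a worker that finishes its index is automatically assigned one of the remaining indexes. If j is larger than the number of indexes, only as many workers as there are indexes are dispatched, all at once.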
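The min(j, index count) rule can be checked with a little arithmetic. This is only an illustrative sketch of the rule as described in this section, not pg_repack's own code; the function names `initial_workers` and `queued_indexes` are hypothetical. The two cases correspond to the runs in this section: -j 10 and -j 2 on a table with three indexes.

```shell
#!/bin/sh
# Number of index-build workers that start at once: min(j, index count).
initial_workers() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Indexes left waiting for a free worker: max(index count - j, 0).
queued_indexes() {
    if [ "$2" -gt "$1" ]; then echo $(( $2 - $1 )); else echo 0; fi
}

for j in 10 2; do
    echo "-j $j, 3 indexes: $(initial_workers "$j" 3) initial workers, $(queued_indexes "$j" 3) queued"
done
# prints:
# -j 10, 3 indexes: 3 initial workers, 0 queued
# -j 2, 3 indexes: 2 initial workers, 1 queued
```

With -j 2, the one queued index is the one reported by the "Assigning worker 0 to build index #2" log line once worker 0 becomes free.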

$ pg_repack -j 10 --no-order -d bloatdb --table tbl --wait-timeout=3600

NOTICE: Setting up workers.conns

INFO: repacking table "tbl"

LOG: Initial worker 0 to build index: CREATE UNIQUE INDEX index_22025 ON repack.table_22022 USING btree (id)

LOG: Initial worker 1 to build index: CREATE INDEX index_22027 ON repack.table_22022 USING btree (first)

LOG: Initial worker 2 to build index: CREATE INDEX index_22028 ON repack.table_22022 USING btree (second)

LOG: Command finished in worker 0: CREATE UNIQUE INDEX index_22025 ON repack.table_22022 USING btree (id)

LOG: Command finished in worker 1: CREATE INDEX index_22027 ON repack.table_22022 USING btree (first)

LOG: Command finished in worker 2: CREATE INDEX index_22028 ON repack.table_22022 USING btree (second)

$

Several tables specified, with j < idx_numbers

$ pg_repack -j 2 --no-order -d bloatdb --table tbl -t tbl01 --wait-timeout=3600

NOTICE: Setting up workers.conns

INFO: repacking table "tbl"

LOG: Initial worker 0 to build index: CREATE UNIQUE INDEX index_22025 ON repack.table_22022 USING btree (id)

LOG: Initial worker 1 to build index: CREATE INDEX index_22027 ON repack.table_22022 USING btree (first)

LOG: Command finished in worker 0: CREATE UNIQUE INDEX index_22025 ON repack.table_22022 USING btree (id)

LOG: Assigning worker 0 to build index #2: CREATE INDEX index_22028 ON repack.table_22022 USING btree (second)

LOG: Command finished in worker 1: CREATE INDEX index_22027 ON repack.table_22022 USING btree (first)

LOG: Command finished in worker 0: CREATE INDEX index_22028 ON repack.table_22022 USING btree (second)

(If a long transaction is open during processing, pg_repack waits for it to finish)

NOTICE: Waiting for 1 transactions to finish. First PID: 10426

(the same NOTICE line repeats, 71 times in total in this run, until the long transaction finishes)

INFO: repacking table "tbl01"

LOG: Initial worker 0 to build index: CREATE UNIQUE INDEX index_22065 ON repack.table_22062 USING btree (id)

LOG: Initial worker 1 to build index: CREATE INDEX index_22067 ON repack.table_22062 USING btree (first)

LOG: Command finished in worker 0: CREATE UNIQUE INDEX index_22065 ON repack.table_22062 USING btree (id)

LOG: Assigning worker 0 to build index #2: CREATE INDEX index_22068 ON repack.table_22062 USING btree (second)

LOG: Command finished in worker 1: CREATE INDEX index_22067 ON repack.table_22062 USING btree (first)

LOG: Command finished in worker 0: CREATE INDEX index_22068 ON repack.table_22062 USING btree (second)


##################################################################################
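While pg_repack prints "Waiting for 1 transactions to finish. First PID: 10426", it can help to look at what that backend is doing before deciding whether to keep waiting. The following is a minimal sketch that assumes the standard pg_stat_activity view; the helper name `show_blocker_cmd` is hypothetical, and it only prints the psql command so that it can be reviewed before being run.

```shell
#!/bin/sh
# Print (do not run) a psql command that shows what a given backend is doing.
# Arguments: database name, backend PID taken from the pg_repack NOTICE line.
show_blocker_cmd() {
    # Refuse anything that is not a plain number, since it is pasted into SQL.
    case $2 in
        ''|*[!0-9]*) echo "pid must be numeric" >&2; return 1 ;;
    esac
    printf 'psql -d %s -c "SELECT pid, usename, state, xact_start, query FROM pg_stat_activity WHERE pid = %s;"\n' "$1" "$2"
}

# Database and PID as they appear in this test.
show_blocker_cmd bloatdb 10426
```

The xact_start column shows how long the transaction has been open, which is usually what decides whether a --wait-timeout of 3600 seconds is enough.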

--session 1 finish insert

bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 2900000

bloatdb=# 


-- session 2: finish repack

[postgres@localhost ~]$ pg_repack -d bloatdb --no-order --table tbl --wait-timeout=3600

INFO: repacking table "tbl"

-- session 2: check bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.tbl_pkey..............................................................(0.0%) 0 bytes wasted

2. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted

3. public.idx_tbl_first.........................................................(0.0%) 0 bytes wasted

[postgres@localhost ~]$ 

-- session 1: check data

bloatdb=# select count(*) from tbl ;

  count  

---------

 3000000

(1 row)


bloatdb=# 


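2. Repacking all indexes of table tbl

When tbl has several indexes, they are repacked one after another by default, even if -j is set to a value greater than 1.


1). Preparing the data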


--session 1: insert data

bloatdb=# delete FROM tbl;

DELETE 3000000

bloatdb=# INSERT INTO tbl VALUES(generate_series(1,100000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 100000

bloatdb=# update tbl set first='chris';

UPDATE 100000

bloatdb=# 


-- session 2: check bloat


[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.tbl_pkey..............................................................(41.14%) 28 MB wasted

2. public.idx_tbl_second.......................................................(4.32%) 4471 kB wasted

3. public.idx_tbl_first........................................................(2.96%) 2889 kB wasted

[postgres@localhost ~]$


2).online insert and repack

--session 1: insert large data

bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

(cursor blinking: the statement is still running)


-- session 2: repack while session 1 is inserting the bulk data

[postgres@localhost ~]$ pg_repack -d bloatdb --table tbl --only-indexes --wait-timeout=3600

INFO: repacking indexes of "tbl"

INFO: repacking index "public"."idx_tbl_first"

INFO: repacking index "public"."idx_tbl_second"

(cursor blinking: the command is still running)

--session 1: insert finish

bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 2900000

bloatdb=# 

--session 2: repack finish

[postgres@localhost ~]$ pg_repack -d bloatdb --table tbl --only-indexes -T 3600

INFO: repacking indexes of "tbl"

INFO: repacking index "public"."idx_tbl_first"

INFO: repacking index "public"."idx_tbl_second"

INFO: repacking index "public"."tbl_pkey"


3) check table data and index bloat

--session 2: check bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.tbl_pkey..............................................................(0.0%) 0 bytes wasted

2. public.idx_tbl_first.........................................................(0.0%) 0 bytes wasted

3. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted

[postgres@localhost ~]$ 

--session 1: check table data

bloatdb=# select count(*) from tbl;

  count  

---------

 3000000

(1 row)


bloatdb=# 



3. Repacking one specific index of table tbl


Note: --index (which by default builds the specified index with CONCURRENTLY) cannot be combined with the --only-indexes option.

[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_first --only-indexes 

ERROR: cannot specify --index (-i) and --only-indexes (-x)


1). Preparing the data

-- prepare data

bloatdb=# delete FROM tbl;

DELETE 3000000

bloatdb=# INSERT INTO tbl VALUES(generate_series(1,100000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 100000

bloatdb=# update tbl set first='chris';

UPDATE 100000

bloatdb=# 

-- check bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.idx_tbl_second........................................................(47.57%) 97 MB wasted

2. public.tbl_pkey.............................................................(9.44%) 7206 kB wasted

3. public.idx_tbl_first........................................................(3.11%) 3040 kB wasted

[postgres@localhost ~]$ 



2).online insert and repack

--session 1: insert large data

bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

(cursor blinking: the statement is still running)


-- session 2: repack while session 1 is inserting the bulk data

[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_second --wait-timeout=3600

INFO: repacking index "public"."idx_tbl_second"

(cursor blinking: the command is still running)


--session 1: insert finish

bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 2900000

bloatdb=# 

--session 2: repack finish

[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_second --wait-timeout=3600

INFO: repacking index "public"."idx_tbl_second"

[postgres@localhost ~]$ 


3) check table data and index bloat

--session 2: check bloat

[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl 

1. public.idx_tbl_first........................................................(50.77%) 102 MB wasted

2. public.tbl_pkey...............................................................(47.6%) 65 MB wasted

3. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted

[postgres@localhost ~]$ 

--session 1: check table data

bloatdb=# select count(*) from tbl;

  count  

---------

 3000000

(1 row)


bloatdb=# 

 


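Test conclusions:

  1. Under otherwise equal conditions, indexes generally bloat more readily than table data.

  2. When disk space is tight, repack the indexes one at a time.

  3. Bloat cleanup generally needs free disk space of about twice the size of the object, so check the free space before starting.

  4. Check which PostgreSQL versions your pg_repack release supports. As of 2016-11-26, 9.6 was not yet supported; see http://pgxn.org/dist/pg_repack/doc/pg_repack.html#Releases

  5. When repacking a table or index that has online transactions, set the --wait-timeout parameter; 1800 or 3600 is typical (special thanks to Li Hailong (李海龙) for the suggestion).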
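Conclusion 3 can be turned into a simple pre-flight check. The sketch below assumes the rule of thumb stated above (free space of at least twice the object size) and uses POSIX `df -Pk` to read the free space. The function names are hypothetical, and the directory checked defaults to the current one when PGDATA is not set, which is only a stand-in for the real data directory.

```shell
#!/bin/sh
# Return 0 when free space is at least twice the object size (both in kB).
enough_space_for_repack() {
    object_kb=$1
    free_kb=$2
    [ "$free_kb" -ge $(( object_kb * 2 )) ]
}

# Free kB on the filesystem that holds a directory (column 4 of `df -Pk`).
free_kb_on() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example: 4176 kB was the total size of tbl before all its indexes were repacked.
if enough_space_for_repack 4176 "$(free_kb_on "${PGDATA:-.}")"; then
    echo "enough free space"
else
    echo "not enough free space: repack one index at a time or free up disk first"
fi
```

The object size itself can be taken from pg_total_relation_size, as in the size checks earlier in this article.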

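Disclaimer: these notes apply only to this test environment. In production, run pg_repack during off-peak hours, and to keep the data safe, back it up first and only then remove the bloat.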





Reposted from pgmia's 51CTO blog; original link: http://blog.51cto.com/heyiyi/1876843