11g compression 新特性(1)

简介:
11g引入了大量compress相关的特性,其中之一便是dbms_compression包;GET_COMPRESSION_RATIO函数可以帮助我们了解压缩某个表后各种可能的影响。换而言之,这个函数可以让我们在具体实施表压缩技术或者测试前,对于压缩后的效果能有一个基本的印象。该包在11gr2中被首次引入,故而使用之前版本的包括11gr1都无缘得用。其次除OLTP压缩模式之外的柱形混合压缩只能在基于Exdata存储的表空间上实现。使用DBMS_COMPRESSION包获取的相关压缩信息是十分准确的,因为在评估过程中Oracle通过实际采样并建立模型表以尽可能还原逼真的数据。 我们可以通过trace来分析其评估过程中的具体操作,可以分成2步: 1. 建立原表的样本表,其采样值基于原表的大小:
?
1
2
3
SQL>  create  table  samp_dss_nation tablespace SCRATCH  as  select  from  dss_nation sample block (50);
 
Table  created.
2. 基于采用表建立对应压缩类型的模型表:
?
1
2
3
4
5
6
SQL>  create  table  model_dss_nation tablespace SCRATCH compress  for  query high  as  select  from  samp_dss_nation;
create  table  model_dss_nation tablespace SCRATCH compress  for  query high  as  select  from  samp_dss_nation
*
ERROR  at  line 1:
ORA-64307: hybrid columnar compression  is  only  supported  in  tablespaces
residing  on  Exadata storage
可以看到在实际建立过程中Oracle将拒绝在非Exdata存储的表空间上建立该类柱形混合压缩(包括:COMP_FOR_QUERY_HIGH,COMP_FOR_QUERY_LOW,COMP_FOR_ARCHIVE_HIGH,CO MP_FOR_ARCHIVE_LOW)。但DBMS_COMPRESSION在进行评估时可以绕过Oracle对于该类操作的LOCK. 要在没有Exdata存储设备的情况下使用dbms_compression包评测OLTP压缩模式外的柱状混合压缩模式时 (hybrid columnar compression is only supported in tablespaces residing on Exadata storage),首先需要打上patch 8896202:
?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
[oracle@rh2 admin]$ /s01/dbhome_1/OPatch/opatch lsinventory
Invoking OPatch 11.1.0.6.6
 
Oracle Interim Patch Installer version 11.1.0.6.6
Copyright (c) 2009, Oracle Corporation.   All  rights reserved.
 
Oracle Home       : /s01/dbhome_1
Central Inventory : /s01/oraInventory
from            : /etc/oraInst.loc
OPatch version    : 11.1.0.6.6
OUI version       : 11.2.0.1.0
OUI location      : /s01/dbhome_1/oui
Log file location : /s01/dbhome_1/cfgtoollogs/opatch/opatch2010-06-02_23-08-33PM.log
 
Patch history file: /s01/dbhome_1/cfgtoollogs/opatch/opatch_history.txt
 
Lsinventory  Output  file location : /s01/dbhome_1/cfgtoollogs/opatch/lsinv/lsinventory2010-06-02_23-08-33PM.txt
 
--------------------------------------------------------------------------------
Installed  Top - level  Products (1):
 
Oracle  Database  11g                                                  11.2.0.1.0
There are 1 products installed  in  this Oracle Home.
 
Interim patches (1) :
 
Patch  8896202      : applied  on  Wed Jun 02 21:55:44 CST 2010
Unique  Patch ID:  11909460
Created  on  29 Oct 2009, 15:21:45 hrs US/Pacific
Bugs fixed:
8896202
 
该patch用以:ENABLE COMPRESSION ADVISOR  TO  ESTIMATE EXADATA HCC COMPRESSION RATIOS
接着我们还需要运行被修改后的DBMSCOMP包创建SQL,具体操作为:
?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
SQL> @?/rdbms/admin/prvtcmpr.plb
 
Package created.
 
Grant  succeeded.
 
Package body created.
 
No  errors.
 
Package body created.
 
No  errors.
 
Type body created.
 
No  errors.
SQL> @?/rdbms/admin/dbmscomp.sql
 
Package created.
 
Synonym created.
 
Grant  succeeded.
 
No  errors.
 
DBMS_COMPRESSION包在对表压缩进行评估时,默认表最少数据为1000000行,可能在你的测试库中没有这么多数据,我们可以修改这个下限;
 
通过将COMP_RATIO_MINROWS常数修改为1后,就可以分析最小为1行的表了:
 
SQL> create  or  replace  package sys.dbms_compression authid  current_user  is
 
   COMP_NOCOMPRESS       CONSTANT NUMBER := 1;
   COMP_FOR_OLTP         CONSTANT NUMBER := 2;
   COMP_FOR_QUERY_HIGH   CONSTANT NUMBER := 4;
   COMP_FOR_QUERY_LOW    CONSTANT NUMBER := 8;
   COMP_FOR_ARCHIVE_HIGH CONSTANT NUMBER := 16;
   COMP_FOR_ARCHIVE_LOW  CONSTANT NUMBER := 32;
 
   COMP_RATIO_MINROWS CONSTANT NUMBER := 10;
   COMP_RATIO_ALLROWS CONSTANT NUMBER := -1;
 
   PROCEDURE  get_compression_ratio(scratchtbsname  IN  varchar2,
                                   ownname         IN  varchar2,
                                   tabname         IN  varchar2,
                                   partname        IN  varchar2,
                                   comptype        IN  number,
                                   blkcnt_cmp      OUT  PLS_INTEGER,
                                   blkcnt_uncmp    OUT  PLS_INTEGER,
                                   row_cmp         OUT  PLS_INTEGER,
                                   row_uncmp       OUT  PLS_INTEGER,
                                   cmp_ratio       OUT  NUMBER,
                                   comptype_str    OUT  varchar2,
                                   subset_numrows  IN  number  DEFAULT  COMP_RATIO_MINROWS);
 
   function  get_compression_type(ownname  IN  varchar2,
                                 tabname  IN  varchar2,
                                 row_id   IN  rowid)  return  number;
 
   PROCEDURE  incremental_compress(ownname          IN  dba_objects.owner%type,
                                  tabname          IN  dba_objects.object_name%type,
                                  partname         IN  dba_objects.subobject_name%type,
                                  colname          IN  varchar2,
                                  dump_on          IN  number  default  0,
                                  autocompress_on  IN  number  default  0,
                                  where_clause     IN  varchar2  default  '' );
 
end  dbms_compression;
 
Package created.
 
SQL>  alter  package dbms_compression compile body;
 
Package body altered.
 
接下来我们通过建立一个基于TPC-D的测试的 Schema ,保证各表上有较多的数据,并且数据有一定的拟真度:
 
SQL>  select  table_name,num_rows,blocks  from  user_tables ;
 
TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
DSS_SUPPLIER                        20000        496
DSS_PART                           400000       7552
DSS_REGION                              5          5
DSS_PARTSUPP                      1600000      29349
DSS_LINEITEM                     12000000     221376
DSS_ORDER                         3000000      48601
DSS_CUSTOMER                       300000       6922
DSS_NATION                             25          5
 
现在可以进行压缩评估了,我们针对测试模型 Schema 编辑以下匿名块并运行
 
SQL>  set  serveroutput  on ;
SQL>  declare
   cmp_blk_cnt   binary_integer;
   uncmp_blk_cnt binary_integer;
   cmp_rows      binary_integer;
   uncmp_rows    binary_integer;
   cmp_ratio     number;
   cmp_typ       varchar2(100);
BEGIN
   for  in  ( SELECT  TABLE_NAME
               from  dba_tables
              where  compression =  'DISABLED'
                and  owner =  'MACLEAN'  and  num_rows>1000000) loop
     for  in  1 .. 5 loop
       dbms_compression.get_compression_ratio(scratchtbsname =>  'SCRATCH' ,
                                              ownname        =>  'MACLEAN' ,
                                              tabname        => i.table_name,
                                              partname       =>  NULL ,
                                              comptype       => power(2, j),
                                              blkcnt_cmp     => cmp_blk_cnt,
                                              blkcnt_uncmp   => uncmp_blk_cnt,
                                              row_cmp        => cmp_rows,
                                              row_uncmp      => uncmp_rows,
                                              cmp_ratio      => cmp_ratio,
                                              comptype_str   => cmp_typ);
       dbms_output.put_line(i.table_name ||  '--'  ||  'compress_type is '  ||
                            cmp_typ ||  ' ratio :'  ||
                            to_char(cmp_ratio,  '99.9' ) ||  '%' );
 
     end  loop;
   end  loop;
end ;
/
DSS_ORDER --compress_type is "Compress For OLTP" ratio :  1.1%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_ORDER --compress_type is "Compress For Query High" ratio :  2.7%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_ORDER --compress_type is "Compress For Query Low" ratio :  1.7%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_ORDER --compress_type is "Compress For Archive High" ratio :  2.9%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_ORDER --compress_type is "Compress For Archive Low" ratio :  2.7%
DSS_PARTSUPP --compress_type is "Compress For OLTP" ratio :   .9%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_PARTSUPP --compress_type is "Compress For Query High" ratio :  1.8%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_PARTSUPP --compress_type is "Compress For Query Low" ratio :  1.2%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_PARTSUPP --compress_type is "Compress For Archive High" ratio :  1.9%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_PARTSUPP --compress_type is "Compress For Archive Low" ratio :  1.8%
DSS_LINEITEM --compress_type is "Compress For OLTP" ratio :  1.4%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_LINEITEM --compress_type is "Compress For Query High" ratio :  3.5%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_LINEITEM --compress_type is "Compress For Query Low" ratio :  2.3%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_LINEITEM --compress_type is "Compress For Archive High" ratio :  4.3%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
DSS_LINEITEM --compress_type is "Compress For Archive Low" ratio :  3.7%
 
PL/SQL  procedure  successfully completed.
可以从上述测试看到,"Compress For Archive High"压缩率最高,该类型最适合于数据归档存储,但其算法复杂度高于"Compress For Archive Low",压缩耗时亦随之上升。 总体压缩率都较低,这同TPC-D测试的数据建模有一定关联,我们再使用一组TPC-H的测试数据来模拟压缩:
?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
SQL> conn liu/liu;
Connected.
 
SQL>  select  num_rows,blocks,table_name  from  user_tables;
 
   NUM_ROWS     BLOCKS TABLE_NAME
---------- ---------- ------------------------------
    3000000      46817 H_ORDER
     300000       6040 H_CUSTOMER
   12000000     221376 H_LINEITEM
         25          5 H_NATION
     400000       7552 H_PART
          5          5 H_REGION
    1600000      17491 H_PARTSUPP
      20000        496 H_SUPPLIER
 
rows  selected.
 
SQL>  set  serveroutput  on ;
SQL>  declare
   cmp_blk_cnt   binary_integer;
   uncmp_blk_cnt binary_integer;
   cmp_rows      binary_integer;
   uncmp_rows    binary_integer;
   cmp_ratio     number;
   cmp_typ       varchar2(100);
BEGIN
   for  in  ( SELECT  TABLE_NAME
               from  dba_tables
              where  compression =  'DISABLED'
                and  owner =  'LIU'  and  num_rows>1000000) loop
     for  in  1 .. 5 loop
       dbms_compression.get_compression_ratio(scratchtbsname =>  'SCRATCH' ,
                                              ownname        =>  'LIU' ,
                                              tabname        => i.table_name,
                                              partname       =>  NULL ,
                                              comptype       => power(2, j),
                                              blkcnt_cmp     => cmp_blk_cnt,
                                              blkcnt_uncmp   => uncmp_blk_cnt,
                                              row_cmp        => cmp_rows,
                                              row_uncmp      => uncmp_rows,
                                              cmp_ratio      => cmp_ratio,
                                              comptype_str   => cmp_typ);
       dbms_output.put_line(i.table_name ||  '--'  ||  'compress_type is '  ||
                            cmp_typ ||  ' ratio :'  ||
                            to_char(cmp_ratio,  '99.9' ) ||  '%' );
 
     end  loop;
   end  loop;
end ;
/
H_ORDER --compress_type is "Compress For OLTP" ratio :  1.1%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_ORDER --compress_type is "Compress For Query High" ratio :  5.2%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_ORDER --compress_type is "Compress For Query Low" ratio :  2.9%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_ORDER --compress_type is "Compress For Archive High" ratio :  7.2%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_ORDER --compress_type is "Compress For Archive Low" ratio :  5.5%
H_PARTSUPP --compress_type is "Compress For OLTP" ratio :   .9%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_PARTSUPP --compress_type is "Compress For Query High" ratio :  5.1%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_PARTSUPP --compress_type is "Compress For Query Low" ratio :  2.7%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_PARTSUPP --compress_type is "Compress For Archive High" ratio :  7.2%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_PARTSUPP --compress_type is "Compress For Archive Low" ratio :  5.3%
H_LINEITEM --compress_type is "Compress For OLTP" ratio :  1.4%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_LINEITEM --compress_type is "Compress For Query High" ratio :  5.2%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_LINEITEM --compress_type is "Compress For Query Low" ratio :  3.0%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_LINEITEM --compress_type is "Compress For Archive High" ratio :  7.4%
Compression Advisor self- check  validation successful.  select  count (*)  on  both
Uncompressed  and  EHCC Compressed format = 1000001  rows
H_LINEITEM --compress_type is "Compress For Archive Low" ratio :  5.6%
 
PL/SQL  procedure  successfully completed.

可以看到相比TPC-D的测试用数据,TPC-H建立的数据更具可压缩性。 PS: TPC-D represents a broad range of decision support (DS) applications that require complex, long running queries against large complex data structures. Real-world business questions were written against this model, resulting in 17 complex queries. The TPC Benchmark



本文转自maclean_007 51CTO博客,原文链接:http://blog.51cto.com/maclean/1277115


相关文章
|
7月前
|
算法 数据处理 C++
【C++ 20 新特性 算法和迭代器库的扩展和泛化 Ranges】深入浅出C++ Ranges库 (Exploring the C++ Ranges Library)
【C++ 20 新特性 算法和迭代器库的扩展和泛化 Ranges】深入浅出C++ Ranges库 (Exploring the C++ Ranges Library)
808 1
|
9天前
|
前端开发
ES6 中常用的新特性
ES6 中常用的新特性
|
4月前
|
JavaScript 前端开发
ES6新特性(一)
ES6新特性(一)
|
7月前
|
JavaScript 前端开发 开发者
es6的新特性
es6的新特性
55 7
|
存储 JavaScript 前端开发
ES6新特性
ES6新增特性详细内容
|
JavaScript 前端开发 网络架构
关于ES6新特性的总结 1
关于ES6新特性的总结
54 0
|
存储 算法
路径压缩 (Path Compression)
路径压缩 (Path Compression) 是一种用于求解最短路径问题的算法,通常用于 Dijkstra 算法中,可以加速求解最短路径问题。 路径压缩通过将已经确定的最短路径信息传递给未确定最短路径的节点,来加速最短路径的计算。具体来说,当一个节点的最短路径已经确定时,它会将这个信息传递给所有它的邻居节点,这样邻居节点就可以跳过一些不必要的计算,直接使用已经确定的最短路径信息,从而加速整个最短路径的计算过程。
338 3
|
7月前
|
前端开发
ES6 部分新特性使用
ES6 部分新特性使用
36 0
|
前端开发
关于ES6新特性的总结 2
关于ES6新特性的总结
41 0
|
前端开发 JavaScript
es6的8条新特性总结
es6的8条新特性总结
134 0