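Compression formats supported by HBase
HBase supports the following compression formats: GZ (GZIP), LZO, LZ4, and Snappy.
GZ: suited to cold data. Compared with Snappy and LZO, GZIP achieves a higher compression ratio, but it uses more CPU and compresses and decompresses more slowly.
Snappy and LZO: suited to hot data. They use less CPU and compress and decompress faster than GZ, but their compression ratio is lower.
Snappy vs. LZO: Snappy performs better overall. Its compression ratio is lower than LZO's, but it compresses and decompresses faster.
LZ4 vs. LZO: the compression ratios are about the same, but LZ4 compresses and decompresses faster.
In most cases Snappy or LZO is a good choice, because their compression overhead is low and they still save space.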
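The guidance above can be condensed into a small lookup. This is only a sketch in plain Ruby (the language the HBase shell is built on): the names `CODEC_FOR` and `codec_for` and the `:cold` / `:hot` labels are made up for illustration and are not HBase settings. Snappy is used for hot data here, though LZO would fit the same slot.

```ruby
# Hypothetical lookup table summarising the trade-offs described above.
# The keys (:cold, :hot) are illustrative labels, not HBase options.
CODEC_FOR = {
  cold: 'GZ',      # higher compression ratio, more CPU, slower
  hot:  'SNAPPY'   # lower ratio, less CPU, faster (LZO would also fit)
}.freeze

# Returns the codec name for a workload label; raises on an unknown label.
def codec_for(workload)
  CODEC_FOR.fetch(workload) do
    raise ArgumentError, "unknown workload: #{workload.inspect}"
  end
end

puts codec_for(:cold)
puts codec_for(:hot)
```

The returned string is what would go into the COMPRESSION attribute of a column family.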
Specifying the compression format when creating a table
hbase(main):013:0> create 'test3',{NAME=>'f1'},{NAME=>'f2',COMPRESSION=>'Snappy'}
0 row(s) in 1.2740 seconds
=> Hbase::Table - test3
hbase(main):014:0> desc 'test3'
Table test3 is ENABLED
test3
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0300 seconds
hbase(main):002:0> create 'test4',{NAME=>'f1'},{NAME=>'f2',COMPRESSION=>'GZ'}
0 row(s) in 1.4900 seconds
=> Hbase::Table - test4
hbase(main):003:0> desc 'test4'
Table test4 is ENABLED
test4
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.1290 seconds
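When many tables follow the same pattern, the create statements can be generated rather than typed. The following is a minimal sketch in plain Ruby that only builds the statement text. `create_statement` is a hypothetical helper, not part of the HBase shell, and a `nil` codec here simply means the COMPRESSION attribute is left out.

```ruby
# Hypothetical helper: renders the text of an HBase shell `create` statement.
# `families` maps a column family name to a codec name, or to nil when the
# family should be created without a COMPRESSION attribute.
def create_statement(table, families)
  specs = families.map do |name, codec|
    attrs = ["NAME => '#{name}'"]
    attrs << "COMPRESSION => '#{codec}'" if codec
    "{#{attrs.join(', ')}}"
  end
  "create '#{table}', #{specs.join(', ')}"
end

puts create_statement('test3', { 'f1' => nil, 'f2' => 'SNAPPY' })
```

The printed line has the same shape as the create commands in the session above and can be pasted into the shell.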
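Changing a column family's compression format after the table is created
The proper procedure is to disable the table first, alter the column family's compression format, enable the table again, and then run major_compact.
For example: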
hbase(main):004:0> desc 'test1'
Table test1 is ENABLED
test1
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0230 seconds
hbase(main):005:0> disable 'test1'
0 row(s) in 2.2870 seconds
hbase(main):006:0> alter 'test1',{NAME=>'f1',COMPRESSION=>'Snappy'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.9510 seconds
hbase(main):007:0> enable 'test1'
0 row(s) in 1.2820 seconds
hbase(main):008:0> desc 'test1'
Table test1 is ENABLED
test1
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0310 seconds
hbase(main):009:0> major_compact 'test1'
0 row(s) in 0.1380 seconds
hbase(main):010:0> desc 'test1'
Table test1 is ENABLED
test1
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0260 seconds
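The four-step procedure (disable, alter, enable, major_compact) is easy to get out of order when done by hand, so it can help to generate the statements. The sketch below is plain Ruby that only produces text and never contacts HBase. `recompress_statements` is a hypothetical name chosen for this example.

```ruby
# Hypothetical helper: returns, in order, the HBase shell statements for
# changing one column family's compression format on an existing table.
# It only builds strings; nothing is sent to a cluster.
def recompress_statements(table, family, codec)
  [
    "disable '#{table}'",
    "alter '#{table}', {NAME => '#{family}', COMPRESSION => '#{codec}'}",
    "enable '#{table}'",
    "major_compact '#{table}'"
  ]
end

puts recompress_statements('test1', 'f1', 'SNAPPY')
```

The printed lines could be saved to a file and run through the HBase shell as a script, though that is worth checking against the HBase version in use. Note also that major_compact returned in 0.1380 seconds in the session above: as far as I know the command only requests the compaction, and the files are rewritten in the background afterwards.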
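However, the column family's compression format was also changed successfully without disabling the table and without running major_compact (I don't know the reason yet). A likely explanation is that desc only shows the table schema, which alter updates right away when online schema changes are allowed. Existing store files presumably keep their old compression until a compaction rewrites them, which is what major_compact is for.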
hbase(main):001:0> desc 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'fam1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.3680 seconds
hbase(main):002:0> alter 'test',{NAME=>'fam1',COMPRESSION=>'LZ4'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.0460 seconds
hbase(main):003:0> desc 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'fam1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'LZ4', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0280 seconds
This article is reposted from 天黑顺路's 51CTO blog. Original link: http://blog.51cto.com/mjal01/1963644. To reprint it, please contact the original author.