1. Several ways to pre-split regions for an HBase table
Method 1: create a table pre-split into 5 regions
create_namespace 'track'
create 'track:stu', 'info', SPLITS => ['10', '20', '30', '40']
Method 2: force a split on an existing region with the split command
split 'tableName', 'splitKey'
split 'track:stu', '50'
Method 3: read the split keys from a file
create 'track:stu_1', 'info', SPLITS_FILE => '/opt/datas/splits.txt'
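The splits file format is simply one split key per line. As a minimal Python sketch of generating such a file (the key values and the local path splits.txt are illustrative, not the original /opt/datas/splits.txt):

```python
# Write one split key per line; HBase's SPLITS_FILE option expects this format.
# The keys and the output path below are illustrative examples.
split_keys = ["10", "20", "30", "40"]

with open("splits.txt", "w") as f:
    for key in split_keys:
        f.write(key + "\n")

# Read the file back to confirm the format: one key per line.
with open("splits.txt") as f:
    print([line.strip() for line in f])  # → ['10', '20', '30', '40']
```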
Method 4: let HBase compute the split points from a region count and a split algorithm
create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
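HexStringSplit treats row keys as 8-character hex strings and divides the key space from '00000000' to 'ffffffff' into equal-sized ranges. A rough Python sketch of how the NUMREGIONS - 1 boundary keys can be derived (an approximation of the idea, not HBase's exact implementation):

```python
def hex_string_splits(num_regions):
    """Approximate HexStringSplit: divide the 32-bit hex key space
    [00000000, ffffffff] into num_regions equal ranges and return
    the num_regions - 1 boundary keys as 8-character hex strings."""
    span = 2 ** 32
    return [format(span * i // num_regions, "08x")
            for i in range(1, num_regions)]

splits = hex_string_splits(15)
print(len(splits))              # → 14 split points for 15 regions
print(splits[0], splits[-1])    # → 11111111 eeeeeeee
```

With evenly distributed (e.g. hashed) row keys, these boundaries give every region an equal share of the write load from the start.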
2. Configuring compression for HBase table data
First, install snappy for Hadoop; for details see:
https://blog.csdn.net/weixin_45366499/article/details/109271630
Check which compression codecs HBase can use:
bin/hbase --config ~/conf_hbase org.apache.hadoop.util.NativeLibraryChecker
Native library checking:
hadoop:  true /opt/modules/hadoop-2.6.0-cdh5.9.3/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /opt/modules/hadoop-2.6.0-cdh5.9.3/lib/native/libsnappy.so.1
lz4:     true revision:99
bzip2:   false
openssl: true /lib64/libcrypto.so
Configure compression for HBase by linking the Hadoop native libraries into HBase:
ln -s /opt/modules/hadoop/lib/native /opt/modules/hbase/lib/native/Linux-amd64-64
Create an HBase table:
create 'stu_snappy','info'
Alter the table so the 'info' family uses snappy compression:
alter 'stu_snappy', {NAME => 'info', COMPRESSION => 'SNAPPY'}
You can use the CompressionTest tool to verify that the snappy compressor is usable by HBase:
bin/hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://bigdata-pro-m01:9000/user/caizhengjie/datas/snappy snappy
3. Integrating HBase with Sqoop
Typically, HBase-Sqoop integration means importing data from MySQL into HBase:
mysql -> HBase
Step 1: add the following to Sqoop's sqoop-env.sh file
export HBASE_HOME=/opt/modules/hbase
Step 2: create the MySQL table and insert some data
CREATE TABLE user_info_hbase (
  id varchar(20) DEFAULT NULL,
  username varchar(20) DEFAULT NULL,
  address varchar(20) DEFAULT NULL
);
insert into user_info_hbase values('0001','admin','admin');
insert into user_info_hbase values('0002','wang','111111');
insert into user_info_hbase values('0003','zhang','000000');
insert into user_info_hbase values('0004','lili','000000');
insert into user_info_hbase values('0005','henry','000000');
insert into user_info_hbase values('0006','cherry','000000');
Step 3: create the HBase table
create 'user_info','info'
Step 4: review the HBase-related arguments that Sqoop supports
HBase arguments:
  --column-family <family>   Sets the target column family for the import
  --hbase-bulkload           Enables HBase bulk loading
  --hbase-create-table       If specified, create missing HBase tables
  --hbase-row-key <col>      Specifies which input column to use as the row key
  --hbase-table <table>      Import to <table> in HBase
Step 5: import the data with Sqoop
bin/sqoop import \
  --connect jdbc:mysql://bigdata-pro-m01:3306/db_sqoop \
  --username root \
  --password 199911 \
  --table user_info_hbase \
  --column-family info \
  --hbase-bulkload \
  --hbase-row-key id \
  --hbase-table user_info \
  -m 1
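Conceptually, each imported MySQL row becomes one HBase put: the --hbase-row-key column supplies the row key, and every other column becomes a cell in the target column family. A minimal Python sketch of that mapping (purely illustrative, not Sqoop's actual code):

```python
# Rows as they exist in the MySQL table user_info_hbase (first two shown).
rows = [
    {"id": "0001", "username": "admin", "address": "admin"},
    {"id": "0002", "username": "wang", "address": "111111"},
]

def to_hbase_puts(rows, row_key_col="id", family="info"):
    """Mimic the Sqoop mapping: the row-key column becomes the HBase row
    key; every remaining column becomes a cell 'family:column' -> value."""
    puts = {}
    for row in rows:
        key = row[row_key_col]
        puts[key] = {f"{family}:{col}": val
                     for col, val in row.items() if col != row_key_col}
    return puts

puts = to_hbase_puts(rows)
print(puts["0001"])  # → {'info:username': 'admin', 'info:address': 'admin'}
```

This is why the scan in the next step shows two cells per row key (info:username and info:address) but no info:id cell: the id column was consumed as the row key.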
Step 6: check the result
hbase(main):012:0> scan 'user_info'
ROW    COLUMN+CELL
 0001  column=info:address, timestamp=1605251319678, value=admin
 0001  column=info:username, timestamp=1605251319678, value=admin
 0002  column=info:address, timestamp=1605251319678, value=111111
 0002  column=info:username, timestamp=1605251319678, value=wang
 0003  column=info:address, timestamp=1605251319678, value=000000
 0003  column=info:username, timestamp=1605251319678, value=zhang
 0004  column=info:address, timestamp=1605251319678, value=000000
 0004  column=info:username, timestamp=1605251319678, value=lili
 0005  column=info:address, timestamp=1605251319678, value=000000
 0005  column=info:username, timestamp=1605251319678, value=henry
 0006  column=info:address, timestamp=1605251319678, value=000000
 0006  column=info:username, timestamp=1605251319678, value=cherry
6 row(s) in 0.3670 seconds