如何批量向hbase中插入数据
目前hadoop社区有一套bulkload到hbase的工具,原理是使用mr或者spark并行的生成hfile存储在hdfs,然后调用hbase的bulkload直接把这些hfile加载到hbase表。代码参考:
val hConf = HBaseConfiguration.create()
hConf.addResource('hbase-site.xml')
val hTableName = 'test_log'
hConf.set('hbase.mapreduce.hfileoutputformat.table.name', hTableName)
val tableName = TableName.valueOf(hTableName)
val conn = ConnectionFactory.createConnection(hConf)
val table = conn.getTable(tableName)
val regionLocator = conn.getRegionLocator(tableName)
val hFileOutput = '/tmp/h_file'
output.saveAsNewAPIHadoopFile(hFileOutput,
classOf[ImmutableBytesWritable],
classOf[KeyValue],
classOf[HFileOutputFormat2],
hConf
)
val bulkLoader = new LoadIncrementalHFiles(hConf)
bulkLoader.doBulkLoad(new Path(hFileOutput), conn.getAdmin, table, regionLocator)
赞0
踩0