[Original] Count queries fail after mapping an HBase table into Hive


2012-06-20 1051



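Environment:

With a Hadoop + HBase + ZooKeeper cluster already running, I set up Hive (a data warehouse) because HBase itself does not support querying data with select statements. I won't go into the Hive setup in detail, except that it stores its metadata in an external MySQL database. In Hive I mapped the HBase table xyz to the Hive table hbase_table_1. After that, select * from table returned data, but select count(*) from table kept failing: the MapReduce job it launches did not complete successfully. Details follow.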

First, look at the data in the HBase table xyz:

hbase(main):001:0> scan 'xyz'
ROW COLUMN+CELL
10000 column=cf1:val, timestamp=1340091488116, value=China
1 row(s) in 0.6730 seconds

hbase(main):002:0>
Next, look at the data in the Hive table hbase_table_1:

hive> select * from hbase_table_1;
OK
10000 China
Time taken: 4.133 seconds
hive>
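
The post does not show the statement that mapped hbase_table_1 onto xyz. A minimal sketch, assuming it followed the example in the Hive HBase integration documentation (which uses the same Hive table name, HBase table name, and cf1:val column). The script only writes the DDL to a file; the file name create_hbase_table_1.hql and the column names key and value are assumptions for illustration.

```shell
#!/bin/sh
# Write the (assumed) mapping DDL to a file; the file name is hypothetical.
HQL_FILE="${HQL_FILE:-create_hbase_table_1.hql}"

cat > "$HQL_FILE" <<'EOF'
-- Assumed DDL: the original post does not show how the table was created.
-- ":key" maps the HBase row key; "cf1:val" maps column family cf1, qualifier val.
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
EOF

echo "wrote $HQL_FILE; run it with: hive -f $HQL_FILE"
```

If the HBase table xyz already exists before the Hive table is created, the statement would need to be CREATE EXTERNAL TABLE instead, so that Hive attaches to the existing table rather than trying to create it.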

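Finally, the row-count command I ran in Hive and the error it produced (the original screenshot is not reproduced here). I fixed it as follows: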

1. Add the following to Hive's configuration file hive-site.xml. Adjust the value to match your own environment.

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///opt/hive/lib/hive-hbase-handler-0.8.1.jar,file:///opt/hive/lib/hbase-0.92.1.jar,file:///opt/hive/lib/zookeeper-3.3.1.jar</value>
</property>

2. Then copy the HBase configuration file hbase-site.xml on the namenode into Hadoop's conf directory, and use rsync to sync it to all datanode nodes.
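
The two steps above can be sketched as a small shell script. This is only a sketch under assumptions: the helper names build_aux_jars_path and plan_sync, the HBase config location /opt/hbase/conf, the Hadoop conf directory /opt/hadoop/conf, and the datanode host names slave1 and slave2 are not from the original post. plan_sync only prints the commands, so they can be reviewed before being run.

```shell
#!/bin/sh
# Step 1 helper (hypothetical name): build the hive.aux.jars.path value
# from a lib directory and a list of jar file names.
# Returns non-zero if a listed jar is missing, so a typo is caught early.
build_aux_jars_path() {
  lib_dir="$1"; shift
  value=""
  for jar in "$@"; do
    if [ ! -f "$lib_dir/$jar" ]; then
      echo "missing jar: $lib_dir/$jar" >&2
      return 1
    fi
    # Comma-separated file:// URIs, with no spaces or line breaks in between.
    value="${value:+$value,}file://$lib_dir/$jar"
  done
  printf '%s\n' "$value"
}

# Step 2 helper (hypothetical name): print the copy and rsync commands
# for a source hbase-site.xml, a Hadoop conf directory, and a host list.
plan_sync() {
  src="$1"; conf_dir="$2"; shift 2
  echo "cp $src $conf_dir/"
  for host in "$@"; do
    echo "rsync -av $conf_dir/hbase-site.xml $host:$conf_dir/"
  done
}

# Example: paths and host names below are assumptions, adjust to your cluster.
plan_sync /opt/hbase/conf/hbase-site.xml /opt/hadoop/conf slave1 slave2
```

On the Hive machine, calling build_aux_jars_path /opt/hive/lib hive-hbase-handler-0.8.1.jar hbase-0.92.1.jar zookeeper-3.3.1.jar would print the single-line value to paste into hive-site.xml, or report which jar is missing.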

Now let's run the query again:

hive> select count(*) from hbase_table_1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201206190956_0003, Tracking URL = http://master:50030/jobdetails.jsp?jobid=job_201206190956_0003
Kill Command = /opt/hadoop/libexec/../bin/hadoop job -Dmapred.job.tracker=master:9002 -kill job_201206190956_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2012-06-20 12:04:12,499 Stage-1 map = 0%, reduce = 0%
2012-06-20 12:04:27,668 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:28,682 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:29,703 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:30,713 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:31,724 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:32,734 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:33,757 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:34,768 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:35,777 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:36,788 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:37,798 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:38,808 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:39,869 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:40,880 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:42,126 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:43,136 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:44,145 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:45,155 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:46,164 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:47,174 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:48,183 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.82 sec
2012-06-20 12:04:49,236 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:50,247 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:51,267 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:52,277 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:53,288 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:54,320 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:55,330 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:56,341 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:57,364 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
2012-06-20 12:04:58,375 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 7.48 sec
MapReduce Total cumulative CPU time: 7 seconds 480 msec
Ended Job = job_201206190956_0003
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Accumulative CPU: 7.48 sec HDFS Read: 240 HDFS Write: 2 SUCESS
Total MapReduce CPU Time Spent: 7 seconds 480 msec
OK
1
Time taken: 92.04 seconds
That works! The count is 1 because my table has only one row. It took a while because I'm running on virtual machines, sigh.
