SparkSQL读写HiveOnHBase表
SparkSQL本身是支持StorageHandler,需要提供相关jar包。访问HiveOnHBase需要如下jar包:/usr/lib/hbase-current/lib/hbase-server-1.1.1.jar/usr/lib/hbase-current/lib/hbase-common-1.1.1.jar/usr/lib/hbase-current/lib/hbase-client-1.1.1.jar/usr/lib/hbase-current/lib/hbase-protocol-1.1.1.jar/usr/lib/hive-current/lib/hive-hbase-handler-2.3.3.jar需要将上述jar包添加到spark,有两种方式:a)通过--jars参数来添加如:spark-sql --jars /usr/lib/hbase-current/lib/hbase-server-1.1.1.jar,/usr/lib/hbase-current/lib/hbase-common-1.1.1.jar,/usr/lib/hbase-current/lib/hbase-client-1.1.1.jar,/usr/lib/hbase-current/lib/hbase-protocol-1.1.1.jar,/usr/lib/hive-current/lib/hive-hbase-handler-2.3.3.jarb)spark-defaults.conf里面配置spark.executor.extraClassPath /opt/apps/extra-jars/*:/usr/lib/hbase-current/lib/hbase-server-1.1.1.jar:/usr/lib/hive-current/lib/hive-hbase-handler-2.3.3.jar:/usr/lib/hbase-current/lib/hbase-common-1.1.1.jar:/usr/lib/hbase-current/lib/hbase-client-1.1.1.jar:/usr/lib/hbase-current/lib/hbase-protocol-1.1.1.jarspark.driver.extraClassPath /opt/apps/extra-jars/*:/usr/lib/hbase-current/lib/hbase-server-1.1.1.jar:/usr/lib/hive-current/lib/hive-hbase-handler-2.3.3.jar:/usr/lib/hbase-current/lib/hbase-common-1.1.1.jar:/usr/lib/hbase-current/lib/hbase-client-1.1.1.jar:/usr/lib/hbase-current/lib/hbase-protocol-1.1.1.jar备注:EMR-3.13.0以及以下版本,使用SparkSQL insert 数据到HiveOnHBase表的时候会出异常:java.lang.ClassCastException: org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat cannot be cast to org.apache.hadoop.hive.ql.io.HiveOutputFormat
赞1
踩0