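Writing to HBase from Spark fails. The job works in my local tests, but in production most of the data gets written and then, for reasons I can't work out, once Spark reaches roughly task 180 of 200 it starts reporting HBase connection errors: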
2018-10-11 17:44:49 INFO ClientCnxn:852 - Socket connection established to 10.0.11.184/10.0.11.184:2181, initiating session
2018-10-11 17:44:49 INFO ClientCnxn:1235 - Session establishment complete on server 10.0.11.184/10.0.11.184:2181, sessionid = 0x36445aa7b202308, negotiated timeout = 90000
2018-10-11 17:44:50 INFO SparkHadoopMapRedUtil:54 - No need to commit output of task because needsTaskCommit=false: attempt_20181011173732_0013_m_000178_0
2018-10-11 17:44:50 INFO Executor:54 - Finished task 178.0 in stage 1.0 (TID 180). 2428 bytes result sent to driver
2018-10-11 17:44:50 INFO CoarseGrainedExecutorBackend:54 - Got assigned task 181
2018-10-11 17:44:50 INFO Executor:54 - Running task 179.0 in stage 1.0 (TID 181)
2018-10-11 17:44:50 INFO ShuffleBlockFetcherIterator:54 - Getting 1 non-empty blocks out of 2 blocks
2018-10-11 17:44:50 INFO ShuffleBlockFetcherIterator:54 - Started 0 remote fetches in 0 ms
2018-10-11 17:44:50 INFO FileOutputCommitter:108 - File Output Committer Algorithm version is 1
2018-10-11 17:44:50 INFO RecoverableZooKeeper:120 - Process identifier=hconnection-0x2f9aef63 connecting to ZooKeeper ensemble=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181
2018-10-11 17:44:50 INFO ZooKeeper:438 - Initiating client connection, connectString=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181 sessionTimeout=90000 watcher=hconnection-0x2f9aef630x0, quorum=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181, baseZNode=/hbase
2018-10-11 17:44:50 INFO ClientCnxn:975 - Opening socket connection to server 10.0.11.183/10.0.11.183:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-11 17:44:50 INFO ClientCnxn:852 - Socket connection established to 10.0.11.183/10.0.11.183:2181, initiating session
2018-10-11 17:44:50 WARN ClientCnxn:1102 - Session 0x0 for server 10.0.11.183/10.0.11.183:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-10-11 17:44:50 WARN RecoverableZooKeeper:272 - Possibly transient ZooKeeper, quorum=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
2018-10-11 17:44:51 INFO ClientCnxn:975 - Opening socket connection to server 10.0.11.182/10.0.11.182:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-11 17:44:51 INFO ClientCnxn:852 - Socket connection established to 10.0.11.182/10.0.11.182:2181, initiating session
2018-10-11 17:44:51 WARN ClientCnxn:1102 - Session 0x0 for server 10.0.11.182/10.0.11.182:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-10-11 17:44:51 INFO ClientCnxn:975 - Opening socket connection to server 10.0.11.184/10.0.11.184:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-11 17:44:51 INFO ClientCnxn:852 - Socket connection established to 10.0.11.184/10.0.11.184:2181, initiating session
2018-10-11 17:44:51 INFO ClientCnxn:1235 - Session establishment complete on server 10.0.11.184/10.0.11.184:2181, sessionid = 0x36445aa7b202309, negotiated timeout = 90000
2018-10-11 17:44:51 INFO SparkHadoopMapRedUtil:54 - No need to commit output of task because needsTaskCommit=false: attempt_20181011173732_0013_m_000179_0
2018-10-11 17:44:51 INFO Executor:54 - Finished task 179.0 in stage 1.0 (TID 181). 2385 bytes result sent to driver
2018-10-11 17:44:51 INFO CoarseGrainedExecutorBackend:54 - Got assigned task 182
2018-10-11 17:44:51 INFO Executor:54 - Running task 180.0 in stage 1.0 (TID 182)
2018-10-11 17:44:51 INFO ShuffleBlockFetcherIterator:54 - Getting 1 non-empty blocks out of 2 blocks
2018-10-11 17:44:51 INFO ShuffleBlockFetcherIterator:54 - Started 0 remote fetches in 0 ms
2018-10-11 17:44:51 INFO FileOutputCommitter:108 - File Output Committer Algorithm version is 1
2018-10-11 17:44:51 INFO RecoverableZooKeeper:120 - Process identifier=hconnection-0xac40875 connecting to ZooKeeper ensemble=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181
2018-10-11 17:44:51 INFO ZooKeeper:438 - Initiating client connection, connectString=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181 sessionTimeout=90000 watcher=hconnection-0xac408750x0, quorum=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181, baseZNode=/hbase
2018-10-11 17:44:51 INFO ClientCnxn:975 - Opening socket connection to server 10.0.11.184/10.0.11.184:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-11 17:44:51 INFO ClientCnxn:852 - Socket connection established to 10.0.11.184/10.0.11.184:2181, initiating session
2018-10-11 17:44:51 WARN ClientCnxn:1102 - Session 0x0 for server 10.0.11.184/10.0.11.184:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-10-11 17:44:52 WARN RecoverableZooKeeper:272 - Possibly transient ZooKeeper, quorum=hb-bp1xiri1tfw60k8c3-002.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-001.hbase.rds.aliyuncs.com:2181,hb-bp1xiri1tfw60k8c3-003.hbase.rds.aliyuncs.com:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
2018-10-11 17:44:52 INFO ClientCnxn:975 - Opening socket connection to server 10.0.11.183/10.0.11.183:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-11 17:44:52 INFO ClientCnxn:852 - Socket connection established to 10.0.11.183/10.0.11.183:2181, initiating session
2018-10-11 17:44:52 WARN ClientCnxn:1102 - Session 0x0 for server 10.0.11.183/10.0.11.183:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-10-11 17:44:52 INFO ClientCnxn:975 - Opening socket connection to server 10.0.11.182/10.0.11.182:2181. Will not attempt to authenticate using SASL (unknown error)
2018-10-11 17:44:52 INFO ClientCnxn:852 - Socket connection established to 10.0.11.
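The connection timed out.
You are writing to HBase through the Hadoop community's TableOutputFormat. It has a bug: the HBase connection created inside each writer is never released automatically. Your job runs many write tasks from a single thread in local mode, so every leaked connection comes from the same local IP, and they accumulate until they exceed ZooKeeper's per-node connection limit of 60.
I recommend reading and writing HBase through the SQL/DataFrame API instead. Examples: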
git@github.com:lw309637554/alicloud-hbase-spark-examples.git
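To make the leak described above concrete, here is a small self-contained simulation. It is only a sketch: `FakeZkServer`, `FakeConnection`, and `runTasks` are hypothetical stand-ins invented for illustration and are not part of the HBase or ZooKeeper APIs. It assumes a three-server quorum (as in the log) with a limit of 60 connections per client IP on each server, and one new connection per write task. Under those assumptions the total ceiling is 3 x 60 = 180 connections, which would be consistent with the errors starting around task 180, though that is an inference and not something the log states directly.

```java
// Simulation only: none of these classes belong to HBase or ZooKeeper.
final class ZkConnectionLeakDemo {

    /** Hypothetical stand-in for one ZooKeeper server with a per-client-IP connection cap. */
    static final class FakeZkServer {
        private final int maxClientCnxns;
        private int open = 0;

        FakeZkServer(int maxClientCnxns) {
            this.maxClientCnxns = maxClientCnxns;
        }

        boolean hasCapacity() {
            return open < maxClientCnxns;
        }

        FakeConnection accept() {
            open++;
            return new FakeConnection(this);
        }

        void release() {
            open--;
        }
    }

    /** Hypothetical stand-in for the connection a writer opens for one task. */
    static final class FakeConnection implements AutoCloseable {
        private final FakeZkServer server;
        private boolean closed = false;

        FakeConnection(FakeZkServer server) {
            this.server = server;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                server.release();
            }
        }
    }

    /** Tries each server in turn, as the client in the log does; null if all refuse. */
    static FakeConnection connect(FakeZkServer[] ensemble, int startIndex) {
        for (int i = 0; i < ensemble.length; i++) {
            FakeZkServer server = ensemble[(startIndex + i) % ensemble.length];
            if (server.hasCapacity()) {
                return server.accept();
            }
        }
        return null;
    }

    /** Runs the tasks one after another and returns how many finished before a refusal. */
    static int runTasks(int servers, int limitPerServer, int tasks, boolean closeAfterTask) {
        FakeZkServer[] ensemble = new FakeZkServer[servers];
        for (int i = 0; i < servers; i++) {
            ensemble[i] = new FakeZkServer(limitPerServer);
        }
        int completed = 0;
        for (int task = 0; task < tasks; task++) {
            FakeConnection connection = connect(ensemble, task);
            if (connection == null) {
                break; // every server has hit its per-IP limit
            }
            if (closeAfterTask) {
                connection.close(); // the step the leaking writer never performs
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println("leaking writer: " + runTasks(3, 60, 200, false) + " tasks completed");
        System.out.println("closing writer: " + runTasks(3, 60, 200, true) + " tasks completed");
    }
}
```

In this model the leaking variant stops after 180 tasks, while the variant that closes its connection at the end of each task completes all 200. The practical takeaway is the same as in the answer above: whichever write path you use, the connection has to be released when the task finishes, or shared across tasks, so that the number of open ZooKeeper sessions per IP stays bounded.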