HBase HLog Write Timeouts Causing a RegionServer to Exit

Summary:

When the dfs.socket.timeout configured in Hadoop's hdfs-site.xml is larger than the value HBase is running with, HBase hits the errors shown below while writing its HLog (WAL): the HBase-side DFS client times out first, drops DataNodes from the write pipeline one by one, and the RegionServer eventually aborts.

Fix: keep the dfs.socket.timeout configured in Hadoop's hdfs-site.xml consistent with the value HBase uses.
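As a sketch of the fix (the values below are illustrative: 60000 ms is the stock HDFS default, not a value taken from this incident), the same dfs.socket.timeout should appear both in the cluster's hdfs-site.xml and in the configuration HBase loads (hbase-site.xml, or an hdfs-site.xml placed on the HBase classpath):

```xml
<!-- Put the same value in hdfs-site.xml (DataNode side) and in
     hbase-site.xml / an hdfs-site.xml on the HBase classpath (client side).
     60000 ms is the default; the right value for a given cluster is a
     tuning decision, assumed here for illustration. -->
<property>
  <name>dfs.socket.timeout</name>
  <value>60000</value>
</property>
<!-- Often reviewed together with the read timeout; 480000 ms is its default. -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>480000</value>
</property>
```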

The RegionServer on 10.9.141.165 reported:

2013-04-15 01:05:49,476 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_5280454841001477955_73253980 java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.9.141.165:23420 remote=/10.9.141.165:50010]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2514)
2013-04-15 01:05:49,476 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_5280454841001477955_73253980 bad datanode[0] 10.9.141.165:50010
2013-04-15 01:05:49,476 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_5280454841001477955_73253980 in pipeline 10.9.141.165:50010, 10.9.141.152:50010, 10.9.141.158:50010: bad datanode 10.9.141.165:50010

2013-04-15 01:06:55,633 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_5280454841001477955_73262690 java.net.SocketTimeoutException: 66000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.9.141.165:41078 remote=/10.9.141.152:50010]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2514)
2013-04-15 01:06:55,634 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_5280454841001477955_73262690 bad datanode[0] 10.9.141.152:50010
2013-04-15 01:06:55,634 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_5280454841001477955_73262690 in pipeline 10.9.141.152:50010, 10.9.141.158:50010: bad datanode 10.9.141.152:50010

2013-04-15 01:07:58,716 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_5280454841001477955_73262880 java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.9.141.165:48547 remote=/10.9.141.158:50010]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2514)
2013-04-15 01:07:58,718 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_5280454841001477955_73262880 bad datanode[0] 10.9.141.158:50010
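The shrinking timeout in the log above (69000 → 66000 → 63000 ms) is itself telling: the 1.x-era DFSClient waits for a pipeline ack for dfs.socket.timeout plus a fixed extension per DataNode still in the pipeline, so the effective timeout drops as bad DataNodes are removed. The 3000 ms per-node extension below is inferred from the numbers in this log rather than looked up in the source, so treat it as an assumption; a quick sanity check:

```python
# Sanity-check the timeouts seen in the RegionServer log: the DFS client's
# effective read timeout is the base dfs.socket.timeout plus a per-datanode
# extension, so it shrinks as bad datanodes are dropped from the pipeline.
# BASE 60000 ms is the stock default; the 3000 ms extension is inferred
# from this log, not taken from the Hadoop source.

BASE_TIMEOUT_MS = 60000        # assumed dfs.socket.timeout (stock default)
EXTENSION_PER_DN_MS = 3000     # per-datanode extension (inferred from the log)

def effective_read_timeout(num_datanodes: int) -> int:
    """Milliseconds the client waits for a pipeline ack."""
    return BASE_TIMEOUT_MS + EXTENSION_PER_DN_MS * num_datanodes

# Pipeline shrinks 3 -> 2 -> 1 as each "bad datanode" is removed:
timeouts = [effective_read_timeout(n) for n in (3, 2, 1)]
print(timeouts)  # matches the 69000 / 66000 / 63000 millis in the log
```

If this inference holds, the base value the HBase client was using was 60000 ms, far below what the DataNodes were configured with, which is exactly the mismatch the summary describes.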

The three DataNodes in the write pipeline reported the errors below. Note that each DataNode was interrupted with a large share of its own timeout still remaining ("110927 millis timeout left", "290930 millis timeout left"), which confirms the DataNode side was configured with a much larger dfs.socket.timeout than the HBase client.

10.9.141.152

2013-04-15 01:00:07,399 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5280454841001477955_73253980 src: /10.9.141.165:39523 dest: /10.9.141.152:50010
2013-04-15 01:05:49,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_5280454841001477955_73253980 java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:05:49,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_5280454841001477955_73253980 1 Exception java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.9.141.152:59490 remote=/10.9.141.158:50010]. 110927 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:868)
    at java.lang.Thread.run(Thread.java:662)
2013-04-15 01:05:49,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_5280454841001477955_73253980 1 : Thread is interrupted.
2013-04-15 01:05:49,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_5280454841001477955_73253980 terminating
2013-04-15 01:05:49,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_5280454841001477955_73253980 received exception java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:05:49,474 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.9.141.152:50010, storageID=DS736845143, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)
2013-04-15 01:05:49,479 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_5280454841001477955_73253980, targets=[10.9.141.152:50010, 10.9.141.158:50010])
2013-04-15 01:05:49,556 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: oldblock=blk_5280454841001477955_73253980(length=3121152), newblock=blk_5280454841001477955_73262690(length=3121152), datanode=10.9.141.152:50010
2013-04-15 01:05:49,561 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5280454841001477955_73262690 src: /10.9.141.165:41078 dest: /10.9.141.152:50010
2013-04-15 01:05:49,561 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_5280454841001477955_73262690
2013-04-15 01:06:55,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_5280454841001477955_73262690 1 Exception java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.9.141.152:60943 remote=/10.9.141.158:50010]. 113932 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:868)
    at java.lang.Thread.run(Thread.java:662)
2013-04-15 01:06:55,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_5280454841001477955_73262690 1 : Thread is interrupted.
2013-04-15 01:06:55,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_5280454841001477955_73262690 terminating
2013-04-15 01:06:55,631 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_5280454841001477955_73262690 received exception java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:06:55,632 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.9.141.152:50010, storageID=DS736845143, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)


10.9.141.158

2013-04-15 01:00:07,420 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5280454841001477955_73253980 src: /10.9.141.152:59490 dest: /10.9.141.158:50010
2013-04-15 01:05:49,495 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_5280454841001477955_73253980 java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:05:49,495 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5280454841001477955_73253980 Interrupted.
2013-04-15 01:05:49,495 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5280454841001477955_73253980 terminating
2013-04-15 01:05:49,495 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_5280454841001477955_73253980 received exception java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:05:49,495 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.9.141.158:50010, storageID=DS2062116090, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)
2013-04-15 01:05:49,578 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: oldblock=blk_5280454841001477955_73253980(length=3121152), newblock=blk_5280454841001477955_73262690(length=3121152), datanode=10.9.141.158:50010
2013-04-15 01:05:49,581 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5280454841001477955_73262690 src: /10.9.141.152:60943 dest: /10.9.141.158:50010
2013-04-15 01:05:49,582 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_5280454841001477955_73262690
2013-04-15 01:06:55,652 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5280454841001477955_73262690 Interrupted.
2013-04-15 01:06:55,652 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5280454841001477955_73262690 terminating
2013-04-15 01:06:55,652 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_5280454841001477955_73262690 received exception java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:06:55,652 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.9.141.158:50010, storageID=DS2062116090, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)
2013-04-15 01:06:55,655 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_5280454841001477955_73262690, targets=[10.9.141.158:50010])
2013-04-15 01:06:55,666 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: oldblock=blk_5280454841001477955_73262690(length=3121152), newblock=blk_5280454841001477955_73262880(length=3121152), datanode=10.9.141.158:50010
2013-04-15 01:06:55,669 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5280454841001477955_73262880 src: /10.9.141.165:48547 dest: /10.9.141.158:50010
2013-04-15 01:06:55,669 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_5280454841001477955_73262880
2013-04-15 01:07:58,735 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5280454841001477955_73262880 Interrupted.
2013-04-15 01:07:58,735 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_5280454841001477955_73262880 terminating
2013-04-15 01:07:58,735 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_5280454841001477955_73262880 received exception java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:07:58,735 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.9.141.158:50010, storageID=DS2062116090, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)

10.9.141.165

2013-04-15 01:00:07,407 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5280454841001477955_73253980 src: /10.9.141.165:23420 dest: /10.9.141.165:50010
2013-04-15 01:05:49,476 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_5280454841001477955_73253980 java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:05:49,476 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_5280454841001477955_73253980 2 Exception java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.9.141.165:39523 remote=/10.9.141.152:50010]. 290930 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:349)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:868)
    at java.lang.Thread.run(Thread.java:662)
2013-04-15 01:05:49,476 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_5280454841001477955_73253980 2 : Thread is interrupted.
2013-04-15 01:05:49,476 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_5280454841001477955_73253980 terminating
2013-04-15 01:05:49,477 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_5280454841001477955_73253980 received exception java.io.EOFException: while trying to read 65557 bytes
2013-04-15 01:05:49,478 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.9.141.165:50010, storageID=DS-1327849832, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:265)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:662)