[Problem] The coprocessor thread stopped itself due to scan timeout or scan threshold

Overview:

Kylin reports the following error when executing a query:

Error while executing SQL "select t.hotel_id_m,t.live_dt, d.day_of_week,sum(rns) from tableT t join TableD d on t.live_dt = d.daY_no group by t.hotel_id_m,t.live_dt, d.day_of_week LIMIT 50000": <sub-thread for Query ac580b70-96f2-403a-a64a-0557e599d35f GTScanRequest 143ae1ba>The coprocessor thread stopped itself due to scan timeout or scan threshold(check region server log), failing current query...


Check the RegionServer log:

2017-03-20 11:10:05,436 INFO  [Query dc7017bb-fefc-4177-a2c9-5842625beb89-109] endpoint.CubeVisitService: Scanned 9999001 rows from HBase.

2017-03-20 11:10:05,454 INFO  [Query dc7017bb-fefc-4177-a2c9-5842625beb89-109] endpoint.CubeVisitService: The cube visit did not finish normally because scan num exceeds threshold

org.apache.kylin.gridtable.GTScanExceedThresholdException: Exceed scan threshold at 10000001
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService$1.hasNext(CubeVisitService.java:258)
    at org.apache.kylin.storage.hbase.cube.v2.HBaseReadonlyStore$1$1.hasNext(HBaseReadonlyStore.java:111)
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:290)
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:4117)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7797)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1982)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1964)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33652)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)

2017-03-20 11:10:05,459 WARN  [RpcServer.FifoWFPBQ.default.handler=59,queue=5,port=60020] ipc.RpcServer: (responseTooSlow): {"call":"ExecService(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$CoprocessorServiceRequest)","starttimems":1489979376495,"responsesize":359,"method":"ExecService","processingtimems":28964,"client":"10.10.16.102:58720","queuetimems":1,"class":"HRegionServer"}


Analysis and solution:

The configuration file ${KYLIN_HOME}/conf/kylin.properties contains the following setting:

kylin.query.scan.threshold=10000000

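This setting caps the number of rows an HBase scan may read. If the scan has read more than kylin.query.scan.threshold rows and still cannot satisfy the query, Kylin cancels the query on the HBase side. This matches the RegionServer log, where the exception is raised at row 10000001, just past the configured limit of 10000000.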
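To confirm which threshold value is actually in effect, the setting can be read straight from the properties file. The snippet below is a minimal sketch, not part of Kylin: the helper name `read_property` and the sample text are assumptions for illustration, and it only handles plain `key=value` lines (not the full Java properties syntax).

```python
KEY = "kylin.query.scan.threshold"


def read_property(text, key=KEY):
    """Return the value of `key` from properties-style text, or None if absent.

    Hypothetical helper: comment lines (starting with '#' or '!') are skipped,
    and the last occurrence of the key wins.
    """
    value = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        name, sep, raw = line.partition("=")
        if sep and name.strip() == key:
            value = raw.strip()
    return value


# Sample content standing in for ${KYLIN_HOME}/conf/kylin.properties.
sample = """
# query settings
kylin.query.scan.threshold=10000000
"""

threshold = int(read_property(sample))
print(threshold)
```

In practice the text would come from reading the real file, for example with `open(path).read()`, where `path` points at the kylin.properties of the Kylin instance being checked.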

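For now there are two ways to resolve this:

1. Increase the value of kylin.query.scan.threshold. This also increases the load on HBase, so raise it only if HBase can handle the additional query pressure.

2. The query contains LIMIT 50000, and after scanning kylin.query.scan.threshold rows HBase still has not produced 50000 result rows. Reduce the LIMIT value so that the query is satisfied before the scan reaches kylin.query.scan.threshold, and the error no longer occurs.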
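The interaction between the LIMIT and the scan threshold can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is not Kylin's coprocessor code, the names `scan` and `ScanThresholdExceeded` are hypothetical, and the row counts are scaled down so it runs instantly. It ignores GROUP BY aggregation and the per-region behaviour of a real scan.

```python
class ScanThresholdExceeded(Exception):
    """Illustrative stand-in for Kylin's GTScanExceedThresholdException."""


def scan(rows, matches, limit, threshold):
    """Collect up to `limit` matching rows.

    Aborts as soon as more than `threshold` rows have been scanned,
    mirroring the behaviour described in the text.
    """
    result = []
    scanned = 0
    for row in rows:
        scanned += 1
        if scanned > threshold:
            raise ScanThresholdExceeded(f"Exceed scan threshold at {scanned}")
        if matches(row):
            result.append(row)
            if len(result) >= limit:
                break  # the query is satisfied, stop scanning early
    return result


def is_match(row):
    # Assume one row in ten qualifies for the query.
    return row % 10 == 0


rows = range(1000)

# Original situation: the limit cannot be met before the threshold is hit.
try:
    scan(rows, is_match, limit=50, threshold=200)
except ScanThresholdExceeded as exc:
    print("failed:", exc)

# Fix 1: raise the threshold (more load on the storage side).
print(len(scan(rows, is_match, limit=50, threshold=500)))

# Fix 2: lower the limit so the scan finishes before the threshold.
print(len(scan(rows, is_match, limit=20, threshold=200)))
```

Both fixes make the scan finish, but the first does so by reading more rows, which is why it only suits clusters with spare capacity.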




This article is reposted from the 51CTO blog of 巧克力黒. Original link: http://blog.51cto.com/10120275/1908363. To republish it, please contact the original author.

