Hadoop "Unexpected end of input stream" error

A job in production failed with the following error:

Diagnostic Messages for this Task:
Error
java.io.IOException: java.io.EOFException: Unexpected end of input stream
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:196)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:182)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:160)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:155)
Caused by: java.io.EOFException: Unexpected end of input stream
        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
        at java.io.InputStream.read(InputStream.java:82)
        at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
        ... 13 more

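The error message points to an I/O read problem: the failure happened in the map phase while reading the input data. The "Caused by" frames show the exception being thrown from DecompressorStream.decompress, called through LineRecordReader, so the read broke while decompressing a line-oriented input file.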

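Running explain extended on the query showed which files the job was reading. Because they were gzip-compressed text files, I tested them with zcat and eventually traced the failure to a corrupted gz file. After the bad data was deleted, the job ran normally again.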
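The zcat check above can also be scripted when there are many files to go through. The following is a minimal sketch, not the procedure used in the original incident: it assumes the suspect .gz files have already been copied from HDFS to a local directory, and the function names is_gzip_intact and find_bad_gzip_files are hypothetical helpers introduced only for illustration. It reads each file to the end with Python's standard gzip module and treats any decompression failure as a sign of a damaged file.

```python
import gzip
import zlib
from pathlib import Path


def is_gzip_intact(path, chunk_size=1024 * 1024):
    """Return True if the whole gzip file decompresses cleanly, else False.

    Hypothetical helper: `path` is assumed to be a local copy of the file.
    """
    try:
        with gzip.open(path, "rb") as stream:
            # Read through to the end in chunks; a truncated file fails
            # before the end-of-stream marker is reached.
            while stream.read(chunk_size):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        # EOFError: the compressed stream ended early (truncated file).
        # OSError (includes gzip.BadGzipFile): bad header or CRC mismatch.
        # zlib.error: corrupt deflate data.
        return False


def find_bad_gzip_files(directory):
    """Return the sorted paths of .gz files under `directory` that fail the check."""
    return sorted(
        str(p) for p in Path(directory).rglob("*.gz") if not is_gzip_intact(p)
    )
```

Calling find_bad_gzip_files on the local directory returns the files that would need to be regenerated or removed before rerunning the job. Reading in chunks keeps memory use flat even for large inputs.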



This article is reposted from 菜菜光's 51CTO blog. Original link: http://blog.51cto.com/caiguangguang/1436077. To republish it, please contact the original author.
