大家好,我在部署的时候发现了一个问题,我通过restAPI接口停掉了一个任务并保存了它的savepoint(步骤:/jobs/overview ---> /jobs/{jobid}/savepoints ---> /jobs/{jobid}/savepoints/{triggerid}),但我通过flink命令带上savepoint部署任务时会报错,但通过webui上传jar并带上savepoint就不会报错,报错堆栈如下:
2020-07-17 09:51:48,925 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Request slot with profile ResourceProfile{UNKNOWN} for job 7639673873b707aa86c4387aa7b4aac3 with allocation id e8865cdbfe4c3c33099c7112bc2e3231.
2020-07-17 09:51:48,952 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,952 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source -> Filter (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,953 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,953 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Co-Process (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,955 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,955 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:49,346 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,370 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,370 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,493 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from RUNNING to FAILED.
java.lang.Exception: Exception while creating StreamOperatorStateContext.
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:191)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:255)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for LegacyKeyedCoProcessOperator_65e7116c7aa972ad18a796ae22bd6327_(1/1) from any of the 1 provided restore options.
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:304)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:131)
... 9 more
Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected exception.
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:336)
at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:548)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:288)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
... 11 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85)
at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:221)
at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:168)
at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:151)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:279)
... 15 more
*来自志愿者整理的flink邮件归档
请问你使用哪个版本的 Flink 呢?能否分享一下 Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) 这个 tm 的 log 呢?从上面给的日志看,应该是在 083f69d029de 这台机器上。*来自志愿者整理的flink邮件归档
版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。