Flink Deployment Issues: How to Resolve Errors When Deploying a Job with a Savepoint

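Summary: Apache Flink is an open-source stream processing framework developed by the Apache Software Foundation. Its core is a distributed streaming dataflow engine written in Java and Scala. This collection gathers resources on Apache Flink techniques, usage tips, and best practices.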

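Question 1: Flink CLI deployment problem

Hi all, I ran into a problem during deployment. I stopped a job through the REST API and saved its savepoint (steps: /jobs/overview ---> /jobs/{jobid}/savepoints ---> /jobs/{jobid}/savepoints/{triggerid}). When I deploy the job with the flink command and pass that savepoint, it fails, but uploading the jar through the web UI with the savepoint works without errors. The error stack is as follows: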

2020-07-17 09:51:48,925 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Request slot with profile ResourceProfile{UNKNOWN} for job 7639673873b707aa86c4387aa7b4aac3 with allocation id e8865cdbfe4c3c33099c7112bc2e3231.

2020-07-17 09:51:48,952 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from SCHEDULED to DEPLOYING.

2020-07-17 09:51:48,952 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying Source: Custom Source -> Filter (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)

2020-07-17 09:51:48,953 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from SCHEDULED to DEPLOYING.

2020-07-17 09:51:48,953 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)

2020-07-17 09:51:48,954 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from SCHEDULED to DEPLOYING.

2020-07-17 09:51:48,954 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)

2020-07-17 09:51:48,954 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from SCHEDULED to DEPLOYING.

2020-07-17 09:51:48,954 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying Co-Process (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)

2020-07-17 09:51:48,955 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from SCHEDULED to DEPLOYING.

2020-07-17 09:51:48,955 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)

2020-07-17 09:51:49,346 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from DEPLOYING to RUNNING.

2020-07-17 09:51:49,370 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from DEPLOYING to RUNNING.

2020-07-17 09:51:49,370 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from DEPLOYING to RUNNING.

2020-07-17 09:51:49,377 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from DEPLOYING to RUNNING.

2020-07-17 09:51:49,377 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from DEPLOYING to RUNNING.

2020-07-17 09:51:49,493 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from RUNNING to FAILED.

java.lang.Exception: Exception while creating StreamOperatorStateContext.

at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:191)

at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:255)

at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)

at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)

at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)

at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)

at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)

at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)

at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)

at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for LegacyKeyedCoProcessOperator_65e7116c7aa972ad18a796ae22bd6327_(1/1) from any of the 1 provided restore options.

at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)

at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:304)

at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:131)

... 9 more

Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected exception.

at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:336)

at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:548)

at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:288)

at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)

at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)

... 11 more

Caused by: java.io.EOFException

at java.io.DataInputStream.readFully(DataInputStream.java:197)

at java.io.DataInputStream.readFully(DataInputStream.java:169)

at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85)

at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:221)

at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:168)

at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:151)

at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:279)

... 15 more

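*From the Flink mailing list archive, compiled by volunteers



Reference answer:

Which version of Flink are you using? Could you share the TaskManager (TM) log for Co-Process (1/1) (d0309f26a545e74643382ed3f758269b)? Judging from the log above, it should be on the machine 083f69d029de. *From the Flink mailing list archive, compiled by volunteers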
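
For reference, the REST workflow described in the question and the CLI resubmission can be sketched in Python roughly as follows. This is only a sketch under assumptions: the helper names, the base URL and the example IDs are hypothetical, and the helpers only build requests and parse responses, they do not contact a cluster. The request body keys (target-directory, cancel-job) and the response fields (status.id, operation.location) follow the Flink REST API as I understand it, so check them against the documentation for your version. The sketch hands out a savepoint path only once the status is COMPLETED, so a path is never passed to flink run -s while the savepoint is still being written.

```python
import json
from urllib.parse import quote


def trigger_savepoint_request(base_url, job_id, target_dir, cancel_job=True):
    """Build the POST /jobs/{jobid}/savepoints request (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/jobs/{quote(job_id)}/savepoints"
    body = json.dumps({"target-directory": target_dir, "cancel-job": cancel_job})
    return "POST", url, body


def savepoint_status_url(base_url, job_id, trigger_id):
    """Build the GET /jobs/{jobid}/savepoints/{triggerid} URL used for polling."""
    return f"{base_url.rstrip('/')}/jobs/{quote(job_id)}/savepoints/{quote(trigger_id)}"


def savepoint_location(status_response):
    """Return the savepoint path once the operation is COMPLETED, else None."""
    doc = json.loads(status_response)
    if doc.get("status", {}).get("id") != "COMPLETED":
        return None  # still in progress: keep polling
    operation = doc.get("operation", {})
    if "failure-cause" in operation:
        raise RuntimeError(f"savepoint failed: {operation['failure-cause']}")
    return operation.get("location")


def flink_run_command(savepoint_path, jar_path, flink_bin="flink"):
    """Build the argument list for: flink run -s <savepointPath> <jar>."""
    return [flink_bin, "run", "-s", savepoint_path, jar_path]


# Example with made-up values (not taken from the thread).
response = '{"status": {"id": "COMPLETED"}, "operation": {"location": "file:/tmp/sp/savepoint-1"}}'
path = savepoint_location(response)
if path is not None:
    print(" ".join(flink_run_command(path, "job.jar")))
```

Whether an incomplete savepoint has anything to do with the EOFException in this thread is not established; the TaskManager log requested above is still needed to tell.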



More answers to this question are available here:

https://developer.aliyun.com/ask/370235?spm=a2c6h.12873639.article-detail.58.6f9243783Lv0fl



Question 2: flink1.11 run

Hi, I wrote a Kafka-to-Hive program, but it fails to run. What is the cause?

Exception: The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: No operators defined in streaming topology. Cannot generate StreamGraph.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:699)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:232)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
Caused by: java.lang.IllegalStateException: No operators defined in streaming topology. Cannot generate StreamGraph.
at org.apache.flink.table.planner.utils.ExecutorUtils.generateStreamGraph(ExecutorUtils.java:47)
at org.apache.flink.table.planner.delegation.StreamExecutor.createPipeline(StreamExecutor.java:47)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:1197)
at com.akulaku.data.flink.StreamingWriteToHive.main(StreamingWriteToHive.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
... 11 more

Code:

StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
EnvironmentSettings settings = EnvironmentSettings.newInstance().inStreamingMode().useBlinkPlanner().build();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(environment, settings);

environment.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
environment.setStateBackend(new MemoryStateBackend());
environment.getCheckpointConfig().setCheckpointInterval(5000);

String name = "myhive";
String defaultDatabase = "tmp";
String hiveConfDir = "/etc/alternatives/hive-conf/";
String version = "1.1.0";

HiveCatalog hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version);
tableEnv.registerCatalog("myhive", hive);
tableEnv.useCatalog("myhive");

tableEnv.executeSql("CREATE TABLE tmp.user_behavior (\n" +
    " user_id BIGINT,\n" +
    " item_id STRING,\n" +
    " behavior STRING,\n" +
    " ts AS PROCTIME()\n" +
    ") WITH (\n" +
    " 'connector' = 'kafka-0.11',\n" +
    " 'topic' = 'user_behavior',\n" +
    " 'properties.bootstrap.servers' = 'localhost:9092',\n" +
    " 'properties.group.id' = 'testGroup',\n" +
    " 'scan.startup.mode' = 'earliest-offset',\n" +
    " 'format' = 'json',\n" +
    " 'json.fail-on-missing-field' = 'false',\n" +
    " 'json.ignore-parse-errors' = 'true'\n" +
    ")");

// tableEnv.executeSql("CREATE TABLE print_table (\n" +
//     " user_id BIGINT,\n" +
//     " item_id STRING,\n" +
//     " behavior STRING,\n" +
//     " tsdata STRING\n" +
//     ") WITH (\n" +
//     " 'connector' = 'print'\n" +
//     ")");
tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
tableEnv.executeSql("CREATE TABLE tmp.streamhivetest (\n" +
    " user_id BIGINT,\n" +
    " item_id STRING,\n" +
    " behavior STRING,\n" +
    " tsdata STRING\n" +
    ") STORED AS parquet TBLPROPERTIES (\n" +
    " 'sink.rolling-policy.file-size' = '12MB',\n" +
    " 'sink.rolling-policy.rollover-interval' = '1 min',\n" +
    " 'sink.rolling-policy.check-interval' = '1 min',\n" +
    " 'execution.checkpointing.interval' = 'true'\n" +
    ")");

tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
tableEnv.executeSql("insert into streamhivetest select user_id,item_id,behavior,DATE_FORMAT(ts, 'yyyy-MM-dd') as tsdata from user_behavior");

tableEnv.execute("stream-write-hive");

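*From the Flink mailing list archive, compiled by volunteers



Reference answer:

tableEnv.executeSql already submits the job, so there is no need to call execute afterwards. *From the Flink mailing list archive, compiled by volunteers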
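
To make the reasoning behind that answer concrete, here is a toy Python model of the behaviour. It is not the Flink or PyFlink API: ToyTableEnvironment and its methods are hypothetical stand-ins that mimic only one thing, namely that an INSERT passed to executeSql is submitted immediately and leaves nothing buffered, so a later execute call has no operators to build a job from.

```python
class ToyTableEnvironment:
    """Hypothetical stand-in for illustration only, NOT the Flink API."""

    def __init__(self):
        self.buffered = []   # operations waiting for an explicit execute()
        self.submitted = []  # statements already submitted as jobs

    def execute_sql(self, statement):
        # Mimics executeSql: an INSERT is submitted right away, nothing is buffered.
        if statement.strip().lower().startswith("insert"):
            self.submitted.append(statement)

    def execute(self, job_name):
        # Mimics execute: it can only run what was buffered beforehand.
        if not self.buffered:
            raise RuntimeError(
                "No operators defined in streaming topology. Cannot generate StreamGraph."
            )
        self.submitted.extend(self.buffered)
        self.buffered.clear()


env = ToyTableEnvironment()
env.execute_sql("insert into streamhivetest select user_id, item_id from user_behavior")
print(len(env.submitted))  # the INSERT has already been submitted

try:
    env.execute("stream-write-hive")  # nothing left to run
except RuntimeError as err:
    print(err)
```

In the program from the question this corresponds to dropping the final tableEnv.execute("stream-write-hive") line and keeping the executeSql calls.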



More answers to this question are available here:

https://developer.aliyun.com/ask/370234?spm=a2c6h.12873639.article-detail.59.6f9243783Lv0fl



Question 3: Re: pyflink1.11.0window

Does your source DDL declare time1 as a time attribute?

create table source1(
    id int,
    time1 timestamp,
    type string,
    WATERMARK FOR time1 as time1 - INTERVAL '2' SECOND
) with (...)

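*From the Flink mailing list archive, compiled by volunteers



Reference answer:

org.apache.flink.table.api.ValidationException: A tumble window expects a size value literal.
It looks like the code that defines the tumble window afterwards is not quite right.

*From the Flink mailing list archive, compiled by volunteers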



More answers to this question are available here:

https://developer.aliyun.com/ask/370233?spm=a2c6h.12873639.article-detail.60.6f9243783Lv0fl



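Question 4: Flink 1.11.2 reading and writing Hive, and supported Hive versions

I registered a HiveCatalog in Flink and want to stream Kafka data into a Hive table. However, when I create the Kafka table, its metadata is saved to Hive by default. Is there a way to avoid persisting this Kafka table's metadata? Without registering a HiveCatalog, I presumably cannot write data to Hive at all.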

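*From the Flink mailing list archive, compiled by volunteers



Reference answer:

CREATE TEMPORARY TABLE kafka_table...

There does not seem to be any documentation for this; I will open a JIRA to track it.

https://issues.apache.org/jira/browse/FLINK-18624

*From the Flink mailing list archive, compiled by volunteers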



More answers to this question are available here:

https://developer.aliyun.com/ask/370232?spm=a2c6h.12873639.article-detail.61.6f9243783Lv0fl



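Question 5: Flink on k8s: a job jar's avatica-core dependency and flink-table

I am currently migrating jobs to k8s. The current version is Flink 1.6, and on k8s the jobs run in standalone per-job mode.

I have run into a problem: a business team's Flink jar job uses the org.apache.calcite.avatica dependency, namely the following: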

<dependency>
    <groupId>org.apache.calcite.avatica</groupId>
    <artifactId>avatica-core</artifactId>
    <version>${avatica.version}</version>
</dependency>

However, the flink-table module also contains this dependency:

[image: image.png]

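Because Flink on k8s in standalone per-job mode places the job jar in Flink's own lib directory, the following error is reported when the job starts:

Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.calcite.avatica.ConnectionPropertiesImpl

As I understand it, the job jar contains the avatica-core dependency, and flink-table_2.11-1.6-RELEASE.jar under Flink's lib directory contains it as well. With both in the lib directory, a class conflict occurs.

How can I solve this problem? I look forward to your reply.

*From the Flink mailing list archive, compiled by volunteers



Reference answer:

If you only want to resolve the jar conflict, the maven shade plugin [1] may be useful to you.

[1] https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html

Best,

*From the Flink mailing list archive, compiled by volunteers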



More answers to this question are available here:

https://developer.aliyun.com/ask/370231?spm=a2c6h.12873639.article-detail.62.6f9243783Lv0fl
