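Question 1: Flink CLI deployment problem
Hi all, I ran into a problem while deploying. I stopped a job through the REST API and saved its savepoint (steps: /jobs/overview ---> /jobs/{jobid}/savepoints ---> /jobs/{jobid}/savepoints/{triggerid}). When I then deploy the job with the flink command and pass that savepoint, it fails, but uploading the jar through the web UI with the same savepoint works without error. The error stack is as follows: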
2020-07-17 09:51:48,925 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Request slot with profile ResourceProfile{UNKNOWN} for job 7639673873b707aa86c4387aa7b4aac3 with allocation id e8865cdbfe4c3c33099c7112bc2e3231.
2020-07-17 09:51:48,952 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,952 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source -> Filter (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,953 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,953 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom Source (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,954 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Co-Process (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:48,955 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from SCHEDULED to DEPLOYING.
2020-07-17 09:51:48,955 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (attempt #0) to e63d829deafc144cd82efd73979dd056 @ 083f69d029de (dataPort=35758)
2020-07-17 09:51:49,346 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process -> (Sink: Unnamed, Sink: Unnamed) (1/1) (618b75fcf5ea05fb5c6487bec6426e31) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,370 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (274b3df03e1fab627059c1a78e4a26da) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,370 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source (1/1) (141f0dc22b624b39e21127f637ba63c2) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source -> Filter (1/1) (1177659bff014e8dbc3f0508055d4307) switched from DEPLOYING to RUNNING.
2020-07-17 09:51:49,493 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Co-Process (1/1) (d0309f26a545e74643382ed3f758269b) switched from RUNNING to FAILED.
java.lang.Exception: Exception while creating StreamOperatorStateContext.
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:191)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:255)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for LegacyKeyedCoProcessOperator_65e7116c7aa972ad18a796ae22bd6327_(1/1) from any of the 1 provided restore options.
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:304)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:131)
... 9 more
Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected exception.
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:336)
at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:548)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:288)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121)
... 11 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85)
at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:221)
at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:168)
at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:151)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:279)
... 15 more
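*From the Flink mailing list archive compiled by volunteers
Reference answer:
Which version of Flink are you using? Could you share the TaskManager log for Co-Process (1/1) (d0309f26a545e74643382ed3f758269b)? Judging from the log above, that task should be on the machine 083f69d029de.
*From the Flink mailing list archive compiled by volunteers
More answers to this question can be found at: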
https://developer.aliyun.com/ask/370235?spm=a2c6h.12873639.article-detail.58.6f9243783Lv0fl
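For readers who want to reproduce the REST sequence from Question 1, here is a minimal sketch that only builds the two savepoint requests and does not send them. The base address, the job ID, the trigger ID and the target directory are all placeholders rather than values from the thread. The "target-directory" and "cancel-job" fields follow the Flink REST API as I understand it, so check them against the documentation for your version.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds (does not send) the REST calls for "stop with savepoint".
class SavepointRequests {
    // Assumed JobManager REST address; adjust to your cluster.
    static final String BASE = "http://localhost:8081";

    // POST /jobs/{jobid}/savepoints triggers a savepoint asynchronously.
    // "cancel-job": true asks Flink to cancel the job once the savepoint is taken.
    static HttpRequest trigger(String jobId, String targetDir) {
        String body = "{\"target-directory\":\"" + targetDir + "\",\"cancel-job\":true}";
        return HttpRequest.newBuilder(URI.create(BASE + "/jobs/" + jobId + "/savepoints"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // GET /jobs/{jobid}/savepoints/{triggerid} polls the operation started above.
    static HttpRequest status(String jobId, String triggerId) {
        return HttpRequest.newBuilder(
                        URI.create(BASE + "/jobs/" + jobId + "/savepoints/" + triggerId))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // "myJobId" and "myTriggerId" are hypothetical placeholders.
        HttpRequest t = trigger("myJobId", "file:///tmp/savepoints");
        HttpRequest s = status("myJobId", "myTriggerId");
        System.out.println(t.method() + " " + t.uri());
        System.out.println(s.method() + " " + s.uri());
    }
}
```

Once the status call reports completion, the savepoint location it returns is the path to pass to the CLI, for example with `flink run -s <savepointPath> ...`. Before restoring, it is worth checking that this path is complete and readable from the machine that runs the TaskManager.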
Question 2: flink 1.11 run
Hi, I wrote a program that writes from Kafka to Hive, but it fails to run. What is the cause?
Exception: The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: No operators defined in streaming topology. Cannot generate StreamGraph.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:302)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:198)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:149)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:699)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:232)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:916)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
Caused by: java.lang.IllegalStateException: No operators defined in streaming topology. Cannot generate StreamGraph.
at org.apache.flink.table.planner.utils.ExecutorUtils.generateStreamGraph(ExecutorUtils.java:47)
at org.apache.flink.table.planner.delegation.StreamExecutor.createPipeline(StreamExecutor.java:47)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:1197)
at com.akulaku.data.flink.StreamingWriteToHive.main(StreamingWriteToHive.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:288)
... 11 more
Code:
StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
EnvironmentSettings settings = EnvironmentSettings.newInstance().inStreamingMode().useBlinkPlanner().build();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(environment, settings);

environment.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
environment.setStateBackend(new MemoryStateBackend());
environment.getCheckpointConfig().setCheckpointInterval(5000);

String name = "myhive";
String defaultDatabase = "tmp";
String hiveConfDir = "/etc/alternatives/hive-conf/";
String version = "1.1.0";

HiveCatalog hive = new HiveCatalog(name, defaultDatabase, hiveConfDir, version);
tableEnv.registerCatalog("myhive", hive);
tableEnv.useCatalog("myhive");

tableEnv.executeSql("CREATE TABLE tmp.user_behavior (\n" +
    " user_id BIGINT,\n" +
    " item_id STRING,\n" +
    " behavior STRING,\n" +
    " ts AS PROCTIME()\n" +
    ") WITH (\n" +
    " 'connector' = 'kafka-0.11',\n" +
    " 'topic' = 'user_behavior',\n" +
    " 'properties.bootstrap.servers' = 'localhost:9092',\n" +
    " 'properties.group.id' = 'testGroup',\n" +
    " 'scan.startup.mode' = 'earliest-offset',\n" +
    " 'format' = 'json',\n" +
    " 'json.fail-on-missing-field' = 'false',\n" +
    " 'json.ignore-parse-errors' = 'true'\n" +
    ")");

// tableEnv.executeSql("CREATE TABLE print_table (\n" +
//     " user_id BIGINT,\n" +
//     " item_id STRING,\n" +
//     " behavior STRING,\n" +
//     " tsdata STRING\n" +
//     ") WITH (\n" +
//     " 'connector' = 'print'\n" +
//     ")");

tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
tableEnv.executeSql("CREATE TABLE tmp.streamhivetest (\n" +
    " user_id BIGINT,\n" +
    " item_id STRING,\n" +
    " behavior STRING,\n" +
    " tsdata STRING\n" +
    ") STORED AS parquet TBLPROPERTIES (\n" +
    " 'sink.rolling-policy.file-size' = '12MB',\n" +
    " 'sink.rolling-policy.rollover-interval' = '1 min',\n" +
    " 'sink.rolling-policy.check-interval' = '1 min',\n" +
    " 'execution.checkpointing.interval' = 'true'\n" +
    ")");

tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
tableEnv.executeSql("insert into streamhivetest select user_id,item_id,behavior,DATE_FORMAT(ts, 'yyyy-MM-dd') as tsdata from user_behavior");

tableEnv.execute("stream-write-hive");
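*From the Flink mailing list archive compiled by volunteers
Reference answer:
tableEnv.executeSql already submits the job, so there is no need to call execute afterwards.
*From the Flink mailing list archive compiled by volunteers
More answers to this question can be found at: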
https://developer.aliyun.com/ask/370234?spm=a2c6h.12873639.article-detail.59.6f9243783Lv0fl
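To make the reasoning behind this answer more concrete, here is a toy model in plain Java. It is not Flink code: `ToyTableEnv` and its fields are hypothetical, and it only imitates the behaviour described in the answer. As far as I understand Flink 1.11, an INSERT passed to `executeSql` is submitted immediately, while `execute` only submits statements that were buffered earlier (for example through the older `sqlUpdate`), so it finds nothing to build a graph from.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT Flink code): imitates the split between executeSql() and execute().
class ToyTableEnv {
    // Statements waiting for execute(); only the legacy sqlUpdate() fills this list.
    private final List<String> buffered = new ArrayList<>();
    // Jobs that have been handed over for execution.
    final List<String> submitted = new ArrayList<>();

    // Legacy style: the statement is only buffered until execute() is called.
    void sqlUpdate(String insert) {
        buffered.add(insert);
    }

    // Newer style: the statement is submitted right away and nothing is buffered.
    void executeSql(String insert) {
        submitted.add(insert);
    }

    // Submits the buffered statements; fails when the buffer is empty.
    void execute(String jobName) {
        if (buffered.isEmpty()) {
            throw new IllegalStateException("No operators defined in streaming topology.");
        }
        submitted.add(jobName + ": " + String.join("; ", buffered));
        buffered.clear();
    }

    public static void main(String[] args) {
        ToyTableEnv env = new ToyTableEnv();
        env.executeSql("INSERT INTO streamhivetest SELECT ...");
        System.out.println("jobs submitted by executeSql: " + env.submitted.size());
        try {
            env.execute("stream-write-hive"); // nothing was buffered, so this fails
        } catch (IllegalStateException e) {
            System.out.println("execute() failed: " + e.getMessage());
        }
    }
}
```

In the program from the question, the corresponding change would be to delete the final `tableEnv.execute("stream-write-hive")` call and keep the `executeSql` statements as they are.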
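Question 3: Re: pyflink1.11.0window
Does your source DDL declare time1 as a time attribute?
create table source1(
  id int,
  time1 timestamp,
  type string,
  WATERMARK FOR time1 as time1 - INTERVAL '2' SECOND
) with (...)
*From the Flink mailing list archive compiled by volunteers
Reference answer:
org.apache.flink.table.api.ValidationException: A tumble window expects a size value literal.
It looks like the code that defines the tumble window is not quite right.
*From the Flink mailing list archive compiled by volunteers
More answers to this question can be found at: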
https://developer.aliyun.com/ask/370233?spm=a2c6h.12873639.article-detail.60.6f9243783Lv0fl
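The failing window definition is not shown in the thread, so the following is only a rough sketch of what a window with a literal size can look like when expressed in SQL. The helper class `TumbleQuery` is hypothetical and only assembles the query text. The table and column names come from the DDL above, while the 10-second size and the aggregate are assumptions made for illustration.

```java
// Sketch: builds a group-window query whose window size is an interval literal.
class TumbleQuery {
    // Renders the size as a literal such as INTERVAL '10' SECOND.
    // Limited to 1..99 because a larger value would need an explicit precision, e.g. SECOND(3).
    static String intervalLiteral(int seconds) {
        if (seconds < 1 || seconds > 99) {
            throw new IllegalArgumentException("seconds must be between 1 and 99");
        }
        return "INTERVAL '" + seconds + "' SECOND";
    }

    // Counts rows per tumbling window on the given time attribute.
    static String countPerWindow(String table, String timeAttr, int seconds) {
        String size = intervalLiteral(seconds);
        return "SELECT TUMBLE_START(" + timeAttr + ", " + size + ") AS window_start, COUNT(*) AS cnt"
                + " FROM " + table
                + " GROUP BY TUMBLE(" + timeAttr + ", " + size + ")";
    }

    public static void main(String[] args) {
        // In a real job this text would be passed to the table environment as a SQL query.
        System.out.println(countPerWindow("source1", "time1", 10));
    }
}
```

The point of the sketch is that the window size appears in the statement as a constant interval, not as a column reference or a computed expression.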
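Question 4: Flink 1.11.2: reading and writing Hive, and supported Hive versions
I registered a HiveCatalog in Flink and want to stream Kafka data into a Hive table. However, when I create the Kafka table, its metadata is saved to Hive by default. Is there a way to avoid persisting the metadata of this Kafka table? If I do not register the HiveCatalog, I assume there is no way to write data to Hive.
*From the Flink mailing list archive compiled by volunteers
Reference answer:
CREATE TEMPORARY TABLE kafka_table...
This does not seem to be documented yet. I will open a JIRA issue to track it:
https://issues.apache.org/jira/browse/FLINK-18624
*From the Flink mailing list archive compiled by volunteers
More answers to this question can be found at: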
https://developer.aliyun.com/ask/370232?spm=a2c6h.12873639.article-detail.61.6f9243783Lv0fl
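The answer above abbreviates the statement, so here is a hedged sketch of what a complete temporary-table DDL might look like. The helper class `TemporaryKafkaTable` is hypothetical, and the columns and connector options are borrowed from the program in Question 2 as an assumption about the asker's setup. A temporary table exists only for the current session, so it should not be written to the Hive Metastore.

```java
// Sketch: assembles a CREATE TEMPORARY TABLE statement for a Kafka source table.
class TemporaryKafkaTable {
    // Schema and options are illustrative; adjust them to the real topic.
    static String ddl(String tableName, String topic, String bootstrapServers) {
        return "CREATE TEMPORARY TABLE " + tableName + " (\n"
                + "  user_id BIGINT,\n"
                + "  item_id STRING,\n"
                + "  behavior STRING\n"
                + ") WITH (\n"
                + "  'connector' = 'kafka-0.11',\n"
                + "  'topic' = '" + topic + "',\n"
                + "  'properties.bootstrap.servers' = '" + bootstrapServers + "',\n"
                + "  'properties.group.id' = 'testGroup',\n"
                + "  'scan.startup.mode' = 'earliest-offset',\n"
                + "  'format' = 'json'\n"
                + ")";
    }

    public static void main(String[] args) {
        // In a real job this text would be passed to tableEnv.executeSql(...).
        System.out.println(ddl("kafka_table", "user_behavior", "localhost:9092"));
    }
}
```

With this approach the HiveCatalog can stay registered for the Hive sink table while the Kafka source definition remains local to the job.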
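Question 5: Flink on k8s: avatica-core dependency of a jar job and flink-table
I am migrating jobs to k8s. The current version is Flink 1.6, and the jobs on k8s run in standalone per-job mode.
I have run into a problem: the business team's Flink jar job uses the org.apache.calcite.avatica dependency, that is, the following dependency:
<dependency>
  <groupId>org.apache.calcite.avatica</groupId>
  <artifactId>avatica-core</artifactId>
  <version>${avatica.version}</version>
</dependency>
However, the flink-table module also contains this dependency (shown in a screenshot in the original message).
In Flink on k8s standalone per-job mode, the Flink job jar is placed into Flink's own lib directory, so when the job starts it reports the following error:
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.calcite.avatica.ConnectionPropertiesImpl
My understanding is that the Flink jar job bundles the avatica-core dependency, and flink-table_2.11-1.6-RELEASE.jar in the Flink lib directory contains it as well. Since both end up in the lib directory, a class conflict occurs.
How can I solve this problem? I look forward to your reply.
*From the Flink mailing list archive compiled by volunteers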
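Reference answer:
If you only want to resolve the jar conflict, the maven shade plugin [1] may be useful to you.
[1] https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
*From the Flink mailing list archive compiled by volunteers
More answers to this question can be found at: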
https://developer.aliyun.com/ask/370231?spm=a2c6h.12873639.article-detail.62.6f9243783Lv0fl
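As a rough illustration of the class relocation suggested in the answer, the following pom.xml fragment moves the job's own copy of avatica into a private package. The plugin version and the `myjob.shaded` prefix are assumptions chosen for this sketch, not values from the thread. Whether relocation alone resolves the error depends on how the job uses avatica.

```xml
<!-- Sketch of a shade configuration for the job's pom.xml (inside <build><plugins>). -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Assumed version; use the one that fits your build. -->
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Package bundled by both the job jar and flink-table. -->
            <pattern>org.apache.calcite.avatica</pattern>
            <!-- Hypothetical private prefix for the job's own copy. -->
            <shadedPattern>myjob.shaded.org.apache.calcite.avatica</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After relocation, the job's classes refer to the renamed package inside the shaded jar, so they should no longer compete with the copy that ships in flink-table.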