1. Submit the job with the following command:
./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.cluster-id=my-first-application-cluster \
    -Dkubernetes.container.image=registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1 \
    -Dkubernetes.container.image.pull-policy=Always \
    -Dkubernetes.container-start-command-template="%java% %classpath% %jvmmem% %jvmopts% %logging% %class% %args%" \
    local:///opt/flink/usrlib/WordCount.jar
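2. After the job is submitted, the pod fails and keeps restarting. kubectl logs shows the following error:
/docker-entrypoint.sh: 125: exec: native-k8s: not found
3. I checked the docker-entrypoint.sh script in the image and it has no native-k8s command. The image is built on top of the latest Flink image, with this Dockerfile:
FROM flink:latest
RUN mkdir -p /opt/flink/usrlib
COPY ./WordCount.jar /opt/flink/usrlib/WordCount.jar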
4. Output of kubectl describe for the pod:
Name:           my-first-application-cluster-59c4445df4-4ss2m
Namespace:      default
Priority:       0
Node:           minikube/192.168.64.2
Start Time:     Wed, 23 Dec 2020 17:06:02 +0800
Labels:         app=my-first-application-cluster
                component=jobmanager
                pod-template-hash=59c4445df4
                type=flink-native-kubernetes
Annotations:
Status:         Running
IP:             172.17.0.3
IPs:
  IP:           172.17.0.3
Controlled By:  ReplicaSet/my-first-application-cluster-59c4445df4
Containers:
  flink-job-manager:
    Container ID:  docker://b8e5759488af5fd3e3273f69d42890d9750d430cbd6e18b1d024ab83293d0124
    Image:         registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1
    Image ID:      docker-pullable://registry.cn-shenzhen.aliyuncs.com/syni_test/flink@sha256:53a2cec0d0a532aa5d79c241acfdd13accb9df78eb951eb4e878485174186aa8
    Ports:         8081/TCP, 6123/TCP, 6124/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /docker-entrypoint.sh
    Args:
      native-k8s
      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    127
      Started:      Wed, 23 Dec 2020 17:37:28 +0800
      Finished:     Wed, 23 Dec 2020 17:37:28 +0800
    Ready:          False
    Restart Count:  11
    Limits:
      cpu:     1
      memory:  1600Mi
    Requests:
      cpu:     1
      memory:  1600Mi
    Environment:
      _POD_IP_ADDRESS:   (v1:status.podIP)
    Mounts:
      /opt/flink/conf from flink-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9hdqt (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  flink-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      flink-config-my-first-application-cluster
    Optional:  false
  default-token-9hdqt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9hdqt
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                 From               Message
  Normal   Scheduled  15d                 default-scheduler  Successfully assigned default/my-first-application-cluster-59c4445df4-4ss2m to minikube
  Normal   Pulled     15d                 kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 513.7913ms
  Normal   Pulled     15d                 kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 374.1125ms
  Normal   Pulled     15d                 kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 360.6719ms
  Normal   Created    15d (x4 over 15d)   kubelet            Created container flink-job-manager
  Normal   Started    15d (x4 over 15d)   kubelet            Started container flink-job-manager
  Normal   Pulled     15d                 kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 374.2637ms
  Normal   Pulling    15d (x5 over 15d)   kubelet            Pulling image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1"
  Warning  BackOff    15d (x141 over 15d) kubelet            Back-off restarting failed container
*From the Flink mailing list archive compiled by volunteers
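The "exec: native-k8s: not found" message together with exit code 127 suggests that the entrypoint script did not recognise native-k8s as one of its modes and tried to run it as a program instead. A rough way to check this is to search the image's entrypoint script for that mode name. The sketch below is only a heuristic text search, and entrypoint_mentions_mode is a hypothetical helper name, not part of Flink or Docker.

```shell
# Hypothetical helper: succeeds if the given script file contains the mode name.
# This is a plain text search, not a real parse of the entrypoint script.
entrypoint_mentions_mode() {
    script_path="$1"
    mode="$2"
    # -q: no output, -F: treat the mode as a fixed string, --: end of options
    grep -qF -- "$mode" "$script_path"
}

# Example usage (assumes docker is installed and the image from the question is pullable):
#   docker run --rm --entrypoint cat \
#       registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1 /docker-entrypoint.sh \
#       > /tmp/docker-entrypoint.sh
#   entrypoint_mentions_mode /tmp/docker-entrypoint.sh native-k8s \
#       || echo "entrypoint script does not mention native-k8s"
```

If the helper reports that the script does not mention native-k8s, the image's entrypoint most likely does not match what the client expects to call.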
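The root cause is that your client is version 1.12, while the base image you built from is 1.11, because the 1.12 image has not been published to Docker Hub yet. Rebuild the image yourself with the correct Dockerfile [1], then run the job again and see whether it works.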
[1] https://github.com/apache/flink-docker/tree/master/1.12/scala_2.12-java8-debian
*From the Flink mailing list archive compiled by volunteers
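Since the failure comes from a client and an image on different minor versions, a small pre-flight comparison can catch the mismatch before submitting. The following is a minimal sketch: major_minor and same_minor_version are hypothetical helper names, and the version strings are assumed to be supplied by hand (for example, read from the client distribution and from the tag of the base image).

```shell
# Hypothetical helper: prints the major.minor part of a version such as 1.12.0.
major_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Hypothetical helper: succeeds only if both versions share the same major.minor.
same_minor_version() {
    [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Example usage with the versions named in the answer:
#   same_minor_version 1.12 1.11 || echo "client and image are on different Flink versions"
```

Comparing only major.minor is a deliberate simplification, on the assumption that patch releases within one minor line keep the same entrypoint contract.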