Flink CDC里为什么我的 flink job 一直卡在 DEPLOYING 不动啊？

Flink CDC里为什么我的 flink job 一直卡在 DEPLOYING 不动啊？
2024-01-23 01:52:44,015 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Registered job manager 00000000000000000000000000000000@pekko.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_2 for job 04ea74dd688e3908309b92f28207761a.
2024-01-23 01:52:44,018 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
2024-01-23 01:52:44,028 INFO org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager [] - Received resource requirements from job 04ea74dd688e3908309b92f28207761a: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}]
2024-01-23 01:52:44,098 INFO org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager [] - Matching resource requirements against available resources.
Missing resources:
Job 04ea74dd688e3908309b92f28207761a
ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}
Current resources:
TaskManager 10.42.0.120:38895-bfc5cd
Available: ResourceProfile{cpuCores=1, taskHeapMemory=384.000mb (402653174 bytes), taskOffHeapMemory=0 bytes, managedMemory=512.000mb (536870920 bytes), networkMemory=128.000mb (134217730 bytes)}
Total: ResourceProfile{cpuCores=1, taskHeapMemory=384.000mb (402653174 bytes), taskOffHeapMemory=0 bytes, managedMemory=512.000mb (536870920 bytes), networkMemory=128.000mb (134217730 bytes)}
2024-01-23 01:52:44,105 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DefaultSlotStatusSyncer [] - Starting allocation of slot 678a4196ca38b40c863833981baeb169 from 10.42.0.120:38895-bfc5cd for job 04ea74dd688e3908309b92f28207761a with resource profile ResourceProfile{cpuCores=1, taskHeapMemory=384.000mb (402653174 bytes), taskOffHeapMemory=0 bytes, managedMemory=512.000mb (536870920 bytes), networkMemory=128.000mb (134217730 bytes)}.
2024-01-23 01:52:44,364 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: job[1] -> DropUpdateBefore[2] -> ConstraintEnforcer[3] -> Sink: skill_upp_table_job[3] (1/1) (0d41fb8db7f44cfb64adb97f8248b1c7_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from SCHEDULED to DEPLOYING.
2024-01-23 01:52:44,374 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying Source: job[1] -> DropUpdateBefore[2] -> ConstraintEnforcer[3] -> Sink: skill_upp_table_job[3] (1/1) (attempt #0) with attempt id 0d41fb8db7f44cfb64adb97f8248b1c7_cbc357ccb763df2852fee8c4fc7d55f2_0_0 and vertex id cbc357ccb763df2852fee8c4fc7d55f2_0 to 10.42.0.120:38895-bfc5cd @ ip-10-42-0-120.ap-southeast-1.compute.internal (dataPort=40729) with allocation id 678a4196ca38b40c863833981baeb169
2024-01-23 01:52:45,698 INFO com.ververica.cdc.connectors.mysql.source.assigners.MySqlSnapshotSplitAssigner [] - Split table skill_upp.job into 1 chunks, time cost: 1701ms.

根据您提供的信息，Flink CDC作业在部署过程中卡住的原因可能是资源不足。从日志中可以看到，作业需要1个slot，但当前可用的资源只有1个TaskManager，且其可用资源与作业所需的资源不匹配。

要解决这个问题，您可以尝试以下方法：

增加TaskManager的数量：通过增加TaskManager的数量，可以为作业提供更多的资源，从而解决资源不足的问题。
调整作业的资源需求：根据您的实际情况，可以适当调整作业的资源需求，例如减少所需的slot数量或降低内存需求。
优化作业代码：检查作业代码，看是否有可以优化的地方，以提高作业的执行效率，从而减少对资源的占用。
调整Flink集群的配置：根据实际情况，可以调整Flink集群的配置，例如增加TaskManager的内存大小或调整slot分配策略等，以满足作业的资源需求。

Flink CDC里为什么我的 flink job 一直卡在 DEPLOYING 不动啊？

实时计算 Flink

相关文章

热门讨论

热门文章