I've observed a phenomenon: I defined a tumble window that calls a Python UDF, and inside the UDF I use requests to send a REST API call. The log shows the UDF is invoked twice, less than a second apart. What causes this? Is the requests library conflicting with Beam?
2020-07-09 17:44:17,501 INFO flink_test_stream_time_kafka.py:22 [] - start to ad
2020-07-09 17:44:17,530 INFO flink_test_stream_time_kafka.py:63 [] - start to send rest api.
2020-07-09 17:44:17,532 INFO flink_test_stream_time_kafka.py:69 [] - Receive: {"Received": "successful"}
2020-07-09 17:44:17,579 INFO /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py:564 [] - Creating insecure state channel for localhost:57954.
2020-07-09 17:44:17,580 INFO /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py:571 [] - State channel established.
2020-07-09 17:44:17,584 INFO /home/sysadmin/miniconda3/lib/python3.7/site-packages/apache_beam/runners/worker/data_plane.py:526 [] - Creating client data channel for localhost:60902
2020-07-09 17:44:17,591 INFO org.apache.beam.runners.fnexecution.data.GrpcDataService [] - Beam Fn Data client connected.
2020-07-09 17:44:17,761 INFO flink_test_stream_time_kafka.py:22 [] - start to ad
2020-07-09 17:44:17,810 INFO flink_test_stream_time_kafka.py:63 [] - start to send rest api.
2020-07-09 17:44:17,812 INFO flink_test_stream_time_kafka.py:69 [] - Receive: {"Received": "successful"}
*From the volunteer-curated Flink mailing list archive
Before execution, a Table API job goes through a series of rule-based optimizations, and in the final execution plan it is possible for a UDF to be invoked multiple times. You can print the execution plan to check (TableEnvironment#explain). *From the volunteer-curated Flink mailing list archive
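Since the optimized plan may legitimately call the UDF more than once for the same input, a side effect such as a REST POST can fire twice even though the logic is correct. One hedged workaround (a sketch, not Flink-specific: `send_rest` stands in for the real `requests.post` call, and the key/payload names are made up for illustration) is to make the side effect idempotent by caching it on the input key inside the UDF:

```python
# Sketch: deduplicate a side effect inside a UDF that the planner
# may invoke multiple times for the same input row.

_sent = {}            # key -> cached API response
calls = {"n": 0}      # counts how often the API is actually hit

def send_rest(payload):
    # Placeholder for the real call, e.g. requests.post(URL, json=payload).
    calls["n"] += 1
    return {"Received": "successful"}

def notify_once(key, payload):
    """Send the REST call only the first time this key is seen;
    repeated invocations return the cached response."""
    if key not in _sent:
        _sent[key] = send_rest(payload)
    return _sent[key]

# Even if the plan calls the UDF twice, the API is hit once:
first = notify_once("window-1", {"v": 1})
second = notify_once("window-1", {"v": 1})
```

Note the cache here is per Python worker process; if deduplication must hold across workers or restarts, the receiving service itself needs to be idempotent.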