Building a Streaming Data Lake with Flink Hudi

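Summary: In this article, author Chen Yuzhao describes how Flink Hudi uses stream computing to keep evolving the original mini-batch-based incremental computation model.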

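This article describes how Flink Hudi uses stream computing to progressively optimize the original mini-batch-based incremental computation model. Users can write CDC data into Hudi storage in real time through Flink SQL, and the upcoming Hudi 0.9 release natively supports the CDC format. The article covers: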

  1. Background
  2. Incremental ETL
  3. Demo

GitHub repository
https://github.com/apache/flink
You are welcome to give Flink a star.

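I. Background

Near Real Time

Since 2016, the Apache Hudi community has been exploring near-real-time use cases built on Hudi's UPSERT capability [1]. With the batch processing model of MR/Spark, users can ingest data into HDFS/OSS at hour-level latency. In purely real-time scenarios, an architecture that combines the stream computing engine Flink with KV/OLAP storage delivers end-to-end real-time analytics at second-level (5-minute-level) latency. Yet a large number of use cases fall between second-level (5-minute-level) and hour-level latency; we call these NEAR-REAL-TIME.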


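In practice, many cases fall into the near-real-time category:

  1. Minute-level dashboards;
  2. All kinds of BI analysis (OLAP);
  3. Minute-level feature extraction for machine learning.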

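Incremental Computation

The solution space for near real time is still fairly open.

  • Stream processing has low latency, but its SQL patterns are relatively fixed and query-side capabilities (indexes, ad hoc queries) are lacking;
  • Batch-oriented data warehouses are rich in capabilities, but data latency is high.

So the Hudi community proposed an incremental computation model based on mini-batches:

incremental dataset => incremental result merged into the stored result => external storage

This model pulls the incremental dataset (the data between two commits) from the snapshots of the lake storage, computes the incremental result (a simple count, for example) with a batch framework such as Spark or Hive, and then merges it into the stored result.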

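Core Problems

The core problems the incremental model has to solve:

  1. UPSERT capability: like KUDU and Hive ACID, Hudi provides minute-level updates;
  2. Incremental consumption: Hudi provides incremental pull through the multiple snapshots of lake storage.

The mini-batch-based incremental model can improve latency in some scenarios and save compute cost, but it has one major limitation: it constrains the SQL patterns that can be used. Because the computation runs as batch, and batch computation does not maintain state, the computed metrics must be easy to merge. Simple count and sum work, but avg and count distinct still require pulling the full data and recomputing.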
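To make the merge constraint concrete, here is a minimal Flink SQL sketch. The `old_result` and `new_result` relations, their column names, and their values are hypothetical stand-ins for the previously stored result and for the partial result computed over one increment; they are not part of Hudi.

```sql
-- Hypothetical partial aggregates per key: a row count and a sum of some metric.
-- old_result: what earlier runs stored; new_result: what the latest increment produced.
SELECT
  item,
  SUM(row_cnt)    AS row_cnt,     -- counts merge by addition: 3 + 2 = 5
  SUM(amount_sum) AS amount_sum   -- sums merge by addition: 30 + 10 = 40
FROM (
  SELECT item, row_cnt, amount_sum
  FROM (VALUES ('a', 3, 30)) AS old_result(item, row_cnt, amount_sum)
  UNION ALL
  SELECT item, row_cnt, amount_sum
  FROM (VALUES ('a', 2, 10)) AS new_result(item, row_cnt, amount_sum)
) AS u
GROUP BY item;
```

With these numbers the stored average is 10 (30 / 3) and the increment's average is 5 (10 / 2). The true merged average is 40 / 5 = 8, while averaging the two averages gives 7.5, so a stored avg alone cannot be merged; it only becomes mergeable if the sum and count are kept alongside it. Two stored distinct counts cannot be merged at all, because the overlap between the two sets is unknown.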

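As stream computing and real-time data warehouses become mainstream, the Hudi community is actively embracing the change and keeps evolving the original mini-batch-based incremental model through stream computing: version 0.7 introduced streaming ingestion into the lake, and version 0.9 supports a native CDC format.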

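II. Incremental ETL

Ingesting DB Data into the Lake

As CDC technology matures, CDC tools such as Debezium are becoming increasingly popular, and the Hudi community has successively integrated streaming write and streaming read capabilities. Users can write CDC data into Hudi storage in real time through Flink SQL: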


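  • Users can import DB data into Hudi directly through the Flink CDC connector;
  • or first import the CDC data into Kafka and then load it into Hudi through the Kafka connector.

The second approach offers better fault tolerance and scalability.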
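As a rough sketch of the second path, the statements below read Debezium-formatted change events from Kafka and write them to a Hudi table. The topic, broker address, group id, table names, and path are hypothetical placeholders; the schema is borrowed from the demo later in this article, and the Kafka connector and Hudi bundle jars are assumed to be on the classpath.

```sql
-- Source: a Kafka topic carrying Debezium JSON change events
-- (topic, brokers, and group id are hypothetical).
CREATE TABLE products_kafka (
  id INT NOT NULL,
  ts BIGINT,
  name STRING,
  description STRING,
  weight DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'products_binlog',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'hudi-ingest',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'debezium-json'
);

-- Sink: a Hudi table keyed by id (the path is hypothetical).
CREATE TABLE products_hudi (
  id INT NOT NULL PRIMARY KEY NOT ENFORCED,
  ts BIGINT,
  name STRING,
  description STRING,
  weight DOUBLE
) WITH (
  'connector' = 'hudi',
  'path' = 'file:///tmp/hudi-demo/products_hudi',
  'table.type' = 'MERGE_ON_READ'
);

-- Continuously apply the change events to the Hudi table.
INSERT INTO products_hudi SELECT * FROM products_kafka;
```

Putting Kafka in between decouples the database from the lake writer: the change events are buffered and replayable, which is one plausible reason for the better fault tolerance and scalability mentioned above.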

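Data Lake CDC

In the upcoming 0.9 release, Hudi natively supports the CDC format, so every change made to a record can be retained. On this basis Hudi integrates more completely with stream computing systems and can serve CDC data as a streaming read [2].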


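All change messages from the source CDC stream are retained after they enter the lake and are available for streaming consumption. Flink's stateful computation accumulates results in real time (state) and syncs the computed changes to Hudi lake storage through streaming writes. Flink can then stream-consume the changelog stored in Hudi to drive the next layer of stateful computation. Together this forms a near-real-time, end-to-end ETL pipeline.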


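This architecture shortens end-to-end ETL latency to the minute level, and the storage at each layer can be compacted into a columnar format (Parquet, ORC) to provide OLAP analysis. Thanks to the openness of the data lake, the compacted format can be queried by a variety of engines: Flink, Spark, Presto, Hive, and so on.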
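One layer of such a pipeline might look like the following sketch. The table names `ods_products` and `dws_products_by_name`, their columns, and their paths are hypothetical; the Hudi options are the ones used in the demo later in this article. The downstream table carries a `ts` column on the assumption that the Hudi writer expects a precombine field with that default name.

```sql
-- Table hints are assumed to be disabled by default in Flink 1.13.
SET 'table.dynamic-table-options.enabled' = 'true';

-- Upstream layer: a Hudi table that retains the full changelog.
CREATE TABLE ods_products (
  id INT NOT NULL PRIMARY KEY NOT ENFORCED,
  ts BIGINT,
  name STRING,
  description STRING,
  weight DOUBLE
) WITH (
  'connector' = 'hudi',
  'path' = 'file:///tmp/hudi-demo/ods_products',
  'table.type' = 'MERGE_ON_READ',
  'changelog.enabled' = 'true'
);

-- Downstream layer: per-name aggregates, keyed by name.
CREATE TABLE dws_products_by_name (
  name STRING PRIMARY KEY NOT ENFORCED,
  ts BIGINT,
  product_cnt BIGINT,
  total_weight DOUBLE
) WITH (
  'connector' = 'hudi',
  'path' = 'file:///tmp/hudi-demo/dws_products_by_name',
  'table.type' = 'MERGE_ON_READ',
  'changelog.enabled' = 'true'
);

-- Stream-read the upstream changelog, aggregate with Flink state,
-- and stream-write the changing results to the next layer.
INSERT INTO dws_products_by_name
SELECT name, MAX(ts) AS ts, COUNT(*) AS product_cnt, SUM(weight) AS total_weight
FROM ods_products /*+ OPTIONS('read.streaming.enabled' = 'true') */
WHERE name IS NOT NULL
GROUP BY name;
```

Because the upstream read delivers retractions and deletes as well as inserts, the aggregate stays correct when source rows are updated or removed, which is what distinguishes this from the mini-batch merge model.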

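A Hudi data lake table has two forms:

  • Table form: query the latest snapshot result, backed by an efficient columnar format
  • Stream form: consume changes as a stream; reading can start from any specified position and returns the changelog after it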
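The two forms correspond to two ways of querying the same table. The sketch below assumes the `hoodie_table` defined in the demo later in this article. The commit timestamp is a hypothetical value, and the option name `read.streaming.start-commit` is an assumption based on the Hudi 0.9 Flink options; it may differ in other releases, so check the configuration reference for the version in use.

```sql
-- Table hints are assumed to be disabled by default in Flink 1.13.
SET 'table.dynamic-table-options.enabled' = 'true';

-- Table form: a bounded read of the latest snapshot.
SELECT * FROM hoodie_table;

-- Stream form: an unbounded read of the changelog from a chosen commit onward.
SELECT * FROM hoodie_table
/*+ OPTIONS(
  'read.streaming.enabled' = 'true',
  'read.streaming.start-commit' = '20210820000000'
) */;
```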

III. Demo

We use a short demo to show the two forms of a Hudi table.

Environment Setup

  • Flink SQL Client
  • hudi-flink-bundle jar built from Hudi master
  • Flink 1.13.1

Prepare a batch of CDC data in debezium-json format in advance:

{"before":null,"after":{"id":101,"ts":1000,"name":"scooter","description":"Small 2-wheel scooter","weight":3.140000104904175},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606100,"transaction":null}
{"before":null,"after":{"id":102,"ts":2000,"name":"car battery","description":"12V car battery","weight":8.100000381469727},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":103,"ts":3000,"name":"12-pack drill bits","description":"12-pack of drill bits with sizes ranging from #40 to #3","weight":0.800000011920929},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":104,"ts":4000,"name":"hammer","description":"12oz carpenter's hammer","weight":0.75},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":105,"ts":5000,"name":"hammer","description":"14oz carpenter's hammer","weight":0.875},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":106,"ts":6000,"name":"hammer","description":"16oz carpenter's hammer","weight":1},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":107,"ts":7000,"name":"rocks","description":"box of assorted rocks","weight":5.300000190734863},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":108,"ts":8000,"name":"jacket","description":"water resistent black wind breaker","weight":0.10000000149011612},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":null,"after":{"id":109,"ts":9000,"name":"spare tire","description":"24 inch spare tire","weight":22.200000762939453},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":0,"snapshot":"true","db":"inventory","table":"products","server_id":0,"gtid":null,"file":"mysql-bin.000003","pos":154,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1589355606101,"transaction":null}
{"before":{"id":106,"ts":6000,"name":"hammer","description":"16oz carpenter's hammer","weight":1},"after":{"id":106,"ts":10000,"name":"hammer","description":"18oz carpenter hammer","weight":1},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589361987000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":362,"row":0,"thread":2,"query":null},"op":"u","ts_ms":1589361987936,"transaction":null}
{"before":{"id":107,"ts":7000,"name":"rocks","description":"box of assorted rocks","weight":5.300000190734863},"after":{"id":107,"ts":11000,"name":"rocks","description":"box of assorted rocks","weight":5.099999904632568},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589362099000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":717,"row":0,"thread":2,"query":null},"op":"u","ts_ms":1589362099505,"transaction":null}
{"before":null,"after":{"id":110,"ts":12000,"name":"jacket","description":"water resistent white wind breaker","weight":0.20000000298023224},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589362210000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":1068,"row":0,"thread":2,"query":null},"op":"c","ts_ms":1589362210230,"transaction":null}
{"before":null,"after":{"id":111,"ts":13000,"name":"scooter","description":"Big 2-wheel scooter ","weight":5.179999828338623},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589362243000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":1394,"row":0,"thread":2,"query":null},"op":"c","ts_ms":1589362243428,"transaction":null}
{"before":{"id":110,"ts":12000,"name":"jacket","description":"water resistent white wind breaker","weight":0.20000000298023224},"after":{"id":110,"ts":14000,"name":"jacket","description":"new water resistent white wind breaker","weight":0.5},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589362293000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":1707,"row":0,"thread":2,"query":null},"op":"u","ts_ms":1589362293539,"transaction":null}
{"before":{"id":111,"ts":13000,"name":"scooter","description":"Big 2-wheel scooter ","weight":5.179999828338623},"after":{"id":111,"ts":15000,"name":"scooter","description":"Big 2-wheel scooter ","weight":5.170000076293945},"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589362330000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":2090,"row":0,"thread":2,"query":null},"op":"u","ts_ms":1589362330904,"transaction":null}
{"before":{"id":111,"ts":16000,"name":"scooter","description":"Big 2-wheel scooter ","weight":5.170000076293945},"after":null,"source":{"version":"1.1.1.Final","connector":"mysql","name":"dbserver1","ts_ms":1589362344000,"snapshot":"false","db":"inventory","table":"products","server_id":223344,"gtid":null,"file":"mysql-bin.000003","pos":2443,"row":0,"thread":2,"query":null},"op":"d","ts_ms":1589362344455,"transaction":null}

Create a table through the Flink SQL Client to read the CDC data file:

Flink SQL> CREATE TABLE debezium_source(
>   id INT NOT NULL,
>   ts BIGINT,
>   name STRING,
>   description STRING,
>   weight DOUBLE
> ) WITH (
>   'connector' = 'filesystem',
>   'path' = '/Users/chenyuzhao/workspace/hudi-demo/source.data',
>   'format' = 'debezium-json'
> );
[INFO] Execute statement succeed.

Run a SELECT and observe the result: there are 20 records in total, with some UPDATEs in the middle, and the last message is a DELETE.

Flink SQL> select * from debezium_source;
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
| op |          id |                   ts |                           name |                    description |                         weight |
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
| +I |         101 |                 1000 |                        scooter |          Small 2-wheel scooter |              3.140000104904175 |
| +I |         102 |                 2000 |                    car battery |                12V car battery |              8.100000381469727 |
| +I |         103 |                 3000 |             12-pack drill bits | 12-pack of drill bits with ... |              0.800000011920929 |
| +I |         104 |                 4000 |                         hammer |        12oz carpenter's hammer |                           0.75 |
| +I |         105 |                 5000 |                         hammer |        14oz carpenter's hammer |                          0.875 |
| +I |         106 |                 6000 |                         hammer |        16oz carpenter's hammer |                            1.0 |
| +I |         107 |                 7000 |                          rocks |          box of assorted rocks |              5.300000190734863 |
| +I |         108 |                 8000 |                         jacket | water resistent black wind ... |            0.10000000149011612 |
| +I |         109 |                 9000 |                     spare tire |             24 inch spare tire |             22.200000762939453 |
| -U |         106 |                 6000 |                         hammer |        16oz carpenter's hammer |                            1.0 |
| +U |         106 |                10000 |                         hammer |          18oz carpenter hammer |                            1.0 |
| -U |         107 |                 7000 |                          rocks |          box of assorted rocks |              5.300000190734863 |
| +U |         107 |                11000 |                          rocks |          box of assorted rocks |              5.099999904632568 |
| +I |         110 |                12000 |                         jacket | water resistent white wind ... |            0.20000000298023224 |
| +I |         111 |                13000 |                        scooter |           Big 2-wheel scooter  |              5.179999828338623 |
| -U |         110 |                12000 |                         jacket | water resistent white wind ... |            0.20000000298023224 |
| +U |         110 |                14000 |                         jacket | new water resistent white w... |                            0.5 |
| -U |         111 |                13000 |                        scooter |           Big 2-wheel scooter  |              5.179999828338623 |
| +U |         111 |                15000 |                        scooter |           Big 2-wheel scooter  |              5.170000076293945 |
| -D |         111 |                16000 |                        scooter |           Big 2-wheel scooter  |              5.170000076293945 |
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
Received a total of 20 rows

Create the Hudi table. Here the table type is set to MERGE_ON_READ and changelog mode is turned on with the changelog.enabled option:

Flink SQL> CREATE TABLE hoodie_table(
>   id INT NOT NULL PRIMARY KEY NOT ENFORCED,
>   ts BIGINT,
>   name STRING,
>   description STRING,
>   weight DOUBLE
> ) WITH (
>   'connector' = 'hudi',
>   'path' = '/Users/chenyuzhao/workspace/hudi-demo/t1',
>   'table.type' = 'MERGE_ON_READ',
>   'changelog.enabled' = 'true',
>   'compaction.async.enabled' = 'false'
> );
[INFO] Execute statement succeed.

Query

Import the data into Hudi with an INSERT statement, turn on streaming read mode, and run a query to observe the result:
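The transcript does not include the INSERT statement or the session settings it relies on. A minimal sketch follows, assuming the `debezium_source` and `hoodie_table` tables created earlier. The exact statements the author ran are not shown in the source, so treat these as an assumption.

```sql
-- Print results in tableau mode, matching the output format shown in this demo.
SET 'sql-client.execution.result-mode' = 'tableau';

-- Allow /*+ OPTIONS(...) */ table hints, assumed to be off by default in Flink 1.13.
SET 'table.dynamic-table-options.enabled' = 'true';

-- Write the CDC records from the file source into the Hudi table.
INSERT INTO hoodie_table SELECT * FROM debezium_source;
```

The INSERT is submitted as a job of its own; the queries in this section should be run after it has finished writing.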

Flink SQL> select * from hoodie_table/*+ OPTIONS('read.streaming.enabled'='true')*/;
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
| op |          id |                   ts |                           name |                    description |                         weight |
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
| +I |         101 |                 1000 |                        scooter |          Small 2-wheel scooter |              3.140000104904175 |
| +I |         102 |                 2000 |                    car battery |                12V car battery |              8.100000381469727 |
| +I |         103 |                 3000 |             12-pack drill bits | 12-pack of drill bits with ... |              0.800000011920929 |
| +I |         104 |                 4000 |                         hammer |        12oz carpenter's hammer |                           0.75 |
| +I |         105 |                 5000 |                         hammer |        14oz carpenter's hammer |                          0.875 |
| +I |         106 |                 6000 |                         hammer |        16oz carpenter's hammer |                            1.0 |
| +I |         107 |                 7000 |                          rocks |          box of assorted rocks |              5.300000190734863 |
| +I |         108 |                 8000 |                         jacket | water resistent black wind ... |            0.10000000149011612 |
| +I |         109 |                 9000 |                     spare tire |             24 inch spare tire |             22.200000762939453 |
| -U |         106 |                 6000 |                         hammer |        16oz carpenter's hammer |                            1.0 |
| +U |         106 |                10000 |                         hammer |          18oz carpenter hammer |                            1.0 |
| -U |         107 |                 7000 |                          rocks |          box of assorted rocks |              5.300000190734863 |
| +U |         107 |                11000 |                          rocks |          box of assorted rocks |              5.099999904632568 |
| +I |         110 |                12000 |                         jacket | water resistent white wind ... |            0.20000000298023224 |
| +I |         111 |                13000 |                        scooter |           Big 2-wheel scooter  |              5.179999828338623 |
| -U |         110 |                12000 |                         jacket | water resistent white wind ... |            0.20000000298023224 |
| +U |         110 |                14000 |                         jacket | new water resistent white w... |                            0.5 |
| -U |         111 |                13000 |                        scooter |           Big 2-wheel scooter  |              5.179999828338623 |
| +U |         111 |                15000 |                        scooter |           Big 2-wheel scooter  |              5.170000076293945 |
| -D |         111 |                16000 |                        scooter |           Big 2-wheel scooter  |              5.170000076293945 |

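As shown, Hudi retains the change records of every row, including the changelog operation type. Note that the TABLE HINTS feature is enabled here so that table options can be set dynamically.

Next, run the query in batch read mode and observe the output: the intermediate changes have been merged.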

Flink SQL> select * from hoodie_table;
2021-08-20 20:51:25,052 INFO  org.apache.hadoop.conf.Configuration.deprecation             [] - mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
| op |          id |                   ts |                           name |                    description |                         weight |
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
| +U |         110 |                14000 |                         jacket | new water resistent white w... |                            0.5 |
| +I |         101 |                 1000 |                        scooter |          Small 2-wheel scooter |              3.140000104904175 |
| +I |         102 |                 2000 |                    car battery |                12V car battery |              8.100000381469727 |
| +I |         103 |                 3000 |             12-pack drill bits | 12-pack of drill bits with ... |              0.800000011920929 |
| +I |         104 |                 4000 |                         hammer |        12oz carpenter's hammer |                           0.75 |
| +I |         105 |                 5000 |                         hammer |        14oz carpenter's hammer |                          0.875 |
| +U |         106 |                10000 |                         hammer |          18oz carpenter hammer |                            1.0 |
| +U |         107 |                11000 |                          rocks |          box of assorted rocks |              5.099999904632568 |
| +I |         108 |                 8000 |                         jacket | water resistent black wind ... |            0.10000000149011612 |
| +I |         109 |                 9000 |                     spare tire |             24 inch spare tire |             22.200000762939453 |
+----+-------------+----------------------+--------------------------------+--------------------------------+--------------------------------+
Received a total of 10 rows

Aggregation

Compute count(*) in bounded source read mode:

Flink SQL> select count (*) from hoodie_table;
+----+----------------------+
| op |               EXPR$0 |
+----+----------------------+
| +I |                    1 |
| -U |                    1 |
| +U |                    2 |
| -U |                    2 |
| +U |                    3 |
| -U |                    3 |
| +U |                    4 |
| -U |                    4 |
| +U |                    5 |
| -U |                    5 |
| +U |                    6 |
| -U |                    6 |
| +U |                    7 |
| -U |                    7 |
| +U |                    8 |
| -U |                    8 |
| +U |                    9 |
| -U |                    9 |
| +U |                   10 |
+----+----------------------+
Received a total of 19 rows

Compute count(*) in streaming read mode:

Flink SQL> select count (*) from hoodie_table/*+OPTIONS('read.streaming.enabled'='true')*/;
+----+----------------------+
| op |               EXPR$0 |
+----+----------------------+
| +I |                    1 |
| -U |                    1 |
| +U |                    2 |
| -U |                    2 |
| +U |                    3 |
| -U |                    3 |
| +U |                    4 |
| -U |                    4 |
| +U |                    5 |
| -U |                    5 |
| +U |                    6 |
| -U |                    6 |
| +U |                    7 |
| -U |                    7 |
| +U |                    8 |
| -U |                    8 |
| +U |                    9 |
| -U |                    9 |
| +U |                    8 |
| -U |                    8 |
| +U |                    9 |
| -U |                    9 |
| +U |                    8 |
| -U |                    8 |
| +U |                    9 |
| -U |                    9 |
| +U |                   10 |
| -U |                   10 |
| +U |                   11 |
| -U |                   11 |
| +U |                   10 |
| -U |                   10 |
| +U |                   11 |
| -U |                   11 |
| +U |                   10 |
| -U |                   10 |
| +U |                   11 |
| -U |                   11 |
| +U |                   10 |

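The results in batch and streaming mode are consistent: both converge to a final count of 10.

The data lake CDC format is still iterating rapidly, and the community is actively pushing it into production scenarios. Readers interested in Hudi scenarios and use cases are welcome to join the community group.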


Reference

[1] https://www.oreilly.com/content/ubers-case-for-incremental-processing-on-hadoop/

[2] https://hudi.apache.org/blog/2021/07/21/streaming-data-lake-platform

