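Developer Academy course study notes for "Realtime Compute for Apache Flink Intermediate Course: How to Build a Real-Time Data Warehouse End to End with Realtime Compute (Part 1) + (Part 2)". The notes follow the course closely to help readers learn the material quickly.
Course URL: https://developer.aliyun.com/learning/course/806/detail/13883
How to Build a Real-Time Data Warehouse End to End with Realtime Compute (Part 1) + (Part 2)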
I. Ververica Platform in Action
A Practical Demo of Building Streaming Data Pipelines with Apache Flink SQL on Top of Ververica Platform
1. Pain Points of Building Streaming Pipelines
Dev & Troubleshooting Efficiency
· Juggling message queues, engines & data stores
· Lack of a REPL (read-eval-print loop) environment for testing a single snippet of code
· Each pipeline depends on the readiness of its upstream pipelines
Performance Tuning & Rescaling Flink Applications
· Unpredictable data spikes
Flexibility of Scaling the Cluster
· High maintenance cost & hard to scale
2. Why Ververica Platform
Highly Integrated with Apache Flink
· Keeps pace with Flink's new features, e.g. support for using Hive Metastore as an external catalog in VVP (since v2.3.0) and for deploying session clusters (since v2.3.1)
· Out-of-the-box support for Apache Flink's connectors with enhanced functionality
·Optimized Flink runtime brings significant performance boosts
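As a rough illustration of the Hive Metastore point above, the following is a minimal sketch in open-source Flink SQL of registering a Hive catalog. The catalog name `my_hive`, the database name, and the configuration path are hypothetical placeholders, and the exact options accepted on Ververica Platform may differ from open-source Flink.

```sql
-- Register a Hive Metastore-backed catalog (open-source Flink SQL syntax).
-- 'my_hive', 'default' and '/opt/hive-conf' are placeholder values;
-- the directory is assumed to contain a valid hive-site.xml.
CREATE CATALOG my_hive WITH (
    'type' = 'hive',
    'default-database' = 'default',
    'hive-conf-dir' = '/opt/hive-conf'
);

-- Switch to the new catalog and list the tables it exposes.
USE CATALOG my_hive;
SHOW TABLES;
```

Once the catalog is registered, tables defined in the Hive Metastore can be referenced directly in queries without redeclaring their schemas.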
3. Why Ververica Platform
Easy-to-use Features
· Preview SQL script results at any time
· Sample & mock data to debug easily
· DDL templates, SQL auto-suggestions & script visualization to speed up development
· "Autopilot" tunes job performance toward the optimum within the specified resource usage
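To make the mock-data idea concrete, here is a minimal sketch that uses open-source Flink's `datagen` connector as a stand-in source. The course does not say how the platform generates mock data internally, so this is only an assumption-based illustration; the table name `mock_orders` and its columns are hypothetical.

```sql
-- A mock source table backed by Flink's built-in 'datagen' connector.
-- Table and column names are illustrative only.
CREATE TABLE mock_orders (
    order_id BIGINT,
    user_id  INT,
    amount   DOUBLE
) WITH (
    'connector' = 'datagen',
    'rows-per-second' = '5',
    -- order_id counts up from 1 to 1000, then the source finishes
    'fields.order_id.kind' = 'sequence',
    'fields.order_id.start' = '1',
    'fields.order_id.end' = '1000',
    -- user_id and amount are random values within the given bounds
    'fields.user_id.min' = '1',
    'fields.user_id.max' = '100',
    'fields.amount.min' = '1',
    'fields.amount.max' = '500'
);

-- A query that can be previewed against the mock source while debugging.
SELECT user_id, SUM(amount) AS total_amount
FROM mock_orders
GROUP BY user_id;
```

Because the source is synthetic, a script like this can be tested without waiting for upstream pipelines to be ready, which addresses the pain point listed in section 1.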
4. Why Ververica Platform
Scaling Cluster As Needed
· Serverless architecture on top of Alibaba Cloud
· Pay-as-you-go & Save-when-you-reserve
How to Build a Real-Time Data Warehouse End to End with Realtime Compute (Part 2)
II. Demo Introduction
· Overview of Pipeline Construction
· Source/Dim/Sink Preparation
1. Hive Sink Preparation
2. RDS Dim & Sink Preparation