Flink-01 介绍Flink Java 3分钟上手 HelloWorld 和 Stream ExecutionEnvironment DataSet FlatMapFunction

本文涉及的产品
实时计算 Flink 版,5000CU*H 3个月
简介: Flink-01 介绍Flink Java 3分钟上手 HelloWorld 和 Stream ExecutionEnvironment DataSet FlatMapFunction

代码仓库

会同步代码到 GitHub

https://github.com/turbo-duck/flink-demo

基本介绍

Flink介绍

Apache Flink 是一个开源的流处理框架,旨在处理批处理和实时数据处理,具有高吞吐量和低延迟的特点。由 Apache 软件基金会开发,Flink 以其强大的流分析、复杂事件处理和数据驱动应用而闻名。

Flink特点

流处理:Flink 将批处理视为流处理的一种特殊情况。这种方法允许实时数据处理,实现即时的洞察和行动。

有状态计算:Flink 提供强大的状态管理,使得在处理流的过程中可以保持状态。这一特性对于需要容错和一致性的应用至关重要。

事件时间处理:Flink 允许用户基于事件时间来处理数据,即使数据无序到达,也能提供准确及时的结果。

容错性:Flink 的状态管理和检查点机制确保系统在出现故障时能够恢复而不丢失状态,维护数据完整性和应用一致性。

高吞吐量和低延迟:Flink 的架构优化了高吞吐量和低延迟,适合高性能应用。

可扩展性:Flink 可以扩展到数千个节点,能够处理大规模数据处理任务。

灵活的部署选项:Flink 可以部署在各种环境中,包括独立集群、云环境和容器编排平台(如 Kubernetes)。

Flink场景

实时分析:Flink 非常适合需要实时数据处理和分析的应用,例如监控和告警系统。

数据管道处理:Flink 可以用于构建数据管道,实时处理和转换数据,确保数据始终是最新的。

事件驱动应用:需要实时响应事件的应用,例如欺诈检测系统和推荐引擎,可以利用 Flink 的实时处理能力。

复杂事件处理:Flink 能够处理复杂事件处理场景,需要关联和分析多个事件。

Flink架构

Job Manager:Job Manager 负责管理 Flink 应用的生命周期,包括调度任务、协调检查点和故障恢复。

Task Managers:Task Managers 是执行实际数据处理任务的工作节点,管理任务执行和资源分配。

客户端:客户端用于将 Flink 任务提交给 Job Manager,可以是命令行界面、网页界面或 API。

状态后台:状态后台存储 Flink 应用的状态,支持不同的存储选项,包括内存、文件系统和分布式存储系统(如 Apache Hadoop 和 Amazon S3)。

检查点和保存点:Flink 使用检查点定期快照应用的状态。保存点是手动触发的快照,可用于版本控制和回滚。

初始工程

建立一个新的JavaMaven工程

pom.xml

https://mvnrepository.com/search?q=flink-java

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>flink-demo-01</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <flink.version>1.13.2</flink.version> <!-- 确保版本号正确 -->
        <scala.binary.version>2.12</scala.binary.version> <!-- 确保Scala版本正确 -->
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>
</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
    </dependencies>
</project>

Demo01.java

package icu.wzk.demo01;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.util.Collector;
import org.apache.flink.api.java.tuple.Tuple2;


public class StartApp {

    public static void main(String[] args) throws Exception {
        String inPath = "demo01/file.txt";
        String outputPath = "demo01/result.csv";
        ExecutionEnvironment executionEnvironment = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<String> text = executionEnvironment.readTextFile(inPath);
        DataSet<Tuple2<String, Integer>> dataSet = text
                .flatMap(new LineSplitter())
                .groupBy(0)
                .sum(1);
        dataSet
                .writeAsCsv(outputPath,"\n"," ", FileSystem.WriteMode.OVERWRITE)
                .setParallelism(1);
        executionEnvironment.execute("file.txt -> result.csv");
    }

    static class LineSplitter implements FlatMapFunction<String, Tuple2<String,Integer>> {
        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> collector) throws Exception {
            for (String word : line.split(" ")) {
                collector.collect(new Tuple2<>(word,1));
            }
        }
    }
}

文本样例

这里提供了一个简单的文本供大家测试,详细的目录信息如下图:

文本的内容如下:

The narrator introduces himself as a man who learned when he was a child that adults lack imagination and understanding. He is now a pilot who has crash-landed in a desert. He encounters a small boy who asks him for a drawing of a sheep, and the narrator obliges. The narrator, who calls the child the little prince, learns that the boy comes from a very small planet, which the narrator believes to be asteroid B-612. Over the course of the next few days, the little prince tells the narrator about his life. On his asteroid-planet, which is no bigger than a house, the prince spends his time pulling up baobab seedlings, lest they grow big enough to engulf the tiny planet. One day an anthropomorphic rose grows on the planet, and the prince loves her with all his heart. However, her vanity and demands become too much for the prince, and he leaves.
The prince travels to a series of asteroids, each featuring a grown-up who has been reduced to a function. The first is a king who requires obedience but has no subjects until the arrival of the prince. The sole inhabitant of the next planet is a conceited man who wants nothing from the prince but flattery. The prince subsequently meets a drunkard, who explains that he must drink to forget how ashamed he is of drinking. The fourth planet introduces the prince to a businessman, who maintains that he owns the stars, whih makes it very important that he know exactly how many stars there are. The prince then encounters a lamplighter, who follows orders that require him to light a lamp each evening and put it out each morning, even though his planet spins so fast that dusk and dawn both occur once every minute. Finally the prince comes to a planet inhabited by a geographer. The geographer, however, knows nothing of his own planet, because it is his sole function to record what he learns from explorers. He asks the prince to describe his home planet, but when the prince mentions the flower, the geographer says that flowers are not recorded because they are ephemeral. The geographer recommends that the little prince visit Earth.
On Earth the prince meets a snake, who says that he can return him to his home, and a flower, who tells him that people lack roots. He comes across a rose garden, and he finds it very depressing to learn that his beloved rose is not, as she claimed, unique in the universe. A fox then tells him that if he tames the fox—that is, establishes ties with the fox—then they will be unique and a source of joy to each other.
The narrator and little prince have now spent eight days in the desert and have run out of water. The two then traverse the desert in search of a well, which, miraculously, they find. The little prince tells the narrator that he plans to return that night to his planet and flower and that now the stars will be meaningful to the narrator, because he will know that his friend is living on one of them. Returning to his planet requires allowing the poisonous snake to bite him. The story resumes six years later. The narrator says that the prince’s body was missing in the morning, so he knows that he returned to his planet, and he wonders whether the sheep that he drew him ate his flower. He ends by imploring the reader to contact him if they ever spot the little prince.

运行 demo01 结果(部分)

and 15
are. 1
baobab 1
be 3
bigger 1
body 1
calls 1
days, 1
demands 1
describe 1
drinking. 1
drunkard, 1
enough 1
featuring 1
function 1
know 2
knows 2
learns 2
little 6

相关实践学习
基于Hologres+Flink搭建GitHub实时数据大屏
通过使用Flink、Hologres构建实时数仓,并通过Hologres对接BI分析工具(以DataV为例),实现海量数据实时分析.
实时计算 Flink 实战课程
如何使用实时计算 Flink 搞定数据处理难题?实时计算 Flink 极客训练营产品、技术专家齐上阵,从开源 Flink功能介绍到实时计算 Flink 优势详解,现场实操,5天即可上手! 欢迎开通实时计算 Flink 版: https://cn.aliyun.com/product/bigdata/sc Flink Forward Asia 介绍: Flink Forward 是由 Apache 官方授权,Apache Flink Community China 支持的会议,通过参会不仅可以了解到 Flink 社区的最新动态和发展计划,还可以了解到国内外一线大厂围绕 Flink 生态的生产实践经验,是 Flink 开发者和使用者不可错过的盛会。 去年经过品牌升级后的 Flink Forward Asia 吸引了超过2000人线下参与,一举成为国内最大的 Apache 顶级项目会议。结合2020年的特殊情况,Flink Forward Asia 2020 将在12月26日以线上峰会的形式与大家见面。
目录
相关文章
|
7天前
|
存储 Java API
Java Stream API:现代数据处理之道
Java Stream API:现代数据处理之道
173 92
|
7天前
|
存储 Java API
Java Stream API:现代数据处理之道
Java Stream API:现代数据处理之道
125 68
|
2月前
|
Oracle Java 关系型数据库
掌握Java Stream API:高效集合处理的利器
掌握Java Stream API:高效集合处理的利器
328 80
|
2月前
|
安全 Java API
Java 8 Stream API:高效集合处理的利器
Java 8 Stream API:高效集合处理的利器
217 83
|
3月前
|
SQL JSON 安全
Java 8 + 中 Lambda 表达式与 Stream API 的应用解析
摘要:本文介绍了Java 8+核心新特性,包括Lambda表达式与Stream API的集合操作(如过滤统计)、函数式接口的自定义实现、Optional类的空值安全处理、接口默认方法与静态方法的扩展能力,以及Java 9模块化系统的组件管理。每个特性均配有典型应用场景和代码示例,如使用Stream统计字符串长度、Optional处理Map取值、模块化项目的依赖声明等,帮助开发者掌握现代Java的高效编程范式。(150字)
54 1
|
3月前
|
存储 Java 大数据
Java代码优化:for、foreach、stream使用法则与性能比较
总结起来,for、foreach和stream各自都有其适用性和优势,在面对不同的情况时,有意识的选择更合适的工具,能帮助我们更好的解决问题。记住,没有哪个方法在所有情况下都是最优的,关键在于理解它们各自的特性和适用场景。
329 23
|
2月前
|
SQL 人工智能 Rust
Java 开发中Stream的toMap与Map 使用技巧
本文深入解析了 Java 中 `toMap()` 方法的三大问题:重复键抛出异常、`null` 值带来的风险以及并行流中的性能陷阱,并提供了多种替代方案,如使用 `groupingBy`、`toConcurrentMap` 及自定义收集器,帮助开发者更安全高效地进行数据处理。
128 0
|
3月前
|
Java 调度 流计算
基于Java 17 + Spring Boot 3.2 + Flink 1.18的智慧实验室管理系统核心代码
这是一套基于Java 17、Spring Boot 3.2和Flink 1.18开发的智慧实验室管理系统核心代码。系统涵盖多协议设备接入(支持OPC UA、MQTT等12种工业协议)、实时异常检测(Flink流处理引擎实现设备状态监控)、强化学习调度(Q-Learning算法优化资源分配)、三维可视化(JavaFX与WebGL渲染实验室空间)、微服务架构(Spring Cloud构建分布式体系)及数据湖建设(Spark构建实验室数据仓库)。实际应用中,该系统显著提升了设备调度效率(响应时间从46分钟降至9秒)、设备利用率(从41%提升至89%),并大幅减少实验准备时间和维护成本。
219 0
|
9月前
|
存储 Java 数据挖掘
Java 8 新特性之 Stream API:函数式编程风格的数据处理范式
Java 8 引入的 Stream API 提供了一种新的数据处理方式,支持函数式编程风格,能够高效、简洁地处理集合数据,实现过滤、映射、聚合等操作。
258 6
|
9月前
|
Rust 安全 Java
Java Stream 使用指南
本文介绍了Java中Stream流的使用方法,包括如何创建Stream流、中间操作(如map、filter、sorted等)和终结操作(如collect、forEach等)。此外,还讲解了并行流的概念及其可能带来的线程安全问题,并给出了示例代码。
657 0

热门文章

最新文章