代码仓库
会同步代码到 GitHub
https://github.com/turbo-duck/flink-demo
基本介绍
Flink介绍
Apache Flink 是一个开源的流处理框架,旨在处理批处理和实时数据处理,具有高吞吐量和低延迟的特点。由 Apache 软件基金会开发,Flink 以其强大的流分析、复杂事件处理和数据驱动应用而闻名。
Flink特点
流处理:Flink 将批处理视为流处理的一种特殊情况。这种方法允许实时数据处理,实现即时的洞察和行动。
有状态计算:Flink 提供强大的状态管理,使得在处理流的过程中可以保持状态。这一特性对于需要容错和一致性的应用至关重要。
事件时间处理:Flink 允许用户基于事件时间来处理数据,即使数据无序到达,也能提供准确及时的结果。
容错性:Flink 的状态管理和检查点机制确保系统在出现故障时能够恢复而不丢失状态,维护数据完整性和应用一致性。
高吞吐量和低延迟:Flink 的架构优化了高吞吐量和低延迟,适合高性能应用。
可扩展性:Flink 可以扩展到数千个节点,能够处理大规模数据处理任务。
灵活的部署选项:Flink 可以部署在各种环境中,包括独立集群、云环境和容器编排平台(如 Kubernetes)。
Flink场景
实时分析:Flink 非常适合需要实时数据处理和分析的应用,例如监控和告警系统。
数据管道处理:Flink 可以用于构建数据管道,实时处理和转换数据,确保数据始终是最新的。
事件驱动应用:需要实时响应事件的应用,例如欺诈检测系统和推荐引擎,可以利用 Flink 的实时处理能力。
复杂事件处理:Flink 能够处理复杂事件处理场景,需要关联和分析多个事件。
Flink架构
Job Manager:Job Manager 负责管理 Flink 应用的生命周期,包括调度任务、协调检查点和故障恢复。
Task Managers:Task Managers 是执行实际数据处理任务的工作节点,管理任务执行和资源分配。
客户端:客户端用于将 Flink 任务提交给 Job Manager,可以是命令行界面、网页界面或 API。
状态后台:状态后台存储 Flink 应用的状态,支持不同的存储选项,包括内存、文件系统和分布式存储系统(如 Apache Hadoop 和 Amazon S3)。
检查点和保存点:Flink 使用检查点定期快照应用的状态。保存点是手动触发的快照,可用于版本控制和回滚。
初始工程
建立一个新的JavaMaven工程
pom.xml
https://mvnrepository.com/search?q=flink-java
<?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>org.example</groupId> <artifactId>flink-demo-01</artifactId> <version>1.0-SNAPSHOT</version> <properties> <maven.compiler.source>8</maven.compiler.source> <maven.compiler.target>8</maven.compiler.target> <flink.version>1.13.2</flink.version> <!-- 确保版本号正确 --> <scala.binary.version>2.12</scala.binary.version> <!-- 确保Scala版本正确 --> </properties> <dependencies> <dependency> <groupId>org.apache.flink</groupId> <artifactId> </artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-java_${scala.binary.version}</artifactId> <version>${flink.version}</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-clients_${scala.binary.version}</artifactId> <version>${flink.version}</version> </dependency> </dependencies> </project>
Demo01.java
package icu.wzk.demo01; import org.apache.flink.api.common.functions.FlatMapFunction; import org.apache.flink.api.java.DataSet; import org.apache.flink.api.java.ExecutionEnvironment; import org.apache.flink.core.fs.FileSystem; import org.apache.flink.util.Collector; import org.apache.flink.api.java.tuple.Tuple2; public class StartApp { public static void main(String[] args) throws Exception { String inPath = "demo01/file.txt"; String outputPath = "demo01/result.csv"; ExecutionEnvironment executionEnvironment = ExecutionEnvironment.getExecutionEnvironment(); DataSet<String> text = executionEnvironment.readTextFile(inPath); DataSet<Tuple2<String, Integer>> dataSet = text .flatMap(new LineSplitter()) .groupBy(0) .sum(1); dataSet .writeAsCsv(outputPath,"\n"," ", FileSystem.WriteMode.OVERWRITE) .setParallelism(1); executionEnvironment.execute("file.txt -> result.csv"); } static class LineSplitter implements FlatMapFunction<String, Tuple2<String,Integer>> { @Override public void flatMap(String line, Collector<Tuple2<String, Integer>> collector) throws Exception { for (String word : line.split(" ")) { collector.collect(new Tuple2<>(word,1)); } } } }
文本样例
这里提供了一个简单的文本供大家测试,详细的目录信息如下图:
文本的内容如下:
The narrator introduces himself as a man who learned when he was a child that adults lack imagination and understanding. He is now a pilot who has crash-landed in a desert. He encounters a small boy who asks him for a drawing of a sheep, and the narrator obliges. The narrator, who calls the child the little prince, learns that the boy comes from a very small planet, which the narrator believes to be asteroid B-612. Over the course of the next few days, the little prince tells the narrator about his life. On his asteroid-planet, which is no bigger than a house, the prince spends his time pulling up baobab seedlings, lest they grow big enough to engulf the tiny planet. One day an anthropomorphic rose grows on the planet, and the prince loves her with all his heart. However, her vanity and demands become too much for the prince, and he leaves. The prince travels to a series of asteroids, each featuring a grown-up who has been reduced to a function. The first is a king who requires obedience but has no subjects until the arrival of the prince. The sole inhabitant of the next planet is a conceited man who wants nothing from the prince but flattery. The prince subsequently meets a drunkard, who explains that he must drink to forget how ashamed he is of drinking. The fourth planet introduces the prince to a businessman, who maintains that he owns the stars, whih makes it very important that he know exactly how many stars there are. The prince then encounters a lamplighter, who follows orders that require him to light a lamp each evening and put it out each morning, even though his planet spins so fast that dusk and dawn both occur once every minute. Finally the prince comes to a planet inhabited by a geographer. The geographer, however, knows nothing of his own planet, because it is his sole function to record what he learns from explorers. He asks the prince to describe his home planet, but when the prince mentions the flower, the geographer says that flowers are not recorded because they are ephemeral. The geographer recommends that the little prince visit Earth. On Earth the prince meets a snake, who says that he can return him to his home, and a flower, who tells him that people lack roots. He comes across a rose garden, and he finds it very depressing to learn that his beloved rose is not, as she claimed, unique in the universe. A fox then tells him that if he tames the fox—that is, establishes ties with the fox—then they will be unique and a source of joy to each other. The narrator and little prince have now spent eight days in the desert and have run out of water. The two then traverse the desert in search of a well, which, miraculously, they find. The little prince tells the narrator that he plans to return that night to his planet and flower and that now the stars will be meaningful to the narrator, because he will know that his friend is living on one of them. Returning to his planet requires allowing the poisonous snake to bite him. The story resumes six years later. The narrator says that the prince’s body was missing in the morning, so he knows that he returned to his planet, and he wonders whether the sheep that he drew him ate his flower. He ends by imploring the reader to contact him if they ever spot the little prince.
运行 demo01
结果(部分)
and 15 are. 1 baobab 1 be 3 bigger 1 body 1 calls 1 days, 1 demands 1 describe 1 drinking. 1 drunkard, 1 enough 1 featuring 1 function 1 know 2 knows 2 learns 2 little 6