193 DStream Operations - Output Operations on DStreams

2023-11-01

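Output operations write a DStream's data out to external systems such as a database or a file system. As with RDD actions, the streaming program only starts its actual computation once an output operation has been invoked; transformations alone do not trigger any work.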
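The lazy behaviour described above can be sketched with a toy model in plain Python. This is not Spark's API: `ToyStream`, `foreach_batch`, and `traced_double` are hypothetical names made up for illustration, and the model simplifies things by running the work as soon as the output operation is called (in Spark Streaming the streaming context also has to be started before batches are processed).

```python
from typing import Callable, Iterable, List, Optional


class ToyStream:
    """Toy stand-in for a DStream: records transformations, runs them only on output."""

    def __init__(self, batches: Iterable[List[int]],
                 steps: Optional[List[Callable[[int], int]]] = None):
        self._batches = list(batches)
        self._steps = list(steps or [])

    def map(self, fn: Callable[[int], int]) -> "ToyStream":
        # Transformation: only remembers fn; nothing is computed here.
        return ToyStream(self._batches, self._steps + [fn])

    def foreach_batch(self, sink: Callable[[List[int]], None]) -> None:
        # Output operation: the recorded steps actually run at this point.
        for batch in self._batches:
            for fn in self._steps:
                batch = [fn(x) for x in batch]
            sink(batch)


calls: List[int] = []  # every element the mapped function actually touches


def traced_double(x: int) -> int:
    calls.append(x)
    return 2 * x


stream = ToyStream([[1, 2], [3]]).map(traced_double)
assert calls == []  # no output operation yet, so nothing has been computed

results: List[List[int]] = []
stream.foreach_batch(results.append)  # output operation triggers the work
assert calls == [1, 2, 3]
assert results == [[2, 4], [6]]
```

The first assertion is the point of the sketch: after `map` the traced function has not been called at all, and it only runs once an output operation consumes the stream.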

Output Operation	Meaning
print()	Prints the first ten elements of every batch of data in a DStream on the driver node running the streaming application. This is useful for development and debugging.
saveAsTextFiles(prefix, [suffix])	Save this DStream’s contents as text files. The file name at each batch interval is generated based on prefix and suffix: “prefix-TIME_IN_MS[.suffix]”.
saveAsObjectFiles(prefix, [suffix])	Save this DStream’s contents as SequenceFiles of serialized Java objects. The file name at each batch interval is generated based on prefix and suffix: “prefix-TIME_IN_MS[.suffix]”.
saveAsHadoopFiles(prefix, [suffix])	Save this DStream’s contents as Hadoop files. The file name at each batch interval is generated based on prefix and suffix: “prefix-TIME_IN_MS[.suffix]”.
foreachRDD(func)	The most generic output operator that applies a function, func, to each RDD generated from the stream. This function should push the data in each RDD to an external system, such as saving the RDD to files, or writing it over the network to a database. Note that the function func is executed in the driver process running the streaming application, and will usually have RDD actions in it that will force the computation of the streaming RDDs.
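
The naming rule "prefix-TIME_IN_MS[.suffix]" used by the saveAs* operations in the table can be illustrated with a small plain-Python sketch. The helper `batch_output_name`, the HDFS path, the start timestamp, and the 5-second batch interval are all assumptions chosen for the example; none of them is part of Spark's API.

```python
from typing import Optional


def batch_output_name(prefix: str, time_ms: int, suffix: Optional[str] = None) -> str:
    """Build "prefix-TIME_IN_MS[.suffix]" for one batch (hypothetical helper)."""
    name = f"{prefix}-{time_ms}"
    # The ".suffix" part is appended only when a non-empty suffix is given.
    return f"{name}.{suffix}" if suffix else name


# Three consecutive batches, assuming a 5-second batch interval.
start_ms = 1_700_000_000_000  # made-up batch time in milliseconds
names = [
    batch_output_name("hdfs://namenode/out/wc", start_ms + i * 5_000, "txt")
    for i in range(3)
]
for name in names:
    print(name)
```

Because the batch time is part of the name, every batch interval writes to a new location instead of overwriting the previous one. With saveAsTextFiles, each generated name is an output directory holding part files rather than a single file, so a short batch interval can produce a large number of small outputs.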

阿甘兄
