193 DStream Operations - Output Operations on DStreams

Output operations write a DStream's data out to an external database or file system. Only when an output operation primitive is invoked (just like an action on an RDD) does the streaming program actually start the computation.

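A minimal sketch of this behavior, assuming a local socket text stream on localhost:9999 (the hostname, port, and app name are illustrative): the flatMap/map/reduceByKey transformations only define the lineage, and it is the print() output operation that makes each batch actually execute.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object OutputOpDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("OutputOpDemo")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Transformations alone only define the DStream lineage; nothing runs yet.
    val lines = ssc.socketTextStream("localhost", 9999)
    val wordCounts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

    // The output operation is what triggers execution of each batch,
    // just like an action on an RDD.
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```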
Output Operation | Meaning
print() | Prints the first ten elements of every batch of data in a DStream on the driver node running the streaming application. This is useful for development and debugging.
saveAsTextFiles(prefix, [suffix]) | Save this DStream's contents as text files. The file name at each batch interval is generated based on prefix and suffix: "prefix-TIME_IN_MS[.suffix]".
saveAsObjectFiles(prefix, [suffix]) | Save this DStream's contents as SequenceFiles of serialized Java objects. The file name at each batch interval is generated based on prefix and suffix: "prefix-TIME_IN_MS[.suffix]".
saveAsHadoopFiles(prefix, [suffix]) | Save this DStream's contents as Hadoop files. The file name at each batch interval is generated based on prefix and suffix: "prefix-TIME_IN_MS[.suffix]".
foreachRDD(func) | The most generic output operator, which applies a function, func, to each RDD generated from the stream. The function should push the data in each RDD to an external system, such as saving the RDD to files, or writing it over the network to a database. Note that func is executed in the driver process running the streaming application, and will usually have RDD actions in it that force the computation of the streaming RDDs.
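A sketch of the foreachRDD pattern: the closure passed to foreachRDD runs on the driver, while the work inside foreachPartition runs on the executors, so connections are created per partition where the data lives rather than serialized from the driver. ConnectionPool and its getConnection/returnConnection/send calls are hypothetical placeholders for whatever client your external system uses; wordCounts is assumed to be a DStream like the one in the earlier sketch.

```scala
// Hypothetical sketch: ConnectionPool stands in for your database client.
wordCounts.foreachRDD { rdd =>
  // This outer closure runs on the driver; foreachPartition is an RDD action,
  // so it forces the computation of this batch's RDD on the executors.
  rdd.foreachPartition { partitionOfRecords =>
    val connection = ConnectionPool.getConnection()          // hypothetical helper
    partitionOfRecords.foreach(record => connection.send(record)) // hypothetical send
    ConnectionPool.returnConnection(connection)               // return for reuse
  }
}
```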