StreamingPro


Declarative workflows for building Spark Streaming applications

Spark Streaming
Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark is an extensible and programmable framework for massively distributed processing of datasets, which it represents as Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which are then processed by the Spark engine to generate the results. Spark Streaming data is organized into DStreams, each represented internally as a sequence of RDDs.
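
The batching model described above can be illustrated with a small conceptual sketch in plain Python. It does not use the Spark API: the names `micro_batches`, `word_count`, and `run` are made up for illustration, and batches are cut by record count, whereas Spark Streaming cuts them by time interval. Each list yielded by `micro_batches` plays the role of one RDD in a DStream.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def micro_batches(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Cut a (possibly unbounded) stream into consecutive batches.

    Simplification: batches are delimited by record count here,
    while Spark Streaming delimits them by a batch interval.
    """
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def word_count(batch: List[str]) -> Dict[str, int]:
    """Process a single batch: count the words in its lines."""
    counts: Dict[str, int] = {}
    for line in batch:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def run(stream: Iterable[str], batch_size: int) -> List[Dict[str, int]]:
    """Apply the same computation to every batch, one result per batch."""
    return [word_count(batch) for batch in micro_batches(stream, batch_size)]


if __name__ == "__main__":
    for result in run(["a b", "a", "c"], batch_size=2):
        print(result)
```

The point of the sketch is that the per-batch function is ordinary batch code; the streaming layer only decides where one batch ends and the next begins.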

StreamingPro

StreamingPro is not a complete application, but rather an extensible and programmable framework for Spark Streaming (plain Spark is also supported, and Storm support is planned) that can easily be used to build your streaming application.
StreamingPro also makes it possible to build a streaming program simply by assembling components (e.g. a SQL component) in a configuration file.
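
To make the idea of assembling components from configuration concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the `WORKFLOW` layout, the `filter` and `select` component names, and the `build` function are assumptions made for illustration and are not StreamingPro's actual configuration schema or API.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Stage = Callable[[List[Record]], List[Record]]


def make_filter(params: Dict[str, Any]) -> Stage:
    """Hypothetical component: keep records whose field equals a value."""
    field, value = params["field"], params["equals"]
    return lambda batch: [r for r in batch if r.get(field) == value]


def make_select(params: Dict[str, Any]) -> Stage:
    """Hypothetical component: project each record onto the listed fields."""
    fields = params["fields"]
    return lambda batch: [{f: r.get(f) for f in fields} for r in batch]


# Registry mapping component names (as used in the config) to factories.
COMPONENTS: Dict[str, Callable[[Dict[str, Any]], Stage]] = {
    "filter": make_filter,
    "select": make_select,
}

# Hypothetical declarative workflow: data only, no code.
WORKFLOW = {
    "name": "errors-only",
    "steps": [
        {"component": "filter", "params": {"field": "level", "equals": "ERROR"}},
        {"component": "select", "params": {"fields": ["host", "msg"]}},
    ],
}


def build(workflow: Dict[str, Any]) -> Stage:
    """Turn a workflow description into a function applied to each batch."""
    stages = [COMPONENTS[s["component"]](s["params"]) for s in workflow["steps"]]

    def pipeline(batch: List[Record]) -> List[Record]:
        for stage in stages:
            batch = stage(batch)
        return batch

    return pipeline


if __name__ == "__main__":
    pipeline = build(WORKFLOW)
    print(pipeline([
        {"level": "ERROR", "host": "h1", "msg": "disk full"},
        {"level": "INFO", "host": "h2", "msg": "started"},
    ]))
```

In a design like this, changing the behavior of the job means editing the workflow description rather than the program, which is the property the declarative approach is after.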

Features

  • Pure Spark Streaming (or plain Spark) programs (Storm in the future)
  • No coding required, only declarative workflows
  • REST API for interactive use
  • SQL-oriented workflow support
  • Data streamed in continuously and processed in near real time
  • Dynamic CRUD of workflows at runtime via the REST API
  • Flexible workflows (inputs, outputs, parsers, etc.)
  • High performance
  • Scalable

Documents

Architecture

Snip20160510_3.png

Declarative workflows

Snip20160510_4.png

Implementation

Snip20160510_1.png