
StreamingPro


Declarative workflows for building Spark Streaming applications

Spark Streaming
Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark itself is an extensible and programmable framework for massively distributed processing of datasets, which it represents as Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which are then processed by the Spark engine to generate the results. The streamed data is organized into DStreams, each represented internally as a sequence of RDDs.
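To make the batching model concrete, here is a minimal sketch in plain Python. It does not use the Spark API: the names micro_batches and word_counts are made up for illustration, each yielded list merely stands in for one RDD, and the sequence of lists stands in for a DStream.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (arrival time in seconds, payload); a stand-in for records from a real source.
Record = Tuple[float, str]


def micro_batches(records: Iterable[Record], batch_interval: float) -> Iterator[List[str]]:
    """Group time-ordered records into consecutive fixed-length intervals.

    Intervals in which nothing arrived yield an empty list, so the consumer
    still sees one batch per interval.
    """
    batch: List[str] = []
    window_end = batch_interval
    for arrival, payload in records:
        # Close every interval that ended before this record arrived.
        while arrival >= window_end:
            yield batch
            batch = []
            window_end += batch_interval
        batch.append(payload)
    yield batch  # flush the final, possibly partial, interval


def word_counts(batch: List[str]) -> Dict[str, int]:
    """Per-batch computation, analogous to a job run on a single RDD."""
    counts: Dict[str, int] = {}
    for line in batch:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    stream = [(0.2, "a b"), (0.9, "a"), (1.5, "b c"), (3.1, "c")]
    for index, batch in enumerate(micro_batches(stream, batch_interval=1.0)):
        print(index, word_counts(batch))
```

In real Spark Streaming the batch interval is set when the streaming context is created, and the engine schedules one job per batch; the sketch only mirrors that shape.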

StreamingPro

StreamingPro is not a complete application but an extensible, programmable framework for Spark Streaming (it also covers Spark and Storm) that makes it easy to build your streaming application.
With StreamingPro, building a streaming program comes down to assembling components (e.g. a SQL component) in a configuration file.
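The idea of assembling components from a configuration file can be sketched as follows. This is not StreamingPro's actual configuration format or API: the JSON layout, the component names filter and project, and the helpers component and build_pipeline are all assumptions invented for this illustration.

```python
import json
from typing import Any, Callable, Dict, List

Batch = List[Dict[str, Any]]
Step = Callable[[Batch], Batch]

# Registry of reusable components, keyed by the name used in the config.
# Every name below is hypothetical.
COMPONENTS: Dict[str, Callable[..., Step]] = {}


def component(name: str):
    """Decorator that registers a component factory under a config name."""
    def register(factory: Callable[..., Step]) -> Callable[..., Step]:
        COMPONENTS[name] = factory
        return factory
    return register


@component("filter")
def make_filter(field: str, equals: Any) -> Step:
    # Keep only rows whose field matches the configured value.
    return lambda batch: [row for row in batch if row.get(field) == equals]


@component("project")
def make_project(fields: List[str]) -> Step:
    # Keep only the configured columns of each row.
    return lambda batch: [{f: row.get(f) for f in fields} for row in batch]


def build_pipeline(config_text: str) -> Step:
    """Turn a declarative description into a callable that processes a batch."""
    config = json.loads(config_text)
    steps = [COMPONENTS[s["component"]](**s.get("params", {})) for s in config["steps"]]

    def run(batch: Batch) -> Batch:
        for step in steps:
            batch = step(batch)
        return batch

    return run


# The workflow is described as data; no pipeline code has to be written.
CONFIG = """
{
  "steps": [
    {"component": "filter",  "params": {"field": "level", "equals": "ERROR"}},
    {"component": "project", "params": {"fields": ["host", "msg"]}}
  ]
}
"""

if __name__ == "__main__":
    pipeline = build_pipeline(CONFIG)
    rows = [
        {"host": "h1", "level": "ERROR", "msg": "disk full"},
        {"host": "h2", "level": "INFO", "msg": "ok"},
    ]
    print(pipeline(rows))
```

Changing the workflow then means editing the configuration text rather than the program, which is the property the declarative approach is after.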

Features

  • Pure Spark Streaming (or plain Spark) program (Storm in the future)
  • No coding required, only declarative workflows
  • REST API for interactive use
  • Support for SQL-oriented workflows
  • Data continuously streamed in and processed in near real time
  • Dynamic CRUD of workflows at runtime via the REST API
  • Flexible workflows (input, output, parsers, etc.)
  • High performance
  • Scalable

Documents

Architecture

[Figure: Snip20160510_3.png]

Declarative workflows

[Figure: Snip20160510_4.png]

Implementation

[Figure: Snip20160510_1.png]
