Big Data Intelligence, Issue 4 (2018-07-16)

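《Lessons from Building an Event-Sourced System with Kafka Streams》At the recent JEEConf conference in Kyiv, Ukraine, Amitay Horwitz described how his team implemented an event-sourced invoicing system, the challenges they ran into over two and a half years of running it in production, and how the team used Kafka Streams to implement a new design.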

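《Democratizing Stream Processing with Apache Kafka and KSQL》Most stream processing technologies require developers to write code in a programming language such as Java or Scala. KSQL is a streaming SQL engine for Apache Kafka that lets you express stream processing tasks as SQL statements instead of writing large amounts of code. KSQL is built on Kafka's Streams API and supports stream processing operations such as filtering, transformation, aggregation, joins, windowing, and sessionization (capturing all stream events that occur within a single session). Use cases for KSQL include real-time reporting and dashboards, monitoring of infrastructure and IoT devices, anomaly detection, and fraud alerting.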
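To make the idea of sessionization concrete, here is a minimal plain-Python sketch. It is not KSQL and does not use any Kafka API; it only illustrates the grouping logic that a session window performs. The event shape `(user_id, timestamp)`, the function name `sessionize`, and the 30-second inactivity gap are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (user_id, timestamp in seconds).
Event = Tuple[str, float]


def sessionize(events: Iterable[Event], gap: float = 30.0) -> Dict[str, List[List[float]]]:
    """Group each user's event timestamps into sessions.

    A new session starts whenever the time since that user's previous
    event exceeds `gap` seconds (an assumed inactivity threshold).
    """
    sessions: Dict[str, List[List[float]]] = defaultdict(list)
    # Sort by timestamp so each user's events are processed in time order.
    for user, ts in sorted(events, key=lambda e: e[1]):
        user_sessions = sessions[user]
        if user_sessions and ts - user_sessions[-1][-1] <= gap:
            user_sessions[-1].append(ts)  # still inside the current session
        else:
            user_sessions.append([ts])    # gap exceeded (or first event): open a new session
    return dict(sessions)


events = [("alice", 0.0), ("bob", 5.0), ("alice", 10.0), ("alice", 100.0), ("bob", 20.0)]
result = sessionize(events, gap=30.0)
print(result)
```

In this sample, the third event for `alice` arrives 90 seconds after her previous one, so it opens a second session, while both of `bob`'s events fall into a single session. A streaming engine does the same grouping incrementally over an unbounded stream rather than over a sorted in-memory list.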

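《GPUs on Google Kubernetes Engine Are Now Generally Available》Google announced the general availability of GPUs in Kubernetes Engine (GKE). Together with the recently released general availability of GKE 1.10, users can run machine learning (ML) workloads on GKE and take advantage of the processing power of GPUs.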

《Announcing the general availability of Azure SQL Data Sync》We are delighted to announce the general availability (GA) of Azure SQL Data Sync! Azure SQL Data Sync allows you to synchronize data between Azure SQL Database and any other SQL endpoints unidirectionally or bidirectionally. It enables hybrid SQL deployment and allows local data access from both Azure and on-premises applications. It also allows you to deploy your data applications globally with a local copy of data in each region, and keep data synchronized across all the regions. It will significantly improve the application response time and reliability by eliminating the impact of network latency and connection failure rate.

《Apache Flink 1.5.1 Released》The Apache Flink community released the first bugfix version of the Apache Flink 1.5 series. This release includes more than 60 fixes and minor improvements for Flink 1.5.0. The release announcement includes a detailed list of all fixes.

《June Preview Release: Packing Confluent Platform with the Features You Requested!》We are very excited to announce the Confluent Platform June 2018 Preview. This is our most feature-packed preview release for Confluent Platform since we started doing our monthly preview releases in April 2018.

《New Azure innovation advances customer success for the cloud- and AI-powered future》Organizations around the world are gearing up for a future powered by the intelligent cloud and AI. As these technologies become increasingly central to business strategy and transformation, Microsoft is committed to delivering cutting-edge innovations, programs and expertise that help our customers navigate these technological and business shifts.

《Azure sets new performance benchmarks with SQL Data Warehouse》As the amount of data grows exponentially, the pressure to quickly harness it for insights to share across the organization also increases rapidly. As Microsoft continues to evolve our analytics portfolio, we are committed to delivering a data warehouse solution that provides a fast, flexible, and secure analytics platform in the cloud.

《Kafka 1.0 on HDInsight lights up real time analytics scenarios》Data engineers love Kafka on HDInsight as a high-throughput, low-latency ingestion platform in their real-time data pipeline. They already leverage Kafka features such as message compression, configurable retention policy, and log compaction. With the release of Apache Kafka 1.0 on HDInsight, customers now get key features that make it easy to implement the most demanding scenarios.

《Using Apache Spark DStreams with Cloud Dataproc and Cloud Pub/Sub》Apache Spark offers two APIs for streaming: the original Discretized Streams API, or DStreams, and the more recent Structured Streaming API, which was released as an alpha in Spark 2.0 and as a stable release in Spark 2.2. While Structured Streaming offers several new, important features like event time operations and the Datasets and DataFrames abstractions, it also has some limitations. For example, Structured Streaming does not yet support operations such as sorting or multiple streaming aggregations.

《A one size fits all database doesn't fit anyone》Seldom can one database fit the needs of multiple distinct use cases. The days of the one-size-fits-all monolithic database are behind us, and developers are now building highly distributed applications using a multitude of purpose-built databases. Developers are doing what they do best: breaking complex applications into smaller pieces and then picking the best tool to solve each problem. The best tool for a job usually differs by use case.
