Alibaba Cloud DataWorks Highly Recognized by Forrester

Summary: DataWorks is listed in Forrester's Now Tech: Cloud Data Warehouse, Q1 2018 report as one of the core products of a global first-tier CDW service provider.

The analysis in this article is based on Now Tech: Cloud Data Warehouse, Q1 2018 (by Noel Yuhanna, published March 13, 2018). The views and opinions expressed herein are those of the author.

On March 13, 2018, Forrester published the Now Tech: Cloud Data Warehouse, Q1 2018 report. The report assesses cloud data warehouses (CDWs) in terms of their main features, regional presence, market segmentation, and customers.

Alibaba Cloud, AWS, Google, and Microsoft were named as the four global first-tier CDW service providers. Alibaba Cloud DataWorks and MaxCompute are the only products from a Chinese company recognized in the report.

In this report, Forrester highlighted four core CDW features:

  • Flexible deployment
    CDWs are expected to offer several deployment modes. For small enterprises, CDWs should provide an online multi-tenant mode so that these customers can quickly provision computing resources and deploy a data warehouse in a few minutes. For medium and large enterprises, CDWs should support an exclusive or on-premises deployment mode that provides robust computing performance and strong security while hiding complex technical details.
  • Efficient data migration to cloud
    For customers that have not yet migrated their data warehouses to the cloud, or that run hybrid online and offline architectures, CDWs should provide a fast, low-cost way to collect data.
  • Diverse analysis methods
    CDWs should support multiple analysis techniques so that users get the data processing capabilities they need in different business scenarios.
  • Excellent security
    CDWs should provide security in various aspects, including data encryption, auditing, data desensitization and access control.

Why did Forrester recognize DataWorks, the core of Alibaba Cloud's CDW services? Let's look at DataWorks in detail.

Product Architecture

Before analyzing DataWorks, we will first take a quick look at its role in the Alibaba Cloud CDW service system and its product architecture.

Among Alibaba Cloud's many products, DataWorks and MaxCompute make up the core of its CDW service capabilities. As the storage and computing engine, MaxCompute supports the IaaS layer and provides users with large-scale, reliable table storage and SQL execution. However, MaxCompute alone cannot meet all data processing requirements. Data development, data integration, and other CDW services are also required for customers to put big data to work. DataWorks provides a relatively complete solution for this purpose.

Specifically, DataWorks includes eight major modules:

  • Data integration: Integration of heterogeneous data, collecting large volumes of data from various source systems into the cloud big data platform
  • Data development: Data warehouse design and ETL development
  • O&M monitoring: Operation and maintenance monitoring of jobs in the ETL process
  • Real-time analytics: Real-time data exploration and analysis
  • Data asset management: Metadata management, data map, data lineage, data asset graph, etc.
  • Data quality: Data quality control, monitoring, verification, and assessment
  • Data security: Data permission management, data classification marking, data desensitization, and data audits
  • Data service: Data sharing, data exchange, and data API services

Flexible Deployment

The Forrester report explains at length why multiple deployment modes are necessary and compares the CDWs of several service providers. DataWorks is one of the first-tier products that provide multiple deployment modes.

Serving as the core of the Alibaba Group's data middleware system, DataWorks has been used to support business operations in enterprises like Alibaba Group, Ant Financial, and Cainiao since 2009. If you've used data services provided by Taobao, Tmall, Ant Financial, and other companies, you may have indirectly used the computing service provided by DataWorks.

DataWorks is available to public cloud users. To date, it has served over 4,000 public cloud customers, including Weibo, Renrenche, and Tianhong Asset Management.

DataWorks also supports private cloud. It is an important big data component of Alibaba Cloud's private cloud solutions, including Apsara Enterprise. Since 2015, DataWorks has supported important enterprise and government projects, including the Alibaba Cloud ET City Brain and "Easy municipal service access".

With flexible deployment modes, DataWorks can meet a wide variety of customers' needs. For small enterprises, public cloud solutions can be used flexibly to provide services and support; for medium and large enterprises, private cloud or hybrid cloud solutions can fully meet customers' needs.

Efficient Data Migration to the Cloud

Efficient data integration makes it much easier for enterprises to migrate data to the cloud. During the initial migration stage, enterprises need to move their data assets to the cloud quickly and securely. During ongoing business operations, they need to load various kinds of data into CDWs and then deliver processed data from CDWs to individual business units.

The Data Integration feature of DataWorks can read from and write to multiple kinds of data sources, including relational databases, NoSQL databases, big data stores, and text storage (FTP). It can check data resources across data sources in a uniform way, and it can synchronize and integrate heterogeneous data sources in complex network environments. For scheduling import tasks, DataWorks supports batch synchronization, full synchronization, and incremental synchronization of offline data. Users can specify a custom synchronization schedule by minute, hour, day, week, or month.
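The difference between full and incremental synchronization can be illustrated with a minimal Python sketch. It is not DataWorks code: the `full_sync` and `incremental_sync` functions, the row layout, and the `updated_at` watermark column are all assumptions made for illustration.

```python
from datetime import datetime


def full_sync(source_rows):
    """Full synchronization: copy every source row into a fresh target."""
    return {row["id"]: row for row in source_rows}


def incremental_sync(source_rows, target, watermark):
    """Incremental synchronization: copy only rows modified after the watermark.

    Returns the new watermark (the latest modification time seen).
    """
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] > watermark:
            target[row["id"]] = row
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical source table with a last-modified column.
source = [
    {"id": 1, "name": "a", "updated_at": datetime(2018, 3, 1)},
    {"id": 2, "name": "b", "updated_at": datetime(2018, 3, 2)},
]

# The first run copies everything and records the watermark.
target = full_sync(source)
watermark = max(row["updated_at"] for row in source)

# Later, one row changes and one row is added at the source.
source[1] = {"id": 2, "name": "b2", "updated_at": datetime(2018, 3, 5)}
source.append({"id": 3, "name": "c", "updated_at": datetime(2018, 3, 6)})

# The next run transfers only the two rows newer than the watermark.
watermark = incremental_sync(source, target, watermark)
```

In this sketch a scheduler would call `incremental_sync` at each configured interval and persist the returned watermark between runs.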

In addition, the Data Integration feature of DataWorks provides flow control over dirty-data handling, transfer rate, and the number of concurrent threads, which helps users reduce costs and manage synchronization jobs more precisely.
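As a rough illustration of these three controls, the following Python sketch applies a dirty-record limit, a submission rate cap, and a bounded worker pool to a synchronization loop. The function name, its parameters (`max_dirty`, `max_rate`, `concurrency`), and the exception class are hypothetical and do not reflect the actual DataWorks configuration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class DirtyDataLimitExceeded(Exception):
    """Raised when more dirty records are seen than the job tolerates."""


def sync_with_flow_control(records, is_valid, write,
                           max_dirty=1, max_rate=1000.0, concurrency=2):
    """Write valid records through a bounded pool; collect dirty ones."""
    dirty = []
    interval = 1.0 / max_rate  # seconds to wait between submissions
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = []
        for record in records:
            if not is_valid(record):
                dirty.append(record)
                if len(dirty) > max_dirty:
                    raise DirtyDataLimitExceeded(
                        f"{len(dirty)} dirty records exceed limit {max_dirty}")
                continue
            futures.append(pool.submit(write, record))
            time.sleep(interval)  # throttle the submission rate
        for future in futures:
            future.result()  # re-raise any write error
    return dirty


# Hypothetical job: integers are valid, anything else counts as dirty.
written = []
dirty = sync_with_flow_control(
    records=[1, 2, "x", 4],
    is_valid=lambda r: isinstance(r, int),
    write=written.append,
)
```

Setting `max_dirty=0` in this sketch makes the job fail on the first dirty record, while a larger value lets it continue and report the rejected records afterwards.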

Diverse Analysis Methods

DataWorks provides a powerful data development IDE and supports visual editing of SQL code, integration tasks, and business flow DAGs. Multi-user online collaboration and version management for task scripts meet the practical needs of enterprise-level data development. In addition to offline task processing, DataWorks provides the lightweight "Analytics Workbench" tool, which uses the computing capacity of MaxCompute to meet users' ad hoc data analysis needs.
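A business flow modeled as a DAG determines the order in which tasks may run: a task starts only after all of its upstream tasks have finished. The sketch below shows the idea with Python's standard `graphlib` module (Python 3.9 or later). The task names are hypothetical, and this is not how DataWorks schedules jobs internally.

```python
from graphlib import TopologicalSorter

# Hypothetical business flow: each task maps to the set of its upstream tasks.
flow = {
    "import_orders": set(),
    "import_users": set(),
    "clean_orders": {"import_orders"},
    "join_orders_users": {"clean_orders", "import_users"},
    "daily_report": {"join_orders_users"},
}

# static_order() yields each task only after all of its upstream tasks.
order = list(TopologicalSorter(flow).static_order())
print(order)
```

If the flow contained a circular dependency, `static_order()` would raise `graphlib.CycleError`, which is why a flow has to be acyclic before it can be scheduled.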

The drag-and-drop business flow editor in DataWorks has reportedly been updated recently to further improve the user experience of the data development IDE.

Robust Security

Protecting sensitive data requires strict compliance with industry standards and with data privacy laws and regulations. Security is the top priority of DataWorks. Its data security modules provide all-round protection through the following measures:

  • Multi-tenant isolation
    DataWorks has its own multi-tenant permission model. Tenants can apply for resource quotas on demand and manage their own resources; tenants can also manage their own data, permissions, users and roles independently from each other to ensure data security.
  • Data security level setting
    Data security levels allow users to discover and locate sensitive data and to see how it is distributed across data resource platforms. DataWorks automatically discovers sensitive data based on specified sensitive data types and classifies it. Appropriate security rules are then applied according to security levels such as Top Secret, Confidential, and General.
  • Data access audit
    DataWorks strictly audits privileged users' access, including access time, executed operations, and execution order. Recording and auditing this access helps ensure that privileged users perform appropriate operations at the proper time and makes abnormal operations detectable, which further improves the security of data systems.
  • Data desensitization
    When it cannot be determined whether certain users, access addresses, or even fields can be trusted, DataWorks focuses on the data content itself: it identifies sensitive information and dynamically masks it on access to ensure data security.
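The classification and desensitization measures above can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the `RULES` table, the regular expressions, and the `classify`, `mask`, and `read_field` helpers are hypothetical and are not DataWorks APIs. Only the level names come from the text.

```python
import re

# Hypothetical rules: data type -> (pattern, security level).
RULES = {
    "phone": (re.compile(r"^\d{11}$"), "Confidential"),
    "id_card": (re.compile(r"^\d{17}[\dXx]$"), "Top Secret"),
}


def classify(value):
    """Return (data_type, level) for the first matching rule, else General."""
    for data_type, (pattern, level) in RULES.items():
        if pattern.match(value):
            return data_type, level
    return None, "General"


def mask(value, keep_head=3, keep_tail=4):
    """Replace the middle of a value with asterisks."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[-keep_tail:]


def read_field(value, user_is_trusted):
    """Return the raw value to trusted users and a masked value otherwise."""
    _, level = classify(value)
    if level == "General" or user_is_trusted:
        return value
    return mask(value)


print(read_field("13812345678", user_is_trusted=False))  # 138****5678
print(read_field("hello", user_is_trusted=False))        # hello
```

Masking at read time, as in `read_field`, leaves the stored data unchanged, so trusted users can still see the original values.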

DataWorks has received a third-level information security certificate issued by the Ministry of Public Security.

Conclusion

With "Internet Plus" further applied in different industries, there is an increasing need for enterprises to manage, process and employ their data assets. Internet companies can quickly use their big data processing capability to meet other enterprises' needs. That also explains why these four cloud service providers, instead of long-established data warehouse companies like Oracle and IBM, are listed in the Forrester report as first-tier CDW providers.

Thanks to Alibaba Cloud's years of experience in putting data to work, DataWorks can fully meet enterprise-level requirements for deployment modes, data integration, analysis methods, and data security.

DataWorks is expected to keep introducing more advanced data management concepts, including real-time data integration and data asset analysis. It combines cloud computing with data warehouse management methodology to innovate continuously and to create "platforms most suitable for big data warehouse development". That is another reason why DataWorks is listed in Forrester's CDW report.

To learn more about the Big Data capabilities of Alibaba Cloud, read the Forrester report on MaxCompute.
