Breakthrough in Alibaba Cloud Computing Capabilities - BigBench Reaches 100 TB World Record

Summary: Alibaba Cloud's BigBench on MaxCompute expands the benchmark's data scale to 100 TB for the first time in the world, making it also the first such benchmark to run on public cloud services.

On the first day of the 2017 Hangzhou Computing Conference, October 11, Alibaba Cloud President Hu Xiaoming introduced the next-generation computing platform MaxCompute + PAI.

In the main forum on October 12, Zhou Jingren, Alibaba Group Vice President and head of the Search Division and Computing Platform Division, said that data lays the foundation for artificial intelligence innovation and that ample computing capability is needed to fully release the value of that data. Zhou Jingren then released BigBench on MaxCompute[1] 2.0 + PAI together with Rob Hays, Vice President of Intel's Data Center Division. The release broke the best records set by TPCx-BB[2] and reflected both the robust data processing capabilities of MaxCompute and the strength of the public cloud compared with the traditional model.

At present, the maximum data scale published by TPC is 10 TB, the best performance is 1,491.23 BBQpm, and the best price/performance ratio is 589 Price/BBQpm. Alibaba Cloud's BigBench on MaxCompute 2.0 + PAI expands that capacity to 100 TB for the first time in the world, and it is also the first such benchmark to run on public cloud services. Running at this scale, the engine achieves a score of 7,000.
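For context, TPC price/performance figures such as the 589 Price/BBQpm above are, roughly speaking, the total price of the benchmarked system configuration divided by its BBQpm performance score, so lower is better. The following Python sketch illustrates that calculation with purely hypothetical numbers; the official TPCx-BB specification defines the exact pricing and performance rules.

```python
# Simplified illustration of the TPCx-BB price/performance metric.
# The published Price/BBQpm figure is the total price of the benchmarked
# configuration divided by its BBQpm performance score (lower is better).
# All numbers below are hypothetical placeholders, not real results.

def price_per_bbqpm(total_system_price_usd: float, bbqpm_score: float) -> float:
    """Return price/performance in USD per BBQpm."""
    return total_system_price_usd / bbqpm_score

# Hypothetical example: a $1,000,000 configuration scoring 2,000 BBQpm
print(price_per_bbqpm(1_000_000, 2_000))  # -> 500.0 USD per BBQpm
```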

It was also announced that the MaxCompute test environment would be open on the public cloud for one month after the conference, and that the BigBench on MaxCompute + PAI SDK (derived from TPCx-BigBench and adapted to run on Alibaba Cloud's big data environment) would be open-sourced for developers to use.

The capacity breakthrough of BigBench on MaxCompute owes much to MaxCompute's massive data processing capabilities and the efficiency of its machine learning algorithms. MaxCompute, built on the Apsara distributed operating system developed by Alibaba Cloud, can connect more than 10,000 servers in a single cluster and process exabytes of data.

The next-generation MaxCompute engine has undergone continuous, in-depth performance optimization in its compiler, optimizer, and runtime. In addition to high-performance computing, Alibaba Cloud PAI provides users with a robust algorithm experimentation platform that covers traditional machine learning as well as the latest deep learning and reinforcement learning. PAI offers a large number of algorithms and tools to meet the requirements of different business scenarios, and the platform is also optimized for performance and data capacity.

Furthermore, deep integration and optimization between MaxCompute and Intel processors make full use of the architectural strengths of the Intel® Xeon® Scalable processor. Rob Hays, Vice President of Intel's Data Center Division, said, "We are delighted to be working with Alibaba Cloud to optimize MaxCompute on the latest Intel® Xeon® Scalable processor platform and to witness the excellent performance of MaxCompute in the BigBench test."

So what computing benefits does BigBench on MaxCompute 2.0 + PAI bring to developers to help them seize more market opportunities?

  1. Break through the capacity bottleneck. When BigBench data volume exceeds 10 TB, most products hit a bottleneck and cannot scale further. BigBench on MaxCompute expands the data capacity to 100 TB, meeting users' ever-growing data volume requirements.
  2. Lower cost. The conventional hardware-plus-software model requires purchasing servers. Although server cost can be amortized over the servers' lifespan, buying hardware means that future computing resources come at an increasing relative cost, because hardware prices inevitably fall year by year. BigBench uses price per BBQpm to express the price/performance ratio (see the sketch above). Compared with the conventional hardware model, MaxCompute supports both prepayment and payment based on the amount of data processed, which offers pricing flexibility and a competitive price/performance ratio.
  3. Meet scalability requirements. On the internet, a traffic explosion can happen at any time, and the traditional hardware model needs a long adjustment period to meet increased demand. BigBench on MaxCompute enables on-demand expansion of computing capacity, satisfying enterprises' scaling requirements whenever they arise.
  4. Save O&M workload. Traditionally, a data center needs to be maintained by a dedicated O&M team, and maintenance quality often cannot be guaranteed. BigBench on MaxCompute runs on the public cloud, saving enterprise customers from investing additional manpower in maintenance.
BigBench on MaxCompute is derived from TPCx-BB, so it is compatible with all TPCx-BB semantics. As an industry benchmark, TPCx-BB covers all major operation types in big data processing, including SQL, MapReduce, streaming, and machine learning. This full coverage reflects the completeness of MaxCompute's software stack for big data processing. The following table lists the software stacks of BigBench on MaxCompute:

[Table: software stacks of BigBench on MaxCompute]

As an industry benchmark, BigBench on MaxCompute demonstrates the completeness of MaxCompute's software stack for big data processing as well as its superior performance in capacity, cost, and scalability.
BigBench on MaxCompute is easy to get started with. Enterprise customers can connect to the platform once they have prepared the following (a minimal connection sketch follows the list):

1. An Alibaba Cloud account;
2. The BigBench on MaxCompute toolkit;
3. The MaxCompute client.
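As a rough sketch of what access can look like once these prerequisites are in place, the snippet below uses PyODPS, the MaxCompute Python SDK, to connect with Alibaba Cloud account credentials and submit a simplified, BigBench-flavored SQL aggregation. The endpoint, project, table, and column names are placeholders for illustration only; the actual benchmark workload is driven by the BigBench on MaxCompute toolkit.

```python
# Minimal PyODPS sketch: connect to MaxCompute and run a SQL job.
# Credentials, endpoint, project, and table/column names are placeholders.
from odps import ODPS

o = ODPS(
    access_id='<your-access-key-id>',
    secret_access_key='<your-access-key-secret>',
    project='<your-maxcompute-project>',
    endpoint='<your-maxcompute-endpoint>',
)

# A simplified, BigBench-flavored aggregation over a hypothetical
# clickstream table (not an actual TPCx-BB query).
sql = """
SELECT item_id, COUNT(*) AS clicks
FROM web_clickstreams
GROUP BY item_id
ORDER BY clicks DESC
LIMIT 10
"""

# Submit the query and print the result rows.
with o.execute_sql(sql).open_reader() as reader:
    for record in reader:
        print(record['item_id'], record['clicks'])
```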

For details, see the MaxCompute Documentation and the BigBench on MaxCompute Access Guide.


[1] BigBench on MaxCompute is derived from TPCx-BB, so it is compatible with all TPCx-BB semantics.
[2] TPCx-BB (BigBench) was released by the Transaction Processing Performance Council (TPC) in February 2016. It is the first end-to-end, application-level big data analytics benchmark.
