Implementing a Highly-Compressed Data Storage

Summary: Alibaba Cloud ApsaraDB for RDS for MySQL supports the TokuDB engine, which stores data compressed to between one-fifth and one-tenth of its original size.


1. MySQL Big Data Storage - Compression up to 10 times smaller

Alibaba Cloud ApsaraDB for RDS for MySQL supports the TokuDB engine, which stores data compressed to between one-fifth and one-tenth of its original size. TokuDB also supports highly concurrent writes by caching operations in the intermediate nodes of its index.

TokuDB is an optional storage engine for RDS for MySQL. It uses Fractal Tree, a disk-optimized index structure whose intermediate nodes can cache data processing requests (insert, update, delete, online add index, and online add column), improving the performance of high-concurrency writes by three to nine times. The node size is 4 MB (configurable), and data can be compressed to between one-fifth and one-tenth of its original size with compression algorithms such as zlib, quicklz, lzma, zstd, and snappy. TokuDB also supports multi-version concurrency control (MVCC) and four isolation levels: Read Uncommitted (RU), Read Committed (RC), Repeatable Read (RR), and Serializable.
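As an illustration of how a compression algorithm might be chosen per table, the following sketch uses the ROW_FORMAT values of upstream TokuDB (as documented for Percona Server). It is an assumption that RDS for MySQL accepts the same values, and the database and table names are hypothetical.

```sql
-- Hypothetical schema and table; neither name comes from this article.
CREATE DATABASE IF NOT EXISTS demo;

-- Create a TokuDB table compressed with zlib
-- (ROW_FORMAT value as used by upstream TokuDB; assumed to work on RDS).
CREATE TABLE demo.access_log (
  id         BIGINT   NOT NULL AUTO_INCREMENT PRIMARY KEY,
  created_at DATETIME NOT NULL,
  payload    TEXT
) ENGINE=TokuDB ROW_FORMAT=TOKUDB_ZLIB;

-- Switch the same table to lzma, which generally trades more CPU
-- for a smaller on-disk footprint.
ALTER TABLE demo.access_log ROW_FORMAT=TOKUDB_LZMA;
```

The ratio actually achieved depends on the data, so it is worth measuring on a representative table before converting a large one.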

In addition to features found in the Community Edition, the source code team for RDS for MySQL also implemented a number of customized optimizations to respond to common use cases:

• Hot backup
TokuDB Community Edition does not provide hot backup. RDS for MySQL implements a hot backup solution based on the TokuDB internal checkpoint mechanism: it copies the binlog, redo log, and data files. RDS for MySQL also holds the TokuDB checkpoint lock so that no sharp checkpoint runs during the hot backup.

• Improve query response on the client
The source code version provided by RDS limits TokuDB to at most one sharp checkpoint per second. This keeps checkpoints from using too much disk bandwidth, which could otherwise cause query responses to fail on the client.

• Set buffer pool ratio
RDS provides the tokudb_buffer_pool_ratio parameter, which sets the percentage of memory occupied by the TokuDB engine buffer pool, in the range [0,100]. To meet the InnoDB and TokuDB initialization requirements, the TokuDB buffer pool is always at least 64 MB, and the InnoDB buffer pool is always at least 64 MB (V5.6) or 128 MB (V5.7).
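To see how memory is currently divided, the relevant variables can be inspected from a client session. This is only a sketch: tokudb_buffer_pool_ratio is the RDS parameter named above, innodb_buffer_pool_size is a standard MySQL variable, and tokudb_cache_size is the upstream TokuDB cache variable, which is assumed (not confirmed by this article) to be visible on RDS.

```sql
-- RDS-specific ratio described above (percentage of the shared cache for TokuDB).
SHOW VARIABLES LIKE 'tokudb_buffer_pool_ratio';

-- Standard MySQL variable: current InnoDB buffer pool size in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Upstream TokuDB cache size in bytes (assumed to be exposed on RDS).
SHOW VARIABLES LIKE 'tokudb_cache_size';
```

A variable that the instance does not expose returns an empty result rather than an error, so the statements are safe to run on any MySQL instance.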

Switch RDS for MySQL to the TokuDB engine in three steps

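1. Set "loose_tokudb_buffer_pool_ratio", that is, the proportion of the cache shared by TokuDB and InnoDB that TokuDB occupies. The following queries estimate a value from the share of InnoDB data that will be converted to TokuDB: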

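-- Total size of all InnoDB tables.
SELECT SUM(data_length) INTO @all_size FROM information_schema.tables WHERE engine = 'InnoDB';

-- Size of the InnoDB tables that will be converted to TokuDB.
SELECT SUM(data_length) INTO @change_size FROM information_schema.tables WHERE engine = 'InnoDB' AND CONCAT(table_schema, '.', table_name) IN ('XX.XXXX', 'XX.XXXX', 'XX.XXXX');

-- Converted share of the InnoDB data, as a percentage.
SELECT ROUND(@change_size / @all_size * 100);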

2. Restart the instance.

3. Alter the storage engine.

ALTER TABLE XX.XXXX ENGINE=TokuDB;

Here, "XX.XXXX" stands for the database name and table name of the table to be converted to the TokuDB storage engine.
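As a worked example of step 1 and a check for step 3, the sketch below uses made-up sizes (100 GB of InnoDB data in total, 60 GB of it to be converted) and the same database.table placeholder as above. The sizes are illustrative only and do not come from a real instance.

```sql
-- Step 1 with illustrative sizes: 100 GB of InnoDB data, 60 GB to convert.
SET @all_size    = 100 * 1024 * 1024 * 1024;
SET @change_size =  60 * 1024 * 1024 * 1024;
SELECT ROUND(@change_size / @all_size * 100) AS suggested_ratio;  -- 60

-- Step 3 check: after the ALTER TABLE finishes, the engine column
-- should read TokuDB for each converted table.
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE CONCAT(table_schema, '.', table_name) IN ('XX.XXXX');
```

With these numbers, 60 would be the value to enter for loose_tokudb_buffer_pool_ratio before restarting the instance.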

2. MySQL MaxCompute

Alibaba Cloud provides the MaxCompute service for storing and computing on batches of structured data. The service offers data warehouse solutions and analytical modeling services for big data at massive scale. You can import RDS data into MaxCompute through the Data Integration service, with a few simple settings on the interface, to run large-scale data computing.

