Best Practices for RDS MySQL Storage Optimization

本文涉及的产品
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS PostgreSQL,集群系列 2核4GB
简介: This article talks about spatial optimization of ApsaraDB for RDS MySQL. We will discuss the various objects which comprise the RDS Console with an understanding of the composition of objects.

Original article: https://yq.aliyun.com/articles/55594

Bucket storage is critical to running Alibaba Cloud ApsaraDB for RDS. In fact, bucket storage is among the top 5 topics we see in support tickets. When the actual volume used exceeds the storage purchased for the instance, the instance will be locked. It will prevent the applications from writing or updating data, which usually results in application errors.

You can set warning thresholds for storage in the ApsaraDB for RDS console. When the used storage on an instance reaches the warning threshold, you will receive a warning message. At this time, you may need to find out whether or not the increase in used storage is reasonable. If so, you should perform an elastic upgrade on the instance.

Local Upgrade vs. Cross-Host Upgrade

It should be noted that an elastic upgrade may take one of two forms: a local upgrade/degrade or a cross-host upgrade. In the former, the disk storage of the host where the instance is located is sufficient to accommodate the upgrade. In this situation, the upgrade is swift and will not influence your application.

However, in the latter, the remaining disk storage of the host where the instance is present is not sufficient to accommodate the upgrade, which means that the instance needs migration to another host with enough disk resources. In this case, the upgrade can take quite a while depending on how much storage is required by the instance.

Upgrade time gets further extended by the fact that the data needs to be both recovered and backed up onto the new host and data must be synchronized between the old and new instances. Finally, all existing connections in the database get broken when the upgrade is complete (Note: a high-security access link does not have this issue). If the increase in storage seems unreasonable, then you have to make a quick decision on how to proceed.

Therefore, we need to understand the items that take up RDS storage. There are five storage types in the RDS console, namely total disk storage, data storage, log storage, temporary file storage, and system file storage. In the following sections, we will look at four of the five storage types that typically causes confusion to RDS users.

Data Files

Data files refer to the files stored in the storage. In the database, they are tables. The tables are mainly composed of data and indexes. Therefore, when you find that your data files are occupying a great deal of instance storage, you should further check which table is most responsible. You can see which table is holding the most system storage through data dictionaries:

select TABLE_SCHEMA,TABLE_NAME,INDEX_LENGTH/1024/1024 as index_M,DATA_LENGTH/1024/1024 as data_M from TABLES order by (INDEX_LENGTH+DATA_LENGTH) desc limit 10

Preparation prevents poor performance, so we should take certain precautions against excessive storage use starting from the application design phase:

  • The size of disk space depends on potential data growth trends;
  • Whether to delete or archive data depends on the data retention period;
  • When designing tables, select reasonable data types, field sizes, and storage engines for sharding and table sharding.

Let's take the example in the figure below. As we can see, data storage occupies a significant chunk of the total storage on the instance. You can see which tables in the database are utilizing the most space with the above method:

1

Common scenarios:

1.The storage does not get released even after table data gets deleted.

Best practice: Reconstruction of the table is needed by optimizing the table. This method may cause the table to be locked if the RDS is version 5.6 or below. RDS 5.6, however, supports online reconstruction.

2.Space occupied by large table indexes is greater than the data storage

Best practice: Useless or repeated indexes in the table need to get deleted. When deleting indexes, pay special attention to whether the index is still in use.

3.A large table is mainly used to store log type service data. Data is usually inserted and not queried

Best practice: Data in the table can be compressed using a TokuDB engine, which offers more than three times the average compression efficiency. You should note that using the TokuDB engine requires adjusting the TokuDB buffer. Please refer to parameter optimization loose_tokudb_buffer_pool_ratio.

Log Files

ApsaraDB for RDS MySQL uses a master/slave M-M high availability architecture. Mater/slave data synchronization relies on binlog. RDS will regularly back up the logs to OSS and then clear the local binlog to diminish the amount of space it takes up. When log storage encounters an error like the one in the figure below, the rate at which binlog grows may exceed the upload speed from RDS to OSS.

This will cause binlog to grow even faster. Under such circumstances, you must optimize the database to reduce the rate at which changes get done.

2

Common scenarios:

1.One of the users encountered a case where an application was frequently making updates to a table that consisted of several large fields. Because binlog records the entire row in its original format, the binlog, in this case, was growing entirely too quickly. If you want more information on the topic, refer to the section on rapid binlog growth due to frequent updates to large MySQL fields. Therefore, when you first begin designing an application, it is best to avoid large fields like varchar(8000), text, blob, and clob.

2.Another case is one where the master and slave copy hang or are interrupted. This results in the binlog not being transferred from the master database to the slave database, causing the binlog on the master database to grow unnecessarily. If you encounter this kind of problem, it is recommended to open a support ticket for immediate processing.

Temporary Files

These are files written from memory to disk when the database performs large file operations, and memory proves insufficient. This may lead to the creation of substantial temporary files during significant database operations (order by, group by, distinct).

For example, in the figure below, the database frequently executes the order by statements without using an index, leading to the creation of a large number of temporary files.

3

Common scenarios:

1.When the growth of your temporary file storage is due to SQL sorting, you can use show process list to find the offending SQL and kill it quickly.

2.Meanwhile, an appropriate index should be added to the sorted SQL to avoid further sorting. This solution gets to the root of the problem and prevents sorted SQL from appearing in the database.

3.A limit can be set on the size of temporary storage to keep sorting from taking up too much space. For details, please refer to optimization of the RDS parameter loose_rds_max_tmp_disk_space.

System Files

This refers to the files created during the installation of the database. These system files are crucial to the normal operation. For MySQL, these files include ibdatal and ib_logfile0. The below graphic shows “other files” taking up a large amount of space. In this situation, you can refer to how to locate the problem causing ibdatal to grow continuously.

4

Common scenarios:

1The ibdatal file contains a large number of undo_log. In this case, you should upgrade to version 5.6 or higher which has an independent purge thread. This allows undo logs to be recycled quickly and enables the undo tablespace file to be set separately, which prevents it from being used in conjunction with ibdatal.

2.Meanwhile, an ibdatal file can be reconstructed using logical migration.

3.Pay critical attention to the impact of the pending transaction on the undo in the database as well the INNODB_TRX view in the database.

Conclusion

In conclusion, issues concerning storage space can be complicated, but following best practice from the very beginning of system design can save a lot of complications later on.

To learn more about ApsaraDB for RDS MySQL, visit the official documentation page.

相关实践学习
如何快速连接云数据库RDS MySQL
本场景介绍如何通过阿里云数据管理服务DMS快速连接云数据库RDS MySQL,然后进行数据表的CRUD操作。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助     相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
目录
相关文章
|
SQL 关系型数据库 MySQL
MySQL:ERROR 1286 (42000): Unknown storage engine 'MyISAM'
MySQL:ERROR 1286 (42000): Unknown storage engine 'MyISAM'
150 0
|
关系型数据库 MySQL Shell
MySQL报错:Got error 28 from storage engine
MySQL报错:Got error 28 from storage engine
107 0
|
缓存 关系型数据库 MySQL
彻底解决mysql报错:1030, ‘Got error 28 from storage engine‘
最近听粉丝说,服务器系统盘满了,如何解决,我说这个应该网上能找到解决方案的,他们说网上很多人都说清一下缓存,但是并没有教实际操作,这让我们这些小白就很难受。 在此,我特意去网上看了看,发现网上文章千篇一律几乎没有教实际解决的,全是说清一下缓存就好了,但是都没有实际步骤,这让小白很懵逼。
946 0
彻底解决mysql报错:1030, ‘Got error 28 from storage engine‘
|
关系型数据库 MySQL
MySQL:ERROR 1286 (42000): Unknown storage engine 'MyISAM'
MySQL:ERROR 1286 (42000): Unknown storage engine 'MyISAM'
|
关系型数据库 MySQL