开发者社区> 芷沁> 正文
阿里云
为了无法计算的价值
打开APP
阿里云APP内打开

Best Practices for RDS MySQL Storage Optimization

简介: This article talks about spatial optimization of ApsaraDB for RDS MySQL. We will discuss the various objects which comprise the RDS Console with an understanding of the composition of objects.
+关注继续查看

Original article: https://yq.aliyun.com/articles/55594

Bucket storage is critical to running Alibaba Cloud ApsaraDB for RDS. In fact, bucket storage is among the top 5 topics we see in support tickets. When the actual volume used exceeds the storage purchased for the instance, the instance will be locked. It will prevent the applications from writing or updating data, which usually results in application errors.

You can set warning thresholds for storage in the ApsaraDB for RDS console. When the used storage on an instance reaches the warning threshold, you will receive a warning message. At this time, you may need to find out whether or not the increase in used storage is reasonable. If so, you should perform an elastic upgrade on the instance.

Local Upgrade vs. Cross-Host Upgrade

It should be noted that an elastic upgrade may take one of two forms: a local upgrade/degrade or a cross-host upgrade. In the former, the disk storage of the host where the instance is located is sufficient to accommodate the upgrade. In this situation, the upgrade is swift and will not influence your application.

However, in the latter, the remaining disk storage of the host where the instance is present is not sufficient to accommodate the upgrade, which means that the instance needs migration to another host with enough disk resources. In this case, the upgrade can take quite a while depending on how much storage is required by the instance.

Upgrade time gets further extended by the fact that the data needs to be both recovered and backed up onto the new host and data must be synchronized between the old and new instances. Finally, all existing connections in the database get broken when the upgrade is complete (Note: a high-security access link does not have this issue). If the increase in storage seems unreasonable, then you have to make a quick decision on how to proceed.

Therefore, we need to understand the items that take up RDS storage. There are five storage types in the RDS console, namely total disk storage, data storage, log storage, temporary file storage, and system file storage. In the following sections, we will look at four of the five storage types that typically causes confusion to RDS users.

Data Files

Data files refer to the files stored in the storage. In the database, they are tables. The tables are mainly composed of data and indexes. Therefore, when you find that your data files are occupying a great deal of instance storage, you should further check which table is most responsible. You can see which table is holding the most system storage through data dictionaries:

select TABLE_SCHEMA,TABLE_NAME,INDEX_LENGTH/1024/1024 as index_M,DATA_LENGTH/1024/1024 as data_M from TABLES order by (INDEX_LENGTH+DATA_LENGTH) desc limit 10

Preparation prevents poor performance, so we should take certain precautions against excessive storage use starting from the application design phase:

  • The size of disk space depends on potential data growth trends;
  • Whether to delete or archive data depends on the data retention period;
  • When designing tables, select reasonable data types, field sizes, and storage engines for sharding and table sharding.

Let's take the example in the figure below. As we can see, data storage occupies a significant chunk of the total storage on the instance. You can see which tables in the database are utilizing the most space with the above method:

1

Common scenarios:

1.The storage does not get released even after table data gets deleted.

Best practice: Reconstruction of the table is needed by optimizing the table. This method may cause the table to be locked if the RDS is version 5.6 or below. RDS 5.6, however, supports online reconstruction.

2.Space occupied by large table indexes is greater than the data storage

Best practice: Useless or repeated indexes in the table need to get deleted. When deleting indexes, pay special attention to whether the index is still in use.

3.A large table is mainly used to store log type service data. Data is usually inserted and not queried

Best practice: Data in the table can be compressed using a TokuDB engine, which offers more than three times the average compression efficiency. You should note that using the TokuDB engine requires adjusting the TokuDB buffer. Please refer to parameter optimization loose_tokudb_buffer_pool_ratio.

Log Files

ApsaraDB for RDS MySQL uses a master/slave M-M high availability architecture. Mater/slave data synchronization relies on binlog. RDS will regularly back up the logs to OSS and then clear the local binlog to diminish the amount of space it takes up. When log storage encounters an error like the one in the figure below, the rate at which binlog grows may exceed the upload speed from RDS to OSS.

This will cause binlog to grow even faster. Under such circumstances, you must optimize the database to reduce the rate at which changes get done.

2

Common scenarios:

1.One of the users encountered a case where an application was frequently making updates to a table that consisted of several large fields. Because binlog records the entire row in its original format, the binlog, in this case, was growing entirely too quickly. If you want more information on the topic, refer to the section on rapid binlog growth due to frequent updates to large MySQL fields. Therefore, when you first begin designing an application, it is best to avoid large fields like varchar(8000), text, blob, and clob.

2.Another case is one where the master and slave copy hang or are interrupted. This results in the binlog not being transferred from the master database to the slave database, causing the binlog on the master database to grow unnecessarily. If you encounter this kind of problem, it is recommended to open a support ticket for immediate processing.

Temporary Files

These are files written from memory to disk when the database performs large file operations, and memory proves insufficient. This may lead to the creation of substantial temporary files during significant database operations (order by, group by, distinct).

For example, in the figure below, the database frequently executes the order by statements without using an index, leading to the creation of a large number of temporary files.

3

Common scenarios:

1.When the growth of your temporary file storage is due to SQL sorting, you can use show process list to find the offending SQL and kill it quickly.

2.Meanwhile, an appropriate index should be added to the sorted SQL to avoid further sorting. This solution gets to the root of the problem and prevents sorted SQL from appearing in the database.

3.A limit can be set on the size of temporary storage to keep sorting from taking up too much space. For details, please refer to optimization of the RDS parameter loose_rds_max_tmp_disk_space.

System Files

This refers to the files created during the installation of the database. These system files are crucial to the normal operation. For MySQL, these files include ibdatal and ib_logfile0. The below graphic shows “other files” taking up a large amount of space. In this situation, you can refer to how to locate the problem causing ibdatal to grow continuously.

4

Common scenarios:

1The ibdatal file contains a large number of undo_log. In this case, you should upgrade to version 5.6 or higher which has an independent purge thread. This allows undo logs to be recycled quickly and enables the undo tablespace file to be set separately, which prevents it from being used in conjunction with ibdatal.

2.Meanwhile, an ibdatal file can be reconstructed using logical migration.

3.Pay critical attention to the impact of the pending transaction on the undo in the database as well the INNODB_TRX view in the database.

Conclusion

In conclusion, issues concerning storage space can be complicated, but following best practice from the very beginning of system design can save a lot of complications later on.

To learn more about ApsaraDB for RDS MySQL, visit the official documentation page.

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
Kubernetes必备知识: StorageClass
StorageClass提供了一种描述存储类(class)的方法,不同的class可能会映射到不同的服务质量等级和备份策略或其他策略等。 StorageClass 对象中包含 provisioner、parameters 和 reclaimPolicy 字段,当需要动态分配 PersistentVolume 时会使用到。当创建 StorageClass 对象时,设置名称和其他参数,一旦创建了对象就不能再对其更新。也可以为没有申请绑定到特定 class 的 PVC 指定一个默认的 StorageClass 。
1396 0
协议森林17 我和你的悄悄话 (SSL/TLS协议)
作者:Vamei 出处:http://www.cnblogs.com/vamei 严禁任何形式转载。   TLS名为传输层安全协议(Transport Layer Protocol),这个协议是一套加密的通信协议。
966 0
ifconfig报错:SIOCSIFFLAGS: Operation not permitted
# insmod mt7601Usta.ko  rtusb init rt2870 --->usbcore: registered new interface driver rt2870   # iwconfiglo        no wireless extensions.
5456 0
一次有意思的错选执行计划问题定位(涉及SYS_OP_C2)
这两天和广分的兄弟看了一个问题,比较有意思,过程也比较曲折。。。 问题现象: 1. 11g的库,话说有一个应用程序新上线,应用中使用了绑定变量的方式执行一条简单的SQL,例如select a from b where c = :x,c列是该表复合主键的前导列,表定义是varchar2类型,从spotlight监控看这条SQL的执行计划是全表扫描,一次执行要1个小时,这张表是运行很久的引用分区表,数据量是亿级,测试的时候正常,但很显然测试的数据量可能和生产非常不一致,导致没察觉。
1033 0
+关注
芷沁
https://www.alibabacloud.com/blog/
417
文章
1
问答
文章排行榜
最热
最新
相关电子书
更多
低代码开发师(初级)实战教程
立即下载
阿里巴巴DevOps 最佳实践手册
立即下载
冬季实战营第三期:MySQL数据库进阶实战
立即下载