Troubleshooting RDS Performance (MySQL, SQL SERVER and MongoDB)

本文涉及的产品
云数据库 MongoDB,独享型 2核8GB
推荐场景:
构建全方位客户视图
RDS SQL Server Serverless,2-4RCU 50GB 3个月
推荐场景:
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
简介: A lot of questions have been raised by our users concerning why ApsaraDB for RDS sometimes performs worse than self-built databases on RDS.

Database_tuning_practices

A lot of questions have been raised by our users concerning why ApsaraDB for RDS sometimes perform worse than self-built databases on RDS.

Firstly, all tests must be fair if the performances are to be compared. As a public database service, Alibaba Cloud ApsaraDB for RDS must prioritize on high-availability and security, which can sometimes work against performance. Users, of course, won't be willing to use an unstable, unsafe service no matter how powerful it may be. To ensure stability, RDS must have master and slave nodes, even setting them up in different data centers. This means that if an error occurs in one data center, the system can easily switch to another to keep providing services.

ApsaraDB for RDS must also ensure data security. After all, who would want to continue using our services if their data was stolen? As a service vendor, we must consider the security of users' data. Because of this, ApsaraDB adds an intermediate layer to intercept SQL injection requests, and applies the highest security standards for base-level data writes to prevent data loss if the host doesn't shut down normally.

Regarding performance, the Alibaba Cloud ApsaraDB for RDS source code team is continually making optimizations to MySQL. Both performance and stability are higher than in the community edition, as evidenced by standard benchmark tests. Here we will summarize some of the issues with the performance tests between RDS and self-built databases.

I. Network Differences

1. Available Zones

ApsaraDB for RDS can be divided into single-zone and the multi-zone RDS. For single-zone RDS, the master and slave databases are in the same data centers, while for the multi-zone RDS, the master and slave databases are in different data centers. As a result, the ECS and RDS must be built in the same zone during RDS testing.

2. Network Links

ECS to RDS contains many network links, such as ECS-->DNS-->SLB-->Proxy-->DB. Self-built databases on ECS have 2 (ECS-->ECS), meaning RDS has 3 more links than self-built databases on ECS.

3. Case Study

  1. A particular online company found that performance had suffered after migrating its system to the cloud.
  2. The application code and database configuration are identical.

1

RDS access links:

2

II. Configuration Differences

1. Specification Configuration

The RDS specification configuration mainly includes memory and CPU.
The ECS and RDS CPUs must have the same number of cores for RDS testing.

2. Parameter Configuration

(1) Security configuration: RDS applies the highest protection standards when committing transactions and flushing binlogs in order to better ensure data security.

  1. The innodb_flush_log_at_trx_commit parameters specify the log write frequency after the transaction is committed by InnoDB. When the value is 1, the log buffer will be written to the log file and flushed to disk during each transaction commit. 1 is the default value. This configuration is the safest, but there is a certain performance loss due to the operation of disk I/O during each transaction commit.
  2. The sync_binlog parameter is the frequency at which the MySQL binlog syncs to disk. After the binlog is written sync_binlog times, it will be flushed to disk by MySQL. A value of 1 is the safest. It will synchronize the binary log once after writing each statement or transaction, so at most one statement or transaction log will be lost even in the event of total failure. The trade off is that this is the slowest configuration.

(2) Performance configuration: RDS allows user configured parameters except for specification parameters. Most of the parameters have been optimized by the official team, so most users can run it out of the box without having to adjust any parameters. However, while these parameters are suitable for most application scenarios, there will be times where customization is appropriate for performance purposes.

tmp_table_size

Function: This parameter is used to determine the maximum value of the internal temporary memory table, which will be assigned to each thread (The hard caps are the values of tmp_table_size and max_heap_table_size). If the temporary memory table exceeds this limit, MySQL will automatically convert it to the disk-based MyISAM table. When optimizing the query statement, avoid using temporary tables. If you really cannot avoid using them, they must be ensured in memory.

Phenomenon: If temporary tables are used when a complex SQL statement contains group by/distinct, which cannot be optimized through an index, it will lead to longer SQL execution.

Recommendation: If the application involves a lot of group by/distinct statements, and the database has enough memory, you can increase the value of tmp_table_size (max_heap_table_size) to improve query performance.

query_cache_size

Function: This parameter is used to control the size of MySQL query cache memory. If activated, MySQL will first lock the query cache before execution of each query, and then determine whether the query already exists in the query cache. If it is in the query cache, the result will be returned immediately. If not, the engine query and other operations will be carried out. At the same time, operations like insert, update and delete will cause the query cache to fail. The failure also includes any changes in structure or index. Recovering from cache failure can be costly and creates heavy stress on MySQL. Query cache can be very useful if the database is not frequently updated. However, if you write data frequently, and to a small number of tables, the query cache lock function will often cause lock conflicts. A lock conflict occurs when you attempt to write or read a table that is currently in query cache lock. Since you cannot operate on the table until the lock is lifted, this feature can decrease the efficiency of your select queries.

Phenomenon: The database goes through a number of different statuses, including checking query cache, waiting for query cache lock, and storing results in query cache;

Recommendation: The query cache function is disabled in RDS by default. If you enable query cache in your instance, you can choose to disable it when you encounter any of the above statuses. Of course, query cache can be quite useful in some instances, for example it can be a handy tool for resolve database performance issues.

3. Case Study

  1. A user is migrating a local service system to the cloud
  2. The execution time on RDS is twice that of the offline self-built database

The user's local parameter configuration:

join_buffer_size = 128M

read_rnd_buffer_size = 128M

tmp_table_size = 128M

RDS parameter configuration

join_buffer_size = 1M

read_buffer_size = 1M

tmp_table_size =256K

III. Framework differences

1. Master-Slave Mechanism

RDS utilizes a high availability operational model called the master-slave mechanism. At the same time, it also uses a semi-synchronous mechanism, which is an improvement on the asynchronous mechanism used by MySQL. After the master database finishes executing a transaction submitted by the client, it waits for acknowledgment from the slave database, which is responsible for writing the data to the relay log. Only then does the master database send the reply to the client, rather than sending it right away. The semi-synchronous mechanism offers data security improvements compared to the asynchronous mechanism, however it also causes a certain degree of latency, which is at least equal to the round-trip time for the TCP/IP connection. This means that the semi-synchronous mechanism increases the response time of the transaction.

3

Note: The same problems also exist when the high availability SQLSERVER uses a mirror.

相关实践学习
MongoDB数据库入门
MongoDB数据库入门实验。
快速掌握 MongoDB 数据库
本课程主要讲解MongoDB数据库的基本知识,包括MongoDB数据库的安装、配置、服务的启动、数据的CRUD操作函数使用、MongoDB索引的使用(唯一索引、地理索引、过期索引、全文索引等)、MapReduce操作实现、用户管理、Java对MongoDB的操作支持(基于2.x驱动与3.x驱动的完全讲解)。 通过学习此课程,读者将具备MongoDB数据库的开发能力,并且能够使用MongoDB进行项目开发。   相关的阿里云产品:云数据库 MongoDB版 云数据库MongoDB版支持ReplicaSet和Sharding两种部署架构,具备安全审计,时间点备份等多项企业能力。在互联网、物联网、游戏、金融等领域被广泛采用。 云数据库MongoDB版(ApsaraDB for MongoDB)完全兼容MongoDB协议,基于飞天分布式系统和高可靠存储引擎,提供多节点高可用架构、弹性扩容、容灾、备份回滚、性能优化等解决方案。 产品详情: https://www.aliyun.com/product/mongodb
目录
相关文章
|
7天前
|
SQL 存储 关系型数据库
【MySQL基础篇】全面学习总结SQL语法、DataGrip安装教程
本文详细介绍了MySQL中的SQL语法,包括数据定义(DDL)、数据操作(DML)、数据查询(DQL)和数据控制(DCL)四个主要部分。内容涵盖了创建、修改和删除数据库、表以及表字段的操作,以及通过图形化工具DataGrip进行数据库管理和查询。此外,还讲解了数据的增、删、改、查操作,以及查询语句的条件、聚合函数、分组、排序和分页等知识点。
【MySQL基础篇】全面学习总结SQL语法、DataGrip安装教程
|
7天前
|
SQL 存储 关系型数据库
MySQL/SqlServer跨服务器增删改查(CRUD)的一种方法
通过上述方法,MySQL和SQL Server均能够实现跨服务器的增删改查操作。MySQL通过联邦存储引擎提供了直接的跨服务器表访问,而SQL Server通过链接服务器和分布式查询实现了灵活的跨服务器数据操作。这些技术为分布式数据库管理提供了强大的支持,能够满足复杂的数据操作需求。
47 12
|
7天前
|
SQL NoSQL Java
Java使用sql查询mongodb
通过使用 MongoDB Connector for BI 和 JDBC,开发者可以在 Java 中使用 SQL 语法查询 MongoDB 数据库。这种方法对于熟悉 SQL 的团队非常有帮助,能够快速实现对 MongoDB 数据的操作。同时,也需要注意到这种方法的性能和功能限制,根据具体应用场景进行选择和优化。
30 9
|
26天前
|
SQL 存储 缓存
MySQL进阶突击系列(02)一条更新SQL执行过程 | 讲透undoLog、redoLog、binLog日志三宝
本文详细介绍了MySQL中update SQL执行过程涉及的undoLog、redoLog和binLog三种日志的作用及其工作原理,包括它们如何确保数据的一致性和完整性,以及在事务提交过程中各自的角色。同时,文章还探讨了这些日志在故障恢复中的重要性,强调了合理配置相关参数对于提高系统稳定性的必要性。
|
24天前
|
SQL 关系型数据库 MySQL
MySQL 高级(进阶) SQL 语句
MySQL 提供了丰富的高级 SQL 语句功能,能够处理复杂的数据查询和管理需求。通过掌握窗口函数、子查询、联合查询、复杂连接操作和事务处理等高级技术,能够大幅提升数据库操作的效率和灵活性。在实际应用中,合理使用这些高级功能,可以更高效地管理和查询数据,满足多样化的业务需求。
87 3
|
27天前
|
SQL 关系型数据库 MySQL
MySQL导入.sql文件后数据库乱码问题
本文分析了导入.sql文件后数据库备注出现乱码的原因,包括字符集不匹配、备注内容编码问题及MySQL版本或配置问题,并提供了详细的解决步骤,如检查和统一字符集设置、修改客户端连接方式、检查MySQL配置等,确保导入过程顺利。
|
29天前
|
SQL 存储 关系型数据库
MySQL进阶突击系列(01)一条简单SQL搞懂MySQL架构原理 | 含实用命令参数集
本文从MySQL的架构原理出发,详细介绍其SQL查询的全过程,涵盖客户端发起SQL查询、服务端SQL接口、解析器、优化器、存储引擎及日志数据等内容。同时提供了MySQL常用的管理命令参数集,帮助读者深入了解MySQL的技术细节和优化方法。
|
30天前
|
SQL Oracle 关系型数据库
SQL(MySQL)
SQL语言是指结构化查询语言,是一门ANSI的标准计算机语言,用来访问和操作数据库。 数据库包括SQL server,MySQL和Oracle。(语法大致相同) 创建数据库指令:CRATE DATABASE websecurity; 查看数据库:show datebase; 切换数据库:USE websecurity; 删除数据库:DROP DATABASE websecurity;
|
2月前
|
SQL 开发框架 .NET
突破T-SQL限制:利用CLR集成扩展RDS SQL Server的功能边界
CLR集成为SQL Server提供了强大的扩展能力,突破了T-SQL的限制,极大地拓展了SQL 的应用场景,如:复杂字符串处理、高性能计算、图像处理、机器学习集成、自定义加密解密等,使开发人员能够利用 .NET Framework的丰富功能来处理复杂的数据库任务。
|
2月前
|
SQL 监控 关系型数据库
SQL语句当前及历史信息查询-performance schema的使用
本文介绍了如何使用MySQL的Performance Schema来获取SQL语句的当前和历史执行信息。Performance Schema默认在MySQL 8.0中启用,可以通过查询相关表来获取详细的SQL执行信息,包括当前执行的SQL、历史执行记录和统计汇总信息,从而快速定位和解决性能瓶颈。