Troubleshooting RDS Performance (MySQL, SQL Server and MongoDB)

Many users have asked why ApsaraDB for RDS sometimes performs worse than a self-built database on ECS.

First, a performance comparison is only meaningful if the tests are fair. As a public database service, Alibaba Cloud ApsaraDB for RDS must prioritize high availability and security, which can sometimes work against performance. Users will not adopt an unstable or unsafe service, no matter how fast it is. To ensure stability, RDS runs master and slave nodes, and can even place them in different data centers, so that if one data center fails, the system can switch to the other and keep serving requests.

ApsaraDB for RDS must also ensure data security; users would not keep using a service that allowed their data to be stolen. For this reason, ApsaraDB for RDS adds an intermediate layer that intercepts SQL injection requests, and it applies the strictest durability settings to low-level data writes so that no data is lost if the host shuts down abnormally.

Regarding performance, the Alibaba Cloud ApsaraDB for RDS source code team continually optimizes MySQL. Both performance and stability are higher than in the community edition, as evidenced by standard benchmark tests. The sections below summarize common issues seen when comparing the performance of RDS and self-built databases.

I. Network Differences

1. Availability Zones

ApsaraDB for RDS instances are either single-zone or multi-zone. In a single-zone instance, the master and slave databases are in the same data center; in a multi-zone instance, they are in different data centers. For a fair test, the ECS instance and the RDS instance must therefore be in the same zone.

2. Network Links

A request from ECS to RDS passes through several network nodes: ECS-->DNS-->SLB-->Proxy-->DB. A self-built database on ECS involves only 2 (ECS-->ECS), so the RDS path has 3 more nodes than the self-built path.
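To see why the extra hops matter, the following minimal sketch adds up latency along the two paths. It assumes every hop costs the same one-way latency, and the figure of 100 microseconds per hop is a hypothetical placeholder, not a measurement; substitute values measured in your own environment.

```python
RDS_PATH = ["ECS", "DNS", "SLB", "Proxy", "DB"]
SELF_BUILT_PATH = ["ECS", "ECS"]


def round_trip_us(path, per_hop_us):
    """Round-trip latency in microseconds for a path given as a list of nodes.

    Simplifying assumption: every hop has the same one-way latency.
    """
    hops = len(path) - 1
    return 2 * hops * per_hop_us


PER_HOP_US = 100  # hypothetical value, for illustration only

rds_us = round_trip_us(RDS_PATH, PER_HOP_US)
self_built_us = round_trip_us(SELF_BUILT_PATH, PER_HOP_US)

# The per-query difference is small, but it is paid on every round trip,
# so a workload of many short sequential queries amplifies it.
queries = 1000
extra_total_us = (rds_us - self_built_us) * queries
print(rds_us, self_built_us, extra_total_us)
```

Under this model the gap grows linearly with the number of sequential round trips, which is why short, chatty queries are affected far more than long-running ones.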

3. Case Study

  1. An online company found that performance dropped after it migrated its system to the cloud.
  2. The application code and database configuration were identical.

[Figure: RDS access links]

II. Configuration Differences

1. Specification Configuration

The RDS specification mainly covers memory and CPU.
For a fair test, the ECS and RDS instances must have the same number of CPU cores.

2. Parameter Configuration

(1) Security configuration: To better protect data, RDS applies the strictest settings for committing transactions and flushing binlogs.

  1. The innodb_flush_log_at_trx_commit parameter specifies how often InnoDB writes the log after a transaction is committed. When the value is 1 (the default), the log buffer is written to the log file and flushed to disk at every transaction commit. This setting is the safest, but it costs some performance because every commit incurs disk I/O.
  2. The sync_binlog parameter controls how often MySQL syncs the binlog to disk. After the binlog has been written sync_binlog times, MySQL flushes it to disk. A value of 1 is the safest: the binary log is synchronized after every statement or transaction, so at most one statement or transaction is lost even in a total failure. The trade-off is that this is the slowest setting.
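The cost of these two settings can be illustrated with a rough count of forced disk flushes. The sketch below is a deliberately simplified model: it ignores group commit and operating-system behavior, and the function name `forced_flushes` is a hypothetical helper introduced only for this illustration.

```python
def forced_flushes(transactions, flush_log_at_trx_commit, sync_binlog):
    """Estimate forced disk flushes for a number of committed transactions.

    Simplified model (an assumption for illustration): group commit is
    ignored, and settings that leave flushing to a background thread or
    to the operating system are counted as zero forced flushes.
    """
    # innodb_flush_log_at_trx_commit = 1 flushes the redo log at every commit;
    # 0 and 2 flush about once per second instead of once per commit.
    redo = transactions if flush_log_at_trx_commit == 1 else 0
    # sync_binlog = N syncs the binlog after every N writes;
    # 0 leaves the sync to the operating system.
    binlog = transactions // sync_binlog if sync_binlog > 0 else 0
    return redo + binlog


strict = forced_flushes(10000, flush_log_at_trx_commit=1, sync_binlog=1)
relaxed = forced_flushes(10000, flush_log_at_trx_commit=2, sync_binlog=0)
print(strict, relaxed)
```

A self-built database that runs with relaxed values is doing far less synchronous disk work per commit, so comparing it with RDS at the strictest settings is not a like-for-like test.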

(2) Performance configuration: RDS lets users configure most parameters, except those fixed by the instance specification. Most parameters have already been tuned by the RDS team, so most users can run an instance out of the box without adjusting anything. These defaults suit most application scenarios, but some workloads need customized values to perform well.

tmp_table_size

Function: This parameter sets the maximum size of an internal in-memory temporary table, which is allocated per thread. The effective cap is the smaller of tmp_table_size and max_heap_table_size. If an in-memory temporary table exceeds this limit, MySQL automatically converts it to a disk-based MyISAM table. When optimizing queries, avoid temporary tables where possible; if they cannot be avoided, make sure they fit in memory.

Symptom: When a complex SQL statement contains GROUP BY or DISTINCT that cannot be optimized through an index, MySQL uses a temporary table, which lengthens SQL execution time.

Recommendation: If the application runs many GROUP BY or DISTINCT statements and the database has enough memory, increase tmp_table_size (and max_heap_table_size) to improve query performance.
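Because the in-memory limit is the smaller of the two settings, raising only one of them has no effect. The following minimal sketch makes that explicit; `parse_size` and `in_memory_temp_table_limit` are hypothetical helper names, and the values passed in are illustrative.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a MySQL-style size such as '256K' or '128M' to bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def in_memory_temp_table_limit(tmp_table_size, max_heap_table_size):
    """Return the effective limit in bytes: the smaller of the two settings."""
    return min(parse_size(tmp_table_size), parse_size(max_heap_table_size))


# Raising tmp_table_size alone does not help while
# max_heap_table_size stays lower.
limit = in_memory_temp_table_limit("128M", "16M")
print(limit // UNITS["M"], "MB")
```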

query_cache_size

Function: This parameter controls the amount of memory used by the MySQL query cache. When the query cache is enabled, MySQL locks the query cache before executing each query and checks whether the query is already cached. If it is, the result is returned immediately; if not, the query is executed by the storage engine as usual. Insert, update, and delete operations invalidate the cached results for the affected tables, as do changes to table structure or indexes. Frequent invalidation is costly and puts heavy stress on MySQL. The query cache can be very useful if the database is rarely updated. However, if you write data frequently, and to a small number of tables, the query cache lock often causes lock conflicts. A lock conflict occurs when you try to read or write a table that is currently held by the query cache lock. Because you cannot operate on the table until the lock is released, the query cache can reduce the efficiency of your SELECT queries.

Symptom: Database threads show states such as checking query cache, waiting for query cache lock, and storing results in query cache.

Recommendation: The query cache is disabled in RDS by default. If you have enabled it on your instance and encounter any of the states above, consider disabling it. The query cache can still be useful in some cases, for example as a quick way of resolving certain database performance issues.
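One way to quantify the symptom is to measure how many threads are in query-cache-related states. The sketch below assumes the state strings have already been collected, for example from the State column of SHOW PROCESSLIST; the sample list is made up for illustration, and matching is done on lowercase prefixes because the exact wording of the states can vary.

```python
# Lowercase prefixes of thread states that indicate query cache activity.
QUERY_CACHE_STATE_PREFIXES = (
    "checking query cache",
    "waiting for query cache lock",
    "storing result",
)


def query_cache_share(states):
    """Return the fraction of threads whose state is query-cache related."""
    if not states:
        return 0.0
    hits = sum(
        1 for state in states
        if state.lower().startswith(QUERY_CACHE_STATE_PREFIXES)
    )
    return hits / len(states)


# Hypothetical sample of thread states, for illustration only.
sample_states = [
    "Waiting for query cache lock",
    "Sending data",
    "checking query cache for query",
    "Waiting for query cache lock",
]
print(query_cache_share(sample_states))
```

A consistently high share across repeated samples suggests that the query cache lock, rather than the queries themselves, is the bottleneck.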

3. Case Study

  1. A user is migrating a local service system to the cloud.
  2. The execution time on RDS is twice that of the offline self-built database.

The user's local parameter configuration:

join_buffer_size = 128M
read_rnd_buffer_size = 128M
tmp_table_size = 128M

RDS parameter configuration:

join_buffer_size = 1M
read_buffer_size = 1M
tmp_table_size = 256K
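The two configurations differ by orders of magnitude, which is easier to see when the values are converted to bytes and compared side by side. The following minimal sketch does this for the parameters listed in the case study; `parse_size` and `compare` are hypothetical helper names.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a MySQL-style size such as '256K' or '128M' to bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def compare(local, rds):
    """Return {name: local/rds ratio} for parameters present on both sides."""
    return {
        name: parse_size(local[name]) / parse_size(rds[name])
        for name in sorted(local.keys() & rds.keys())
    }


# Values copied from the case study.
LOCAL = {
    "join_buffer_size": "128M",
    "read_rnd_buffer_size": "128M",
    "tmp_table_size": "128M",
}
RDS = {
    "join_buffer_size": "1M",
    "read_buffer_size": "1M",
    "tmp_table_size": "256K",
}

for name, ratio in compare(LOCAL, RDS).items():
    print(f"{name}: local value is {ratio:.0f}x the RDS value")

# Parameters that appear on only one side cannot be compared directly.
print(sorted(LOCAL.keys() ^ RDS.keys()))
```

With a gap this large in join_buffer_size and tmp_table_size, the same SQL runs under very different memory limits on the two systems, so the parameters should be aligned before the execution times are compared.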

III. Architecture Differences

1. Master-Slave Mechanism

RDS achieves high availability through a master-slave architecture. It also uses semi-synchronous replication, an improvement on the asynchronous replication that MySQL uses by default. After the master database finishes executing a transaction submitted by the client, it does not reply right away; it first waits for an acknowledgment from the slave database, which sends it once the data has been written to its relay log. Only then does the master reply to the client. Semi-synchronous replication improves data security compared with asynchronous replication, but it also adds latency of at least one TCP/IP round trip between master and slave, which increases the transaction response time.
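The added latency can be expressed as a simple model. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the timings are placeholder values in microseconds rather than measurements, and the model ignores timeouts and fallback to asynchronous replication.

```python
def commit_latency_us(execute_us, rtt_us, relay_write_us, semi_sync):
    """Client-visible commit latency in microseconds (simplified model)."""
    if semi_sync:
        # The master waits for the slave to write the relay log and
        # acknowledge, which costs at least one network round trip.
        return execute_us + rtt_us + relay_write_us
    # Asynchronous replication: the master replies without waiting.
    return execute_us


# Hypothetical timings, for illustration only.
async_us = commit_latency_us(1000, rtt_us=500, relay_write_us=200, semi_sync=False)
semi_us = commit_latency_us(1000, rtt_us=500, relay_write_us=200, semi_sync=True)
print(async_us, semi_us)
```

The round-trip term grows when master and slave are in different data centers, which is why a multi-zone instance pays more for each commit than a single-zone one.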

Note: The same issue exists when a high-availability SQL Server instance uses database mirroring.
