Percona Server 5.7 Parallel Doublewrite


In this blog post, we’ll discuss the ins and outs of Percona Server 5.7 parallel doublewrite.


After implementing parallel LRU flushing as described in the previous post (Percona Server 5.7: multi-threaded LRU flushing), we went back to benchmarking. At first, we tested with the doublewrite buffer turned off. We wanted to isolate the effect of the parallel LRU flusher, and the results validated the design. Then we turned the doublewrite buffer back on and saw very little, if any, gain from the parallel LRU flusher. What happened? Let’s take a look at the data:


[Image: Performance Schema top wait events]


We see that the doublewrite buffer mutex is gone as expected and that the top waiters are the rseg mutexes and the index lock (shouldn’t this be fixed in 5.7?). Then we checked PMP:


[Image: PMP output]


Again we see that PFS is not telling the whole story, this time due to a missing annotation in XtraDB. Whereas the PFS results might lead us to leave the flushing analysis and focus on the rseg/undo/purge or check the index lock, PMP clearly shows that a lack of free pages is the biggest source of waits. Turning on the doublewrite buffer makes LRU flushing inadequate again. This data, however, doesn’t tell us why that is.


To see how enabling the doublewrite buffer makes LRU flushing perform worse, we collect PFS and PMP data only for the server flusher (cleaner coordinator, cleaner worker, and LRU flusher) threads and I/O completion threads:


[Image: PFS wait events for the flusher and I/O completion threads]


If we zoom in from the whole server to the flushers only, the doublewrite mutex is back. Since we removed its contention for the single page flushes, it must be the batch doublewrite buffer usage by the flusher threads that causes it to reappear. The doublewrite buffer has a single area for 120 pages that is shared and filled by the flusher threads. Adding a page to the batch is protected by the doublewrite mutex, which serialises the adds and results in the following picture:


[Image: PFS waits for the flusher threads, showing the doublewrite mutex]
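As a rough illustration of why that serialisation hurts, here is a minimal, hypothetical C++ model of the shared batch area. The names and structure are illustrative only, not the actual InnoDB code: several flusher threads funnel every page add through a single mutex.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Toy model of the shared 120-slot batch doublewrite area (hypothetical
// names): every flusher thread's page add takes the same mutex, so the
// adds are serialised exactly as described above.
struct BatchArea {
    static constexpr std::size_t kSlots = 120;
    std::mutex mtx;                      // the shared "doublewrite mutex"
    std::array<int, kSlots> slots{};
    std::size_t used = 0;

    // Returns false when the batch is full and must be flushed first.
    bool add(int page_id) {
        std::lock_guard<std::mutex> lock(mtx);  // every flusher waits here
        if (used == kSlots) return false;
        slots[used++] = page_id;
        return true;
    }
};

// Simulate several flusher threads filling the shared area concurrently;
// returns how many slots ended up used.
std::size_t fill_from_threads(BatchArea& area, int n_threads, int adds_each) {
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t)
        workers.emplace_back([&area, t, adds_each] {
            for (int i = 0; i < adds_each; ++i)
                area.add(t * 1000 + i);  // all adds contend on area.mtx
        });
    for (auto& w : workers) w.join();
    return area.used;
}
```

The mutex keeps the slot counter consistent, but it also means the four threads make progress strictly one add at a time.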


By now we should be wary of reviewing PFS data without checking its results against PMP. Here it is:



[Image: PMP output for the flusher threads]


As with the single-page flush doublewrite contention and the free-page waits in the previous posts, here we have a doublewrite OS event wait that is not annotated for Performance Schema (the same bug 80979):


[Image: doublewrite source code with the outdated comment]


This is as bad as it looks (the comment is outdated). A running doublewrite flush blocks any doublewrite page add attempts from all the other flusher threads for the duration of the flush (up to 120 data pages written twice to storage):


[Image: doublewrite batch flush blocking page adds]
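To make the blocking concrete, here is a hedged sketch (hypothetical names, not the real buf_dblwr code) in which the batch flush runs entirely under the doublewrite mutex, so every other flusher's add blocks for the duration of both writes.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Illustrative model of the shared batch doublewrite buffer: the flush
// holds the same mutex used by add(), so while up to 120 pages are being
// written twice to storage, all other flusher threads' adds must wait.
class SharedDoublewrite {
public:
    bool add(int page_id) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (batch_.size() >= kCapacity) return false;  // caller must flush
        batch_.push_back(page_id);
        return true;
    }

    // Writes the batch to the doublewrite area, then to the data files,
    // all under mtx_ -- this is the blocking behaviour described above.
    // Returns the number of pages flushed.
    std::size_t flush(std::vector<int>& storage) {
        std::lock_guard<std::mutex> lock(mtx_);
        for (int p : batch_) storage.push_back(p);  // write 1: dblwr area
        for (int p : batch_) storage.push_back(p);  // write 2: data files
        std::size_t n = batch_.size();
        batch_.clear();
        return n;
    }

private:
    static constexpr std::size_t kCapacity = 120;
    std::mutex mtx_;
    std::vector<int> batch_;
};
```

In this model an add attempted during flush() simply blocks on the mutex until both passes over the 120-page batch complete, which is the stall the PMP data shows.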


The issue also occurs with the MySQL 5.7 multi-threaded flusher but becomes more acute with the PS 5.7 multi-threaded LRU flusher. There is no inherent reason why all the parallel flusher threads must share the single doublewrite buffer. Each thread can have its own private buffer, and doing so allows us to add to the buffers and flush them independently. This means a lot of synchronisation simply disappears. Adding pages to parallel buffers is fully asynchronous:


[Image: adding pages to parallel doublewrite buffers]


And so is flushing them:


[Image: flushing parallel doublewrite buffers]
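The per-thread scheme can be sketched the same way (again with hypothetical names): because each flusher owns its buffer, the add path needs no lock at all, and each flush only waits for its own I/O.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the parallel doublewrite idea: one private buffer per flusher
// thread, so no shared mutex on the add path and fully independent flushes.
class PrivateDoublewrite {
public:
    // No lock: only the owning flusher thread ever touches this buffer.
    void add(int page_id) { batch_.push_back(page_id); }

    // Flush writes this thread's batch twice (doublewrite area, then the
    // data files) without blocking any other flusher's adds or flushes.
    // Returns the number of pages flushed.
    std::size_t flush(std::vector<int>& storage) {
        for (int p : batch_) storage.push_back(p);  // write 1: dblwr area
        for (int p : batch_) storage.push_back(p);  // write 2: data files
        std::size_t n = batch_.size();
        batch_.clear();
        return n;
    }

private:
    std::vector<int> batch_;
};
```

With one such buffer per thread, the doublewrite mutex has nothing left to protect, which is the synchronisation that "simply disappears" above.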


This behavior is what we shipped in the 5.7.11-4 release, and the performance results were shown in a previous post (Percona Server 5.7 performance improvements). To see how the private doublewrite buffer affects flusher threads, let’s look at isolated data for those threads again.


Performance Schema:

[Image: Performance Schema top waits with parallel doublewrite]


It shows the redo log mutex as the current top contention source from the PFS point of view, which is not caused directly by flushing.


PMP data looks better too:


[Image: PMP output with parallel doublewrite]


The buf_dblwr_flush_buffered_writes now waits for its own thread I/O to complete and doesn’t block other threads from proceeding. The other top mutex waits belong to the LRU list mutex, which is again not caused directly by flushing.


This concludes the description of the current flushing implementation in Percona Server. To sum up, in this post series we took you through the road to the current XtraDB 5.7 flushing implementation:

  • Under high-concurrency I/O-bound workloads, the server has a high demand for free buffer pages. This demand can be satisfied by either LRU batch flushing or single page flushing.
  • Single page flushes cause a lot of doublewrite buffer contention and are bad even without the doublewrite buffer.
  • As in XtraDB 5.6, we removed single page flushing altogether.
  • The existing cleaner LRU flushing could not satisfy free page demand.
  • The multi-threaded LRU flushing design addresses this issue, provided the doublewrite buffer is disabled.
  • If the doublewrite buffer is enabled, MT LRU flushing contends on it, negating its improvements.
  • Parallel doublewrite buffers address this bottleneck.


Originally published: 2017-10-16
Original authors: Laurynas Biveinis and Alexey Stroganov
Translated by: 知数堂藏经阁项目 - 天一阁
This article comes from the 云栖社区 partner "老叶茶馆"; follow the "老叶茶馆" WeChat public account for more information.
