1. Background
When some financial-industry customers replace foreign commercial databases with OceanBase in their traditional core systems, the business model and data model must be restructured, and the load tests impose hard targets on per-transaction latency and TPS. What follows is a set of concrete optimization practices from one such OceanBase tuning effort.
2. Database Optimization Methodology
Most readers will have seen the funnel diagram of database access optimization. It captures the core directions of database-level tuning: make each SQL statement consume fewer resources, and make fuller use of the database's overall resources. For example: reduce the number of rows scanned on disk through business-logic and index optimization; reduce the volume of data shipped across the network through business-logic optimization; cut CPU and memory cost on the database by pushing complex computation out to the client; and lift overall performance by spreading data evenly across more resources and computing in parallel.
Moving down the funnel, each layer of optimization yields progressively smaller gains at progressively higher cost. In other words, database optimization is like working through an exam paper in school: start with the easy problems. Optimization is a matter of method, but also of mindset. The instinct to restore service first when a performance problem strikes is beyond dispute, yet it is still safer to locate the bottleneck before choosing a remedy; jumping straight to scaling out, or even failing over and restarting, not only prolongs the incident but can trigger a secondary failure.
Database-level techniques are only one side of the story. Real projects also demand end-to-end awareness. What business and development teams see is usually just long transaction latency or TPS and throughput that will not climb, and the first reaction is that the database is slow; this is exactly when full-link thinking matters most.
Whole First, Then Parts
A common scenario goes like this: developers report that overall application performance is poor, the application logs show long intervals between sending a SQL statement and receiving its result, the database is suspected, and the DBA is asked to investigate. The DBA then sweeps through every monitoring platform, metric, log, and performance-analysis tool, and concludes that the SQL on the database has no performance problem and the application side should examine itself.
The simplest application SQL request path is shown below. What developers care about is the interval from the moment the application finishes sending the SQL request to the moment it reads the result.
A long interval is not necessarily caused by slow SQL execution on the DB side; every stage of the path is a contributing factor. The application server, the load balancer, the Proxy, and the DB can each be the bottleneck. The right approach is therefore to start from a global view, identify which stage is the bottleneck first, and only then drill into the specific problem and its fix.
Big Fish First
Another common scenario: developers report that overall performance is poor and one SQL statement in particular runs very slowly, so the DBA is asked to look at that statement. After a full round of monitoring platforms, metrics, logs, and analysis tools, the DBA does fix that statement, yet overall performance still does not improve, and the next statement is taken up in turn.
SQL performance problems come in several varieties, and with so many statements we clearly need an optimization priority.
3. End-to-End Performance Tuning in Practice
Resolve Blocking Issues First
Starting performance tuning before the business flows even run end to end puts the cart before the horse. At this stage the goal is to quickly clear the errors that interrupt business flows, get the flows running smoothly, and only then use load tests to find the bottlenecks along the path and push for peak throughput.
Typical flow-interrupting errors are timeouts and memory-limit violations, and the usual fixes come in three flavors:
1) Adjust database parameters to fit the workload: tune ob_query_timeout, ob_trx_timeout, and ob_trx_idle_timeout to resolve single-SQL, transaction, and idle-transaction timeouts; tune the minor-compaction parameters minor_freeze_times, freeze_trigger_percentage, and writing_throttling_trigger_percentage when highly concurrent writes outrun minor compaction; tune ob_sql_work_area_percentage, _hash_area_size, and _sort_area_size when highly concurrent reads run out of memory.
2) Raise the tenant's resource specification to lift performance and clear the blocking issue.
3) On the application side, adjust session-level timeout parameters, restructure the transaction model (split large transactions), or lower concurrency.
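As a sketch of option 1), the adjustments might look like the following. The values are illustrative only, not recommendations, and the exact scope and availability of each knob depends on the OB version in use:

```sql
-- Timeouts are in microseconds; values here are illustrative.
SET GLOBAL ob_query_timeout    = 10000000;   -- single-SQL timeout: 10 s
SET GLOBAL ob_trx_timeout      = 100000000;  -- transaction timeout: 100 s
SET GLOBAL ob_trx_idle_timeout = 120000000;  -- idle-in-transaction timeout: 120 s

-- Minor-compaction / write-throttling knobs (run from the sys tenant):
ALTER SYSTEM SET minor_freeze_times = 5;
ALTER SYSTEM SET freeze_trigger_percentage = 30;
ALTER SYSTEM SET writing_throttling_trigger_percentage = 80;

-- SQL work-area memory for large sorts and hash joins:
SET GLOBAL ob_sql_work_area_percentage = 10;
```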
Network Latency Optimization
With the blocking issues out of the way, tuning proper begins. At this point developers reported that even at low concurrency, database SQL latency was generally high (over 5 ms), and asked whether the database had a performance problem. Inspection of __all_virtual_sql_audit and __all_virtual_plan_stat showed that SQL execution time inside the database did not match the reported 5 ms+, so network latency became the prime suspect, and once the full network topology was mapped out the cause jumped out.
The SLB's primary availability zone sat in the primary data center, and the way the off-cloud and on-cloud networks were joined meant each SQL incurred two extra cross-data-center crossings; at 1 ms per crossing, the round trip carried at least 4 ms of network latency, matching the reported 5 ms+. The fix was to connect the standby data center's on-cloud and off-cloud networks directly and move the SLB's primary availability zone to the standby data center.
The problem itself was simple, but it reflects the method of bold hypothesis and careful verification: when database resources show no bottleneck yet performance is uniformly poor, the analysis should widen to the whole path and check whether the bottleneck, or an architectural flaw, lies elsewhere.
Business Model Optimization
After the cross-data-center latency fix, single-SQL execution time dropped markedly on the business side. The next hurdle: per-transaction latency at low concurrency still missed the target.
"If a single transaction can't come in under the bar, there is nothing to load-test" was the principle agreed with the developers at the outset. They now reported that neither query nor non-query transactions met the latency requirement. Yet from the database side, individual SQL executions rarely exceeded 2 ms; with at most 150 statements per transaction, database-side execution should stay under 300 ms, plus roughly 75 ms of network time, for no more than about 400 ms per transaction, nowhere near the second-level latency the developers reported.
To find the cause, we captured packets on the application server for single transactions across four typical transaction scenarios, aiming to pin down the bottleneck.
The captures showed that the interval from the NIC sending each SQL to the NIC receiving its response was short, generally under 1 ms, which ruled out the database. The development center eventually traced the problem to the logging level and synchronous log writes; after adjusting them, performance improved markedly, and the per-transaction latency gate was passed.
Our analysis did not stop there. Breaking the captures for typical transactions down into business-SQL execution time, framework-SQL execution time, and inter-SQL wait time, and weighing each share, we found that idle waits between statements accounted for nearly 50% of total transaction time: the transaction model still had plenty of headroom.
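A similar breakdown can be approximated from the database side without packet captures, by grouping __all_virtual_sql_audit rows per transaction and comparing each transaction's wall-clock span against its summed SQL execution time; the difference roughly measures the inter-SQL idle wait. A sketch, assuming this view exposes trans_id, request_time (receive timestamp in microseconds), and elapsed_time columns in the OB version at hand:

```sql
-- Approximate per-transaction idle time: wall-clock span minus summed SQL time.
select
  trans_id,
  count(*) as sql_count,
  max(request_time + elapsed_time) - min(request_time) as trans_span_usec,
  sum(elapsed_time)                                    as sql_exec_usec,
  max(request_time + elapsed_time) - min(request_time)
    - sum(elapsed_time)                                as approx_idle_usec
from __all_virtual_sql_audit
where tenant_id > 1000
group by trans_id
order by approx_idle_usec desc
limit 10;
```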
Top SQL Optimization
The developers then charged ahead adding application servers, determined to saturate the database and find the performance inflection point; nothing for it but to press on. As concurrency rose, slow SQL grew substantially overall, but the mistake would be to dive straight into one particular statement. This is where the big-fish-first principle earns its keep: at this stage we focus on three classes of SQL first: slow SQL, hot-spot SQL, and cross-server SQL.
Slow SQL Optimization: Partitioning
With thousands of slow SQL statements, the first question is which to tackle first. OB's audit data provides the lead: the following query ranks statements by total execution time.
select
tenant_name,
database_name as db_name,
sql_id,
substr(replace(statement, '\n', ' '), 1, 100) as statement,
svr_ip,
type,
executions as executions,
avg_exe_usec as avg_exe_usec,
(avg_exe_usec * executions) as total_exe_usec,
elapsed_time,
slowest_exe_usec,
slow_count,
hit_count,
large_querys,
rows_processed
from
__all_virtual_plan_stat a
inner join __all_virtual_database b on a.db_id = b.database_id
inner join __all_tenant c on a.tenant_id = c.tenant_id
where
c.tenant_id > 1000
order by
total_exe_usec desc,
a.tenant_id
limit
10;
The result shows the top 10 statement classes by total execution time.
UPDATE MQ_PRODUCER_MSG SET last_update = '2022-06-13 15:37:58', status = 2 WHERE ( flow_id = 'f9f5bab19bf843aaae323c2e1e2e32d3' AND status = 1 )
Take the Top 1 statement as an example. It is not the most frequently executed, yet its total time ranks first and its average latency reaches 16 ms, making it a prime optimization target. Its table structure and index information follow:
| MQ_PRODUCER_MSG | CREATE TABLE "MQ_PRODUCER_MSG" (
"MESSAGE_ID" VARCHAR2(36) NOT NULL ENABLE,
"FLOW_ID" VARCHAR2(36),
"CREATE_TIME" VARCHAR2(20),
........
CONSTRAINT "MQ_PRODUCER_MSG_OBPK_1653218990118045" PRIMARY KEY ("MESSAGE_ID", "CREATE_TIME")
) COMPRESS FOR ARCHIVE REPLICA_NUM = 4 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
partition by range(CREATE_TIME) subpartition by hash(MESSAGE_ID) subpartition template (
subpartition P0,
subpartition P1,
subpartition P2,
subpartition P3,
subpartition P4,
subpartition P5,
subpartition P6,
subpartition P7)
(partition MQ_PRODUCER_MSG_P1111 values less than ('2022-05-22 00:00:00'),
partition MQ_PRODUCER_MSG_P1112 values less than ('2022-05-23 00:00:00'),
partition MQ_PRODUCER_MSG_P1113 values less than ('2022-05-24 00:00:00'),
partition MQ_PRODUCER_MSG_P1114 values less than ('2022-05-25 00:00:00'),
partition MQ_PRODUCER_MSG_P1115 values less than ('2022-05-26 00:00:00'),
partition MQ_PRODUCER_MSG_P1116 values less than ('2022-05-27 00:00:00'),
partition MQ_PRODUCER_MSG_P1117 values less than ('2022-05-28 00:00:00'),
partition MQ_PRODUCER_MSG_P1118 values less than ('2022-05-29 00:00:00'),
partition MQ_PRODUCER_MSG_P1119 values less than ('2022-05-30 00:00:00'),
partition MQ_PRODUCER_MSG_P1120 values less than ('2022-05-31 00:00:00'),
partition MQ_PRODUCER_MSG_P1121 values less than ('2022-06-01 00:00:00'),
partition MQ_PRODUCER_MSG_P1122 values less than ('2022-06-02 00:00:00'),
partition MQ_PRODUCER_MSG_P1123 values less than ('2022-06-03 00:00:00'),
partition MQ_PRODUCER_MSG_P1124 values less than ('2022-06-04 00:00:00'),
partition MQ_PRODUCER_MSG_P1125 values less than ('2022-06-05 00:00:00'),
partition MQ_PRODUCER_MSG_P1126 values less than ('2022-06-06 00:00:00'),
partition MQ_PRODUCER_MSG_P1127 values less than ('2022-06-07 00:00:00'),
partition MQ_PRODUCER_MSG_P1128 values less than ('2022-06-08 00:00:00'),
partition MQ_PRODUCER_MSG_P1129 values less than ('2022-06-09 00:00:00'),
partition MQ_PRODUCER_MSG_P1130 values less than ('2022-06-10 00:00:00'),
partition MQ_PRODUCER_MSG_P1131 values less than ('2022-06-11 00:00:00'),
partition MQ_PRODUCER_MSG_P1132 values less than ('2022-06-12 00:00:00'),
partition MQ_PRODUCER_MSG_P1133 values less than ('2022-06-13 00:00:00'),
partition MQ_PRODUCER_MSG_P1134 values less than ('2022-06-14 00:00:00'),
partition MQ_PRODUCER_MSG_P1135 values less than ('2022-06-15 00:00:00'),
partition MQ_PRODUCER_MSG_P1136 values less than ('2022-06-16 00:00:00'),
partition MQ_PRODUCER_MSG_P1137 values less than ('2022-06-17 00:00:00'),
partition MQ_PRODUCER_MSG_P1138 values less than ('2022-06-18 00:00:00'),
partition MQ_PRODUCER_MSG_P1139 values less than ('2022-06-19 00:00:00'),
partition MQ_PRODUCER_MSG_P1140 values less than ('2022-06-20 00:00:00'),
partition MQ_PRODUCER_MSG_P1141 values less than ('2022-06-21 00:00:00'),
partition MQ_PRODUCER_MSG_P1142 values less than ('2022-06-22 00:00:00'),
partition MQ_PRODUCER_MSG_P1143 values less than ('2022-06-23 00:00:00'),
partition MQ_PRODUCER_MSG_P1144 values less than ('2022-06-24 00:00:00'),
partition MQ_PRODUCER_MSG_P1145 values less than ('2022-06-25 00:00:00'),
partition MQ_PRODUCER_MSG_P1146 values less than ('2022-06-26 00:00:00'),
partition MQ_PRODUCER_MSG_P1147 values less than ('2022-06-27 00:00:00'),
partition MQ_PRODUCER_MSG_P1148 values less than ('2022-06-28 00:00:00'),
partition MQ_PRODUCER_MSG_P1149 values less than ('2022-06-29 00:00:00'),
partition MQ_PRODUCER_MSG_P1150 values less than ('2022-06-30 00:00:00'))
One look at the SQL alongside the table structure explains the poor efficiency: this is a composite-partitioned table with as many as 320 partitions, yet the update carries no partition key, so a single statement turns into a distributed transaction spanning all 320 partitions, with predictable performance. With the cause identified, what remained was agreeing on a fix with the developers. Since most access paths on this table carry MESSAGE_ID, the conclusion was to repartition the table on MESSAGE_ID as a single-level partitioned table, drop the subpartitions, and have every statement carry MESSAGE_ID. The statement's execution time fell from 16 ms to under 1 ms.
After the Top 1 statement was fixed, the next round of __all_virtual_sql_audit collection surfaced new statements, but by then we controlled the pace of optimization:
1) Focus on only the Top 10 statements each round, concentrate on resolving those ten, and iterate; performance improves steadily.
2) Hold to the core principle: every statement should carry the partition key as one of its filter conditions; where some statements genuinely cannot, a global index may recover the performance.
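The agreed change amounts to rebuilding the table as a single-level hash-partitioned table on MESSAGE_ID, roughly as below (columns abbreviated as in the original DDL; the partition count of 16 is illustrative):

```sql
CREATE TABLE "MQ_PRODUCER_MSG" (
  "MESSAGE_ID"  VARCHAR2(36) NOT NULL ENABLE,
  "FLOW_ID"     VARCHAR2(36),
  "CREATE_TIME" VARCHAR2(20),
  -- ... remaining columns unchanged ...
  CONSTRAINT "MQ_PRODUCER_MSG_PK" PRIMARY KEY ("MESSAGE_ID")
) partition by hash("MESSAGE_ID") partitions 16;

-- With MESSAGE_ID in the predicate, the update is pruned to one partition:
UPDATE MQ_PRODUCER_MSG SET last_update = ?, status = 2
 WHERE MESSAGE_ID = ? AND status = 1;
```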
Hot-Spot SQL Optimization: Small-Table Broadcast
The second Top SQL scenario is a statement whose individual execution time is unremarkable but whose execution count is so high that its total time still ranks at the top. The following query ranks statements by execution count:
select
tenant_name,
database_name as db_name,
sql_id,
substr(replace(statement, '\n', ' '), 1, 100) as statement,
svr_ip,
type,
executions as executions,
avg_exe_usec as avg_exe_usec,
(avg_exe_usec * executions) as total_exe_usec,
elapsed_time,
slowest_exe_usec,
slow_count,
hit_count,
large_querys,
rows_processed
from
__all_virtual_plan_stat a
inner join __all_virtual_database b on a.db_id = b.database_id
inner join __all_tenant c on a.tenant_id = c.tenant_id
where
c.tenant_id > 1000
order by
executions desc,
a.tenant_id
limit
10;
The next step is analyzing whatever that query dredges up, for example the statement below:
At 1.7 ms per execution the statement is not slow, but it ran 3.5 times as often as the second-ranked statement, keeping its total time at Top 1, and during the load test the CPU of the OBServer hosting the table's leader replica stayed pegged.
Further digging revealed that the table holds exactly one row:
After confirming with the developers that access to this table could not be moved into a cache, and that its data rarely changes, we added the DUPLICATE_SCOPE = 'CLUSTER' attribute to make it a broadcast table, replicating its data to every OBServer in the tenant to spread the read load.
For tables like this, where individual statements are cheap but execution counts are top-ranked, the general guidelines are:
1) If the table is large (over a million rows), partition it and make the partition column one of the SQL filter conditions.
2) If the table is small and its data rarely changes, caching is strongly preferred; a broadcast table is the fallback.
3) If the table is small and joins against other tables, a broadcast table is a good fit.
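The broadcast-table change itself is a one-line attribute. A sketch, using a hypothetical table name:

```sql
-- Make an existing small, rarely-updated table a cluster-wide duplicate
-- (broadcast) table: every OBServer in the tenant holds a readable replica.
ALTER TABLE SOME_SMALL_CONFIG_TABLE DUPLICATE_SCOPE = 'CLUSTER';

-- Or declare it at creation time:
CREATE TABLE SOME_SMALL_CONFIG_TABLE (
  CONFIG_KEY   VARCHAR2(64) PRIMARY KEY,
  CONFIG_VALUE VARCHAR2(256)
) DUPLICATE_SCOPE = 'CLUSTER';
```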
Cross-Server SQL Optimization: Table Groups
After several rounds of slow-SQL and hot-spot-SQL work, cross-server SQL became the next focus. OB's __all_virtual_sql_audit and __all_virtual_plan_stat both record remotely executed and distributed SQL along with their latency: plan_type = 2 or plan_type = 3 marks a plan that does not execute locally.
Pulling the data with the same method used for slow SQL revealed a large volume of remotely executed statements (type = 2 marks remote execution).
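The top-N query from the slow-SQL section can be narrowed to non-local plans; a sketch:

```sql
-- type in __all_virtual_plan_stat: 1 = local, 2 = remote, 3 = distributed.
select
  tenant_name,
  sql_id,
  substr(replace(statement, '\n', ' '), 1, 100) as statement,
  type,
  executions,
  avg_exe_usec,
  (avg_exe_usec * executions) as total_exe_usec
from
  __all_virtual_plan_stat a
  inner join __all_tenant c on a.tenant_id = c.tenant_id
where
  c.tenant_id > 1000
  and a.type in (2, 3)   -- remote or distributed only
order by
  total_exe_usec desc
limit
  10;
```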
Examining the table structures further showed that all of them are partitioned on the BRANCH column, and the statements' filters all carry BRANCH.
BS_GL_TRAN_HIST_SOURCE | CREATE TABLE "BS_GL_TRAN_HIST_SOURCE" (
"BATCH_NO" VARCHAR2(50),
"TRADE_NO" VARCHAR2(50) CONSTRAINT "BS_GL_TRAN_HIST_SOURCE_OBNOTNULL_1657264698083983" NOT NULL ENABLE,
"REFERENCE" VARCHAR2(50),
........
CONSTRAINT "BS_GL_TRAN_HIST_SOURCE_OBPK_1657264698084138" PRIMARY KEY ("TRADE_NO", "BRANCH")
) COMPRESS FOR ARCHIVE REPLICA_NUM = 5 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
partition by hash(BRANCH)
(partition P0,
partition P1,
partition P2,
partition P3,
partition P4,
partition P5,
partition P6,
partition P7,
partition P8,
partition P9,
partition P10,
partition P11,
partition P12,
partition P13,
partition P14,
partition P15)
BS_GL_TRAN_HIST | CREATE TABLE "BS_GL_TRAN_HIST" (
"TRADE_NO" VARCHAR2(50) CONSTRAINT "BS_GL_TRAN_HIST_OBNOTNULL_1657264541289227" NOT NULL ENABLE,
"REFERENCE" VARCHAR2(50) CONSTRAINT "BS_GL_TRAN_HIST_OBNOTNULL_1657264541289250" NOT NULL ENABLE,
"BRANCH" VARCHAR2(20),
........
CONSTRAINT "BS_GL_TRAN_HIST_OBPK_1657264541289411" PRIMARY KEY ("TRADE_NO", "REFERENCE", "BRANCH")
) COMPRESS FOR ARCHIVE REPLICA_NUM = 5 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
partition by hash(BRANCH)
(partition P0,
partition P1,
partition P2,
partition P3,
partition P4,
partition P5,
partition P6,
partition P7,
partition P8,
partition P9,
partition P10,
partition P11,
partition P12,
partition P13,
partition P14,
partition P15)
Checking with the developers confirmed that BS_GL_POST, BS_GL_TRAN_HIST, BS_GL_TRAN_HIST_SOURCE, GL_ACCT, GL_BUSINESS_SITUATION, GL_POST_GROUP, GL_PROD_BAL, and GL_PROD_GROUP could all be partitioned identically, 16 hash partitions on BRANCH, and that every table touched by a given transaction carries the same BRANCH value: an ideal fit for a table group.
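Creating the group and binding the tables might look like the following; the group's partitioning definition must match the member tables exactly (hash on the partition key, 16 partitions), and the exact syntax may vary by OB version:

```sql
-- Table group whose partitioning matches the member tables.
CREATE TABLEGROUP FBSHX_GL_BRANCH_NO PARTITION BY HASH PARTITIONS 16;

-- Bind the tables that all hash-partition on BRANCH into the group, so that
-- corresponding partitions' leaders are co-located on the same OBServer.
ALTER TABLEGROUP FBSHX_GL_BRANCH_NO ADD TABLE
  BS_GL_POST, BS_GL_TRAN_HIST, BS_GL_TRAN_HIST_SOURCE,
  GL_ACCT, GL_BUSINESS_SITUATION, GL_POST_GROUP,
  GL_PROD_BAL, GL_PROD_GROUP;
```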
Before the change, each table's partition leaders were scattered evenly across the OBServers, producing a large volume of cross-server SQL.
mysql> select c.tenant_name, a.table_name, b.partition_id, b.svr_ip, b.zone from __all_virtual_table a inner join __all_virtual_meta_table b on a.table_id = b.table_id inner join __all_tenant c on a.tenant_id = c.tenant_id where role = 1 and c.tenant_name = 'FBSHX_GL' and table_name in ('BS_GL_TRAN_HIST_SOURCE', 'BS_GL_TRAN_HIST', 'GL_SYSTEM', 'GL_ERROR_TRAN_HIST', 'MQ_CONSUMER_REPEAT', 'MQ_CONSUMER_MSG') order by c.tenant_name, a.table_name, b.partition_id;
+----------------------+------------------------+--------------+-------------+-------+
| tenant_name | table_name | partition_id | svr_ip | zone |
+----------------------+------------------------+--------------+-------------+-------+
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 0 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 1 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 3 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 4 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 6 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 8 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 9 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 10 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 14 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST | 15 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 0 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 1 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 2 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 4 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 8 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 9 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 10 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 11 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 12 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 14 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | BS_GL_TRAN_HIST_SOURCE | 15 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 0 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 1 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 3 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 6 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | GL_ERROR_TRAN_HIST | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | GL_SYSTEM | 0 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | MQ_CONSUMER_MSG | 0 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | MQ_CONSUMER_REPEAT | 0 | xx.xx.xx.xx | zone2 |
+----------------------+------------------------+--------------+-------------+-------+
94 rows in set (1.23 sec)
After adjusting the table group to match the business scenario, the partition leaders were distributed as follows:
mysql> select d.tenant_name, c.tablegroup_name, a.table_name, b.partition_id, b.svr_ip, b.zone from __all_virtual_table a inner join __all_virtual_meta_table b on a.table_id = b.table_id inner join __all_virtual_tablegroup c on a.tablegroup_id = c.tablegroup_id inner join __all_tenant d on a.tenant_id = d.tenant_id where b.role = 1 and d.tenant_name = 'FBSHX_SYC_GL' and c.tablegroup_name != 'oceanbase' order by d.tenant_name, c.tablegroup_name, b.partition_id, a.table_name, b.zone, b.svr_ip;
+--------------+--------------------+------------------------+--------------+-------------+-------+
| tenant_name | tablegroup_name | table_name | partition_id | svr_ip | zone |
+--------------+--------------------+------------------------+--------------+-------------+-------+
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 0 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 1 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 2 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 3 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 4 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 5 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 6 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 7 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 8 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 9 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 10 | xx.xx.xx.xx | zone5 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 11 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 12 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 13 | xx.xx.xx.xx | zone2 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 14 | xx.xx.xx.xx | zone3 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_POST | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | BS_GL_TRAN_HIST_SOURCE | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_ACCT | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_BUSINESS_SITUATION | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_POST_GROUP | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_BAL | 15 | xx.xx.xx.xx | zone1 |
| FBSHX_SYC_GL | FBSHX_GL_BRANCH_NO | GL_PROD_GROUP | 15 | xx.xx.xx.xx | zone1 |
+--------------+--------------------+------------------------+--------------+-------------+-------+
128 rows in set (1.20 sec)
As the output shows, corresponding partitions of the related tables now have their leaders on the same OBServer, and the proportion of cross-server SQL dropped markedly.
Table-group adjustment helps a great deal in the right scenarios; the general rules are:
1) If the transaction model is regular, and the tables touched by one transaction share exactly the same partition column, partition method, and partition count, consider a table group.
2) Once the table group is in place, essentially every SQL filter within a transaction must carry the partition key, with the same key value; otherwise the optimization may not take effect.
4. Summary
1. After several rounds of optimization, per-transaction latency met the target and throughput rose from under 1,000 TPS to 3,500 TPS, a solid result, though many finer points remain to be optimized.
2. Database performance optimization is a continuous process. Tackling the trunk before the branches and controlling the pace of optimization both delivers good results and wins customer recognition faster.
3. It is also a joint effort between developers and DBAs. If, along the way, database usage and application development standards can be spread among the developers, or such standards established for the customer, it genuinely helps them use our database product better.