最佳实践—混合负载HTAP的实践和优化

本文涉及的产品
云原生数据库 PolarDB 分布式版,标准版 2核8GB
简介: 背景信息本文主要提供数据库上云后OLTP+OLAP一体化架构的最佳实践,既HTAP。主要面对的业务应用范围:混合型HTAP数据库需求:如ORACLE数据库改造上云,云上数据库方案选型;OLTP系统查询慢,存在分析型场景和瓶颈的客户;读写分离需求。

PolarDB-X HTAP架构更多信息,请参见混合负载HTAP

HTAP集群

您购买的PolarDB-X主实例,主要面向在线通用业务场景。如果业务针对同一份数据有分析、专注离线拖数、跑批等场景,您可以在PolarDB-X主实例上购买多个只读实例116.png业务如果有在线HTAP混合流量或者读写分离的需求,推荐使用集群地址。PolarDB-X内部会基于智能路由或者读写权重将部分流量转发给只读实例;业务上只有离线数据分析需求时,推荐使用只读地址,只读地址会直接访问只读实例,只读地址的流量会采用MPP加速。关于连接地址信息,请参见集群地址和只读地址

路由

智能路由

PolarDB-X优化器会基于代价分析出查询物理扫描行数、CPU、内存、IO、网络等核心资源消耗量,将请求区分为TP与AP负载。当您在集群地址上开启了智能路由,会主动识别SQL的工作负载类型来做路由,比如将识别为AP负载的流量路由给只读实例。您可以通过explain cost指令查看SQL工作负载类型的识别情况。例如以下查询,该查询涉及到物理扫描行数rowcount很小,计算资源(CPU&Memory)也消耗比较少,所以这个查询被识别为TP负载。


mysql> explain cost  select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TopN(sort="cnt ASC", offset=?2, fetch=?3): rowcount = 1.0, cumulative cost = value = 2.8765038E7, cpu = 37.0, memory = 64.0, io = 3.0, net = 5.75, id = 163602178                                                                                                                 |
|   Filter(condition="cnt > ?1"): rowcount = 1.0, cumulative cost = value = 2.8765026E7, cpu = 26.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602177                                                                                                                            |
|     HashAgg(group="k", cnt="COUNT()"): rowcount = 1.0, cumulative cost = value = 2.8765025E7, cpu = 25.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602171                                                                                                                     |
|       BKAJoin(condition="k = id", type="inner"): rowcount = 1.0, cumulative cost = value = 2.8765012E7, cpu = 12.0, memory = 18.0, io = 3.0, net = 5.75, id = 163602169                                                                                                           |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 2.3755003E7, cpu = 3.0, memory = 0.0, io = 1.0, net = 4.75, id = 163602164                                                                                                                             |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)"): rowcount = 1.0, cumulative cost = value = 2.3755002E7, cpu = 2.0, memory = 0.0, io = 1.0, net = 4.75, id = 163601451         |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 5003.0, cpu = 3.0, memory = 0.0, io = 1.0, net = 0.0, id = 163602167                                                                                                                                   |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))"): rowcount = 1.0, cumulative cost = value = 5002.0, cpu = 2.0, memory = 0.0, io = 1.0, net = 0.0, id = 163601377 |                                                                                                                                                                                                                                                                    |
| WorkloadType: TP                                                                                                                                                                                                                                                                  |                                                                                                                                                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

工作负载的识别,对于HTAP路由至关重要。这里也允许您通过HINT WORKLOAD_TYPE指定工作负载。同样以上述查询为例,可以将查询负载强制指定为AP。


mysql> explain cost /*+TDDL:WORKLOAD_TYPE=AP*/ select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TopN(sort="cnt ASC", offset=?2, fetch=?3): rowcount = 1.0, cumulative cost = value = 2.8765038E7, cpu = 37.0, memory = 64.0, io = 3.0, net = 5.75, id = 163602178                                                                                                                 |
|   Filter(condition="cnt > ?1"): rowcount = 1.0, cumulative cost = value = 2.8765026E7, cpu = 26.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602177                                                                                                                            |
|     HashAgg(group="k", cnt="COUNT()"): rowcount = 1.0, cumulative cost = value = 2.8765025E7, cpu = 25.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602171                                                                                                                     |
|       BKAJoin(condition="k = id", type="inner"): rowcount = 1.0, cumulative cost = value = 2.8765012E7, cpu = 12.0, memory = 18.0, io = 3.0, net = 5.75, id = 163602169                                                                                                           |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 2.3755003E7, cpu = 3.0, memory = 0.0, io = 1.0, net = 4.75, id = 163602164                                                                                                                             |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)"): rowcount = 1.0, cumulative cost = value = 2.3755002E7, cpu = 2.0, memory = 0.0, io = 1.0, net = 4.75, id = 163601451         |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 5003.0, cpu = 3.0, memory = 0.0, io = 1.0, net = 0.0, id = 163602167                                                                                                                                   |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))"): rowcount = 1.0, cumulative cost = value = 5002.0, cpu = 2.0, memory = 0.0, io = 1.0, net = 0.0, id = 163601377 |                                                                                                                                                                                                                                                                    |
| WorkloadType: AP                                                                                                                                                                                                                                                                  |                                                                                                                                                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

规则路由

除了基于代价的智能路由以外,我们也支持基于读写规则的路由。允许您在控制台参数管理上设置读写分离权重参数MASTER_READ_WEIGHT,默认值为100,可配置值区间[0, 100]。如果配置为Weight=60,意味着占60%的流量会继续在主实例执行,40%的剩余流量会路由到只读实例执行,如果只读实例有多个会进行自动分配。

智能路由和规则路由这两者关系是解耦的,具体关系请查看下表格。

智能路由规则 规则路由 (MASTER_READ_WEIGHT) 路由结果
开启 以代价的读写分离为主规则路由建议保持默认值为100
  • 事务和写操作流量,全部路由给主实例;
  • 识别为AP的查询流量,全部路由给只读实例;
  • 识别为TP的查询流量,按照(100-MASTER_READ_WEIGHT)路由给只读实例。
关闭 以规则的读写分离为主规则路由的可选范围:[0-100]
  • 事务和写操作流量,全部路由给主实例;
  • 识别为TP/AP的查询流量,一律按照(100-MASTER_READ_WEIGHT)路由给只读实例。

执行模式

目前PolarDB-X支持了三种执行模式:

  • 单机单线程(TP_LOCAL):查询过程中,是单线程计算,TP负载的查询涉及到的扫描行数比较少,往往会采样这种执行模式,比如基于主键的点查。
  • 单机并行(AP_LOCAL):查询过程中,会利用节点的多核资源做并行计算,您在没有购买只读实例的前提下,针对AP负载的查询,往往会采样这种执行模式,一般也称之为Parallel Query模式。
  • 多机并行(MPP):您若购买了只读实例,针对AP负载的查询,可以协调只读实例上多个节点的多核做分布式多机并行加速。

为了让您可以准确知道执行模式,我们在原有EXPLAIN指令上,扩展出了EXPLAIN PHYSICAL。例如以下查询,通过指令可以查看到当前查询采样的是MPP模式,此外还可以获取到每个执行片段的并发数。


mysql> explain physical select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 or
der by cnt limit 5, 10;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN                                                                                                                                                              |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ExecutorType: MPP                                                                                                                                                 |
| The Query's MaxConcurrentParallelism: 2                                                                                                                           |
| Fragment 1                                                                                                                                                        |
|     Shuffle Output layout: [BIGINT, BIGINT] Output layout: [BIGINT, BIGINT]                                                                                       |
|     Output partitioning: SINGLE [] Parallelism: 1                                                                                                                 |
|     TopN(sort="cnt ASC", offset=?2, fetch=?3)                                                                                                                     |
|   Filter(condition="cnt > ?1")                                                                                                                                    |
|     HashAgg(group="k", cnt="COUNT()")                                                                                                                             |
|       BKAJoin(condition="k = id", type="inner")                                                                                                                   |
|         RemoteSource(sourceFragmentIds=[0], type=RecordType(INTEGER_UNSIGNED id, INTEGER_UNSIGNED k))                                                             |
|         Gather(concurrent=true)                                                                                                                                   |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))") |
| Fragment 0                                                                                                                                                        |
|     Shuffle Output layout: [BIGINT, BIGINT] Output layout: [BIGINT, BIGINT]                                                                                       |
|     Output partitioning: SINGLE [] Parallelism: 1 Splits: 16                                                                                                      |
|     LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)")                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+

工作负载和执行模式有一定的耦合关系,AP工作负载会采用多机并行计算。同样的也允许您通过HINT EXECUTOR_MODE指定执行模式。假如主实例空闲资源很多,可以考虑强制设置为单机或者多机并行模式来加速。


mysql> explain physical /*+TDDL:EXECUTOR_MODE=AP_LOCAL*/select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;                                                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ExecutorMode: AP_LOCAL                                                                                                                                      |
| Fragment 0 dependency: [] parallelism: 4                                                                                                                    |
| BKAJoin(condition="k = id", type="inner")                                                                                                                   |
|   Gather(concurrent=true)                                                                                                                                   |
|     LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)")               |
|   Gather(concurrent=true)                                                                                                                                   |
|     LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))") |
| Fragment 1 dependency: [] parallelism: 8                                                                                                                    |
| LocalBuffer                                                                                                                                                 |
|   RemoteSource(sourceFragmentIds=[0], type=RecordType(INTEGER_UNSIGNED id, INTEGER_UNSIGNED k, INTEGER_UNSIGNED k0))                                        |
| Fragment 2 dependency: [0, 1] parallelism: 8                                                                                                                |
| Filter(condition="cnt > ?1")                                                                                                                                |
|   HashAgg(group="k", cnt="COUNT()")                                                                                                                         |
|     RemoteSource(sourceFragmentIds=[1], type=RecordType(INTEGER_UNSIGNED id, INTEGER_UNSIGNED k, INTEGER_UNSIGNED k0))                                      |
| Fragment 3 dependency: [0, 1] parallelism: 1                                                                                                                |
| LocalBuffer                                                                                                                                                 |
|   RemoteSource(sourceFragmentIds=[2], type=RecordType(INTEGER_UNSIGNED k, BIGINT cnt))                                                                      |
| Fragment 4 dependency: [2, 3] parallelism: 1                                                                                                                |
| TopN(sort="cnt ASC", offset=?2, fetch=?3)                                                                                                                   |
|   RemoteSource(sourceFragmentIds=[3], type=RecordType(INTEGER_UNSIGNED k, BIGINT cnt))                                                                      |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+

在多机并行MPP执行模式的并发度是根据物理扫描行数、实例规格和计算所涉及到表的分表数来计算出来的,整体的并行度要考虑高并发场景,所以并行度的计算会偏保守,您可以通过上述EXPLAIN PHYSICAL指令查看并行度。当然也同样支持HINT MPP_PARALLELISM强制指定并行度,


/*+TDDL:EXECUTOR_MODE=MPP MPP_PARALLELISM=8*/select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;

调度策略

假设您购买了多个只读实例并加入到集群地址中,您通过集群地址的查询SQL路由到只读实例的流量,会被均匀调度到只读实例多个节点上执行,调度会考虑各个节点的资源负载,确保各个节点的负载压力差不多。比如PolarDB-X会将只读实例延迟作为调度参考指标,避免将流量调度到延迟较大的只读实例上执行。115.png

业务如果有在线HTAP混合流量或者读写分离的需求,推荐使用集群地址。PolarDB-X内部会基于智能路由或者读写权重将部分流量转发给只读实例;业务上只有离线数据分析需求时,推荐使用只读地址,只读地址会直接访问只读实例,只读地址的流量会采用MPP加速。关于连接地址信息,请参见集群地址和只读地址

路由

智能路由

PolarDB-X优化器会基于代价分析出查询物理扫描行数、CPU、内存、IO、网络等核心资源消耗量,将请求区分为TP与AP负载。当您在集群地址上开启了智能路由,会主动识别SQL的工作负载类型来做路由,比如将识别为AP负载的流量路由给只读实例。您可以通过explain cost指令查看SQL工作负载类型的识别情况。例如以下查询,该查询涉及到物理扫描行数rowcount很小,计算资源(CPU&Memory)也消耗比较少,所以这个查询被识别为TP负载。


mysql> explain cost  select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TopN(sort="cnt ASC", offset=?2, fetch=?3): rowcount = 1.0, cumulative cost = value = 2.8765038E7, cpu = 37.0, memory = 64.0, io = 3.0, net = 5.75, id = 163602178                                                                                                                 |
|   Filter(condition="cnt > ?1"): rowcount = 1.0, cumulative cost = value = 2.8765026E7, cpu = 26.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602177                                                                                                                            |
|     HashAgg(group="k", cnt="COUNT()"): rowcount = 1.0, cumulative cost = value = 2.8765025E7, cpu = 25.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602171                                                                                                                     |
|       BKAJoin(condition="k = id", type="inner"): rowcount = 1.0, cumulative cost = value = 2.8765012E7, cpu = 12.0, memory = 18.0, io = 3.0, net = 5.75, id = 163602169                                                                                                           |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 2.3755003E7, cpu = 3.0, memory = 0.0, io = 1.0, net = 4.75, id = 163602164                                                                                                                             |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)"): rowcount = 1.0, cumulative cost = value = 2.3755002E7, cpu = 2.0, memory = 0.0, io = 1.0, net = 4.75, id = 163601451         |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 5003.0, cpu = 3.0, memory = 0.0, io = 1.0, net = 0.0, id = 163602167                                                                                                                                   |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))"): rowcount = 1.0, cumulative cost = value = 5002.0, cpu = 2.0, memory = 0.0, io = 1.0, net = 0.0, id = 163601377 |                                                                                                                                                                                                                                                                    |
| WorkloadType: TP                                                                                                                                                                                                                                                                  |                                                                                                                                                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

工作负载的识别,对于HTAP路由至关重要。这里也允许您通过HINT WORKLOAD_TYPE指定工作负载。同样以上述查询为例,可以将查询负载强制指定为AP。


mysql> explain cost /*+TDDL:WORKLOAD_TYPE=AP*/ select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;                                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TopN(sort="cnt ASC", offset=?2, fetch=?3): rowcount = 1.0, cumulative cost = value = 2.8765038E7, cpu = 37.0, memory = 64.0, io = 3.0, net = 5.75, id = 163602178                                                                                                                 |
|   Filter(condition="cnt > ?1"): rowcount = 1.0, cumulative cost = value = 2.8765026E7, cpu = 26.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602177                                                                                                                            |
|     HashAgg(group="k", cnt="COUNT()"): rowcount = 1.0, cumulative cost = value = 2.8765025E7, cpu = 25.0, memory = 47.0, io = 3.0, net = 5.75, id = 163602171                                                                                                                     |
|       BKAJoin(condition="k = id", type="inner"): rowcount = 1.0, cumulative cost = value = 2.8765012E7, cpu = 12.0, memory = 18.0, io = 3.0, net = 5.75, id = 163602169                                                                                                           |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 2.3755003E7, cpu = 3.0, memory = 0.0, io = 1.0, net = 4.75, id = 163602164                                                                                                                             |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)"): rowcount = 1.0, cumulative cost = value = 2.3755002E7, cpu = 2.0, memory = 0.0, io = 1.0, net = 4.75, id = 163601451         |
|         Gather(concurrent=true): rowcount = 1.0, cumulative cost = value = 5003.0, cpu = 3.0, memory = 0.0, io = 1.0, net = 0.0, id = 163602167                                                                                                                                   |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))"): rowcount = 1.0, cumulative cost = value = 5002.0, cpu = 2.0, memory = 0.0, io = 1.0, net = 0.0, id = 163601377 |                                                                                                                                                                                                                                                                    |
| WorkloadType: AP                                                                                                                                                                                                                                                                  |                                                                                                                                                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

规则路由

除了基于代价的智能路由以外,我们也支持基于读写规则的路由。允许您在控制台参数管理上设置读写分离权重参数MASTER_READ_WEIGHT,默认值为100,可配置值区间[0, 100]。如果配置为Weight=60,意味着占60%的流量会继续在主实例执行,40%的剩余流量会路由到只读实例执行,如果只读实例有多个会进行自动分配。

智能路由和规则路由这两者关系是解耦的,具体关系请查看下表格。

智能路由规则 规则路由 (MASTER_READ_WEIGHT) 路由结果
开启 以代价的读写分离为主规则路由建议保持默认值为100
  • 事务和写操作流量,全部路由给主实例;
  • 识别为AP的查询流量,全部路由给只读实例;
  • 识别为TP的查询流量,按照(100-MASTER_READ_WEIGHT)路由给只读实例。
关闭 以规则的读写分离为主规则路由的可选范围:[0-100]
  • 事务和写操作流量,全部路由给主实例;
  • 识别为TP/AP的查询流量,一律按照(100-MASTER_READ_WEIGHT)路由给只读实例。

执行模式

目前PolarDB-X支持了三种执行模式:

  • 单机单线程(TP_LOCAL):查询过程中,是单线程计算,TP负载的查询涉及到的扫描行数比较少,往往会采样这种执行模式,比如基于主键的点查。
  • 单机并行(AP_LOCAL):查询过程中,会利用节点的多核资源做并行计算,您在没有购买只读实例的前提下,针对AP负载的查询,往往会采样这种执行模式,一般也称之为Parallel Query模式。
  • 多机并行(MPP):您若购买了只读实例,针对AP负载的查询,可以协调只读实例上多个节点的多核做分布式多机并行加速。

为了让您可以准确知道执行模式,我们在原有EXPLAIN指令上,扩展出了EXPLAIN PHYSICAL。例如以下查询,通过指令可以查看到当前查询采样的是MPP模式,此外还可以获取到每个执行片段的并发数。


mysql> explain physical select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 or
der by cnt limit 5, 10;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN                                                                                                                                                              |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ExecutorType: MPP                                                                                                                                                 |
| The Query's MaxConcurrentParallelism: 2                                                                                                                           |
| Fragment 1                                                                                                                                                        |
|     Shuffle Output layout: [BIGINT, BIGINT] Output layout: [BIGINT, BIGINT]                                                                                       |
|     Output partitioning: SINGLE [] Parallelism: 1                                                                                                                 |
|     TopN(sort="cnt ASC", offset=?2, fetch=?3)                                                                                                                     |
|   Filter(condition="cnt > ?1")                                                                                                                                    |
|     HashAgg(group="k", cnt="COUNT()")                                                                                                                             |
|       BKAJoin(condition="k = id", type="inner")                                                                                                                   |
|         RemoteSource(sourceFragmentIds=[0], type=RecordType(INTEGER_UNSIGNED id, INTEGER_UNSIGNED k))                                                             |
|         Gather(concurrent=true)                                                                                                                                   |
|           LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))") |
| Fragment 0                                                                                                                                                        |
|     Shuffle Output layout: [BIGINT, BIGINT] Output layout: [BIGINT, BIGINT]                                                                                       |
|     Output partitioning: SINGLE [] Parallelism: 1 Splits: 16                                                                                                      |
|     LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)")                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+

工作负载和执行模式有一定的耦合关系,AP工作负载会采用多机并行计算。同样的也允许您通过HINT EXECUTOR_MODE指定执行模式。假如主实例空闲资源很多,可以考虑强制设置为单机或者多机并行模式来加速。


mysql> explain physical /*+TDDL:EXECUTOR_MODE=AP_LOCAL*/select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;                                                                                                                                                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ExecutorMode: AP_LOCAL                                                                                                                                      |
| Fragment 0 dependency: [] parallelism: 4                                                                                                                    |
| BKAJoin(condition="k = id", type="inner")                                                                                                                   |
|   Gather(concurrent=true)                                                                                                                                   |
|     LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `id`, `k` FROM `sbtest1` AS `sbtest1` WHERE (`id` > ?)")               |
|   Gather(concurrent=true)                                                                                                                                   |
|     LogicalView(tables="[000000-000003].sbtest1_[00-15]", shardCount=16, sql="SELECT `k` FROM `sbtest1` AS `sbtest1` WHERE ((`k` > ?) AND (`k` IN (...)))") |
| Fragment 1 dependency: [] parallelism: 8                                                                                                                    |
| LocalBuffer                                                                                                                                                 |
|   RemoteSource(sourceFragmentIds=[0], type=RecordType(INTEGER_UNSIGNED id, INTEGER_UNSIGNED k, INTEGER_UNSIGNED k0))                                        |
| Fragment 2 dependency: [0, 1] parallelism: 8                                                                                                                |
| Filter(condition="cnt > ?1")                                                                                                                                |
|   HashAgg(group="k", cnt="COUNT()")                                                                                                                         |
|     RemoteSource(sourceFragmentIds=[1], type=RecordType(INTEGER_UNSIGNED id, INTEGER_UNSIGNED k, INTEGER_UNSIGNED k0))                                      |
| Fragment 3 dependency: [0, 1] parallelism: 1                                                                                                                |
| LocalBuffer                                                                                                                                                 |
|   RemoteSource(sourceFragmentIds=[2], type=RecordType(INTEGER_UNSIGNED k, BIGINT cnt))                                                                      |
| Fragment 4 dependency: [2, 3] parallelism: 1                                                                                                                |
| TopN(sort="cnt ASC", offset=?2, fetch=?3)                                                                                                                   |
|   RemoteSource(sourceFragmentIds=[3], type=RecordType(INTEGER_UNSIGNED k, BIGINT cnt))                                                                      |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+

在多机并行MPP执行模式的并发度是根据物理扫描行数、实例规格和计算所涉及到表的分表数来计算出来的,整体的并行度要考虑高并发场景,所以并行度的计算会偏保守,您可以通过上述EXPLAIN PHYSICAL指令查看并行度。当然也同样支持HINT MPP_PARALLELISM强制指定并行度,


/*+TDDL:EXECUTOR_MODE=MPP MPP_PARALLELISM=8*/select a.k, count(*) cnt from sbtest1 a, sbtest1 b where a.id = b.k and a.id > 1000 group by k having cnt > 1300 order by cnt limit 5, 10;

调度策略

假设您购买了多个只读实例并加入到集群地址中,您通过集群地址的查询SQL路由到只读实例的流量,会被均匀调度到只读实例多个节点上执行,调度会考虑各个节点的资源负载,确保各个节点的负载压力差不多。比如PolarDB-X会将只读实例延迟作为调度参考指标,避免将流量调度到延迟较大的只读实例上执行。

相关实践学习
快速体验PolarDB开源数据库
本实验环境已内置PostgreSQL数据库以及PolarDB开源数据库:PolarDB PostgreSQL版和PolarDB分布式版,支持一键拉起使用,方便各位开发者学习使用。
相关文章
|
5月前
|
测试技术
优化if-else的11种方案
优雅编码不仅提升程序效率,也增进代码可读性与维护性。通过早返回减少嵌套逻辑、运用三元运算符简化条件判断、采用`switch-case`优化多分支结构、实施策略模式灵活应对不同情境、利用查找表快速定位处理方式、封装函数明确职责划分、应用命令模式解耦操作与调用、引入状态模式管理复杂状态变化、重构条件表达式以增强清晰度、运用断言确保前提条件、及合理异常处理等十大技巧,使代码更加精炼与优雅。
98 4
优化if-else的11种方案
|
8月前
|
存储 监控 Cloud Native
如何通过持续测试和调整来提高OLAP系统的性能和可扩展性?
【5月更文挑战第14天】如何通过持续测试和调整来提高OLAP系统的性能和可扩展性?
80 2
|
5月前
|
存储 缓存 安全
MPP架构数据仓库使用问题之DADI相比其他方案,在资源使用上有什么优势
MPP架构数据仓库使用问题之DADI相比其他方案,在资源使用上有什么优势
|
6月前
|
存储 关系型数据库 数据库
优化数据库性能的关键技术与实践
数据库作为现代应用架构的核心组成部分,其性能优化直接关系到系统整体的稳定性和效率。本文探讨了提升数据库性能的关键技术和实际应用,涵盖了索引优化、查询优化、存储引擎选择以及硬件优化等方面,旨在帮助开发者有效提升应用程序的响应速度和吞吐量。
|
7月前
|
存储 NoSQL 关系型数据库
PolarDB产品使用问题之如何充分利用好产品的性能,提升并发处理能力
PolarDB产品使用合集涵盖了从创建与管理、数据管理、性能优化与诊断、安全与合规到生态与集成、运维与支持等全方位的功能和服务,旨在帮助企业轻松构建高可用、高性能且易于管理的数据库环境,满足不同业务场景的需求。用户可以通过阿里云控制台、API、SDK等方式便捷地使用这些功能,实现数据库的高效运维与持续优化。
|
6月前
|
固态存储 测试技术 数据库
最新测试揭秘:TDengine 线性扩展能力能否满足你的性能需求?
**摘要:** TDengine 的线性扩展能力通过CPU和磁盘测试得以验证。在CPU测试中,使用TDengine V3.3.0.0和taosBenchmark,随着CPU核数从4增至12,写入性能线性提升,每个CPU增加对应约50W条/秒的提升,保持CPU在瓶颈状态。磁盘IO测试中,使用低速机械盘,增加磁盘数量导致写入性能成比例增长,充分利用新增磁盘IO。测试结果表明,无论CPU还是磁盘扩展,TDengine都能有效利用资源,展示出出色的线性扩展能力。
74 0
|
8月前
|
存储 监控 数据挖掘
如何评估并优化OLAP系统的性能和可扩展性?
【5月更文挑战第14天】如何评估并优化OLAP系统的性能和可扩展性?
81 0
|
数据采集 算法 编译器
倚天710规模化应用 - 性能优化 -自动反馈优化分析与实践
编译器优化分成静态优化与动态优化,静态优化指传统编译器gcc/llvm时,增加的优化等级,如O1,O2,O3,Ofast,此时,编译器会依据编译优化等级增加一些优化算法,如函数inline、循环展开以及分支静态预测等等。一般情况下,优化等级越高,编译器做的优化越多,性能会更会好。在阿里生产环境中,单纯依赖于静态优化,并不能达到程序运行流畅目的,通过分析CPU硬件取指令、执行指令,往往会出现一些分支预测失败导致iCacheMiss率高的场景,限制了程序的性能进一步提升。基于此,业务引入了动态反馈优化工具,依据生产环境的实际运行数据,反哺指导编译器对程序代码进一步调整编译优化策略,提高分支预准确率
|
SQL 并行计算 Oracle
【笔记】最佳实践—混合负载HTAP的实践和优化
背景信息 本文主要提供数据库上云后OLTP+OLAP一体化架构的最佳实践,既HTAP。主要面对的业务应用范围: 混合型HTAP数据库需求:如ORACLE数据库改造上云,云上数据库方案选型; OLTP系统查询慢,存在分析型场景和瓶颈的客户; 读写分离需求。
271 0
【笔记】最佳实践—混合负载HTAP的实践和优化
|
SQL 并行计算 Oracle
最佳实践—混合负载HTAP的实践和优化
背景信息 本文主要提供数据库上云后OLTP+OLAP一体化架构的最佳实践,既HTAP。主要面对的业务应用范围: 混合型HTAP数据库需求:如ORACLE数据库改造上云,云上数据库方案选型; OLTP系统查询慢,存在分析型场景和瓶颈的客户; 读写分离需求。
667 0
最佳实践—混合负载HTAP的实践和优化