Apache Doris Profile&Explain详解
一、简述
Apache Doris中运行EXPLAIN + SQL就可以得到SQL对应的Query Plan,再结合Apche Doris的Profile可以了解Doris是如何处理SQL语句,用于分析查询语句或是结构的性能瓶颈,从而帮助选择更好的索引和写出更优化的查询语句。
二、Plan分析
2.1 sql准备
tpcds query96.sql为例
explain -- explain graph 生成对应执行计划图表 select count(*) from store_sales ,household_demographics ,time_dim , store where ss_sold_time_sk = time_dim.t_time_sk and ss_hdemo_sk = household_demographics.hd_demo_sk and ss_store_sk = s_store_sk and time_dim.t_hour = 8 and time_dim.t_minute >= 30 and household_demographics.hd_dep_count = 5 and store.s_store_name = 'ese' order by count(*) limit 100;
2.2 explain结果分析
Query Plan可以分为逻辑执行计划(Logical Query Plan)和物理执行计划(Physical Query Plan),当前讲述的Query Plan默认指逻辑执行计划;tpcds query96.sql对应的Query Plan展示如下。
-- graph ┌───────────────┐ │[8: ResultSink]│ │[Fragment: 4] │ │RESULT SINK │ └───────────────┘ │ │ ┌─────────────┐ │[8: TOP-N] │ │[Fragment: 4]│ └─────────────┘ │ │ ┌────────────────────────────────┐ │[13: AGGREGATE (merge finalize)]│ │[Fragment: 4] │ └────────────────────────────────┘ │ │ ┌──────────────┐ │[12: EXCHANGE]│ │[Fragment: 4] │ └──────────────┘ │ │ ┌────────────────────┐ │[12: DataStreamSink]│ │[Fragment: 0] │ │STREAM DATA SINK │ │ EXCHANGE ID: 12 │ │ UNPARTITIONED │ └────────────────────┘ │ │ ┌─────────────────────────────────┐ │[7: AGGREGATE (update serialize)]│ │[Fragment: 0] │ └─────────────────────────────────┘ │ │ ┌───────────────────────────────┐ │[6: HASH JOIN] │ │[Fragment: 0] │ │join op: INNER JOIN (BROADCAST)│ └───────────────────────────────┘ ┌───────────┴─────────────────────────────────────┐ │ │ ┌───────────────────────────────┐ ┌──────────────┐ │[4: HASH JOIN] │ │[11: EXCHANGE]│ │[Fragment: 0] │ │[Fragment: 0] │ │join op: INNER JOIN (BROADCAST)│ └──────────────┘ └───────────────────────────────┘ │ ┌───────────────┴─────────────────────┐ │ │ │ ┌────────────────────┐ ┌───────────────────────────────┐ ┌──────────────┐ │[11: DataStreamSink]│ │[2: HASH JOIN] │ │[10: EXCHANGE]│ │[Fragment: 3] │ │[Fragment: 0] │ │[Fragment: 0] │ │STREAM DATA SINK │ │join op: INNER JOIN (BROADCAST)│ └──────────────┘ │ EXCHANGE ID: 11 │ └───────────────────────────────┘ │ │ UNPARTITIONED │ ┌─────────┴──────────┐ │ └────────────────────┘ │ │ ┌────────────────────┐ ┌┘ ┌──────────────────┐ ┌─────────────┐ │[10: DataStreamSink]│ │ │[0: OlapScanNode] │ │[9: EXCHANGE]│ │[Fragment: 2] │ ┌─────────────────┐ │[Fragment: 0] │ │[Fragment: 0]│ │STREAM DATA SINK │ │[5: OlapScanNode]│ │TABLE: store_sales│ └─────────────┘ │ EXCHANGE ID: 10 │ │[Fragment: 3] │ └──────────────────┘ │ │ UNPARTITIONED │ │TABLE: store │ │ └────────────────────┘ └─────────────────┘ ┌───────────────────┐ │ │[9: DataStreamSink]│ │ │[Fragment: 1] │ ┌─────────────────────────────┐ │STREAM DATA SINK │ │[3: OlapScanNode] │ │ EXCHANGE ID: 09 │ │[Fragment: 2] │ │ UNPARTITIONED │ │TABLE: household_demographics│ └───────────────────┘ └─────────────────────────────┘ │ │ ┌─────────────────┐ │[1: OlapScanNode]│ │[Fragment: 1] │ │TABLE: time_dim │ └─────────────────┘
-- 非graph PLAN FRAGMENT 0 OUTPUT EXPRS:<slot 11> <slot 10> count(*) PARTITION: UNPARTITIONED RESULT SINK 8:TOP-N | order by: <slot 11> <slot 10> count(*) ASC | offset: 0 | limit: 100 | 13:AGGREGATE (merge finalize) | output: count(<slot 10> count(*)) | group by: | cardinality=-1 | 12:EXCHANGE PLAN FRAGMENT 1 OUTPUT EXPRS: PARTITION: HASH_PARTITIONED: `default_cluster:tpcds`.`store_sales`.`ss_item_sk`, `default_cluster:tpcds`.`store_sales`.`ss_ticket_number` STREAM DATA SINK EXCHANGE ID: 12 UNPARTITIONED 7:AGGREGATE (update serialize) | output: count(*) | group by: | cardinality=1 | 6:HASH JOIN | join op: INNER JOIN (BROADCAST) | hash predicates: | colocate: false, reason: Tables are not in the same group | equal join conjunct: `ss_store_sk` = `s_store_sk` | runtime filters: RF000[in] <- `s_store_sk` | cardinality=2880403 | |----11:EXCHANGE | 4:HASH JOIN | join op: INNER JOIN (BROADCAST) | hash predicates: | colocate: false, reason: Tables are not in the same group | equal join conjunct: `ss_hdemo_sk` = `household_demographics`.`hd_demo_sk` | runtime filters: RF001[in] <- `household_demographics`.`hd_demo_sk` | cardinality=2880403 | |----10:EXCHANGE | 2:HASH JOIN | join op: INNER JOIN (BROADCAST) | hash predicates: | colocate: false, reason: Tables are not in the same group | equal join conjunct: `ss_sold_time_sk` = `time_dim`.`t_time_sk` | runtime filters: RF002[in] <- `time_dim`.`t_time_sk` | cardinality=2880403 | |----9:EXCHANGE | 0:OlapScanNode TABLE: store_sales PREAGGREGATION: OFF. Reason: conjunct on `ss_sold_time_sk` which is StorageEngine value column PREDICATES: `default_cluster:tpcds.store_sales`.`__DORIS_DELETE_SIGN__` = 0 runtime filters: RF000[in] -> `ss_store_sk`, RF001[in] -> `ss_hdemo_sk`, RF002[in] -> `ss_sold_time_sk` partitions=1/1 rollup: store_sales tabletRatio=3/3 tabletList=20968,20972,20976 cardinality=2880403 avgRowSize=67.95811 numNodes=3 PLAN FRAGMENT 2 OUTPUT EXPRS: PARTITION: HASH_PARTITIONED: `default_cluster:tpcds`.`store`.`s_store_sk` STREAM DATA SINK EXCHANGE ID: 11 UNPARTITIONED 5:OlapScanNode TABLE: store PREAGGREGATION: OFF. Reason: null PREDICATES: `store`.`s_store_name` = 'ese', `default_cluster:tpcds.store`.`__DORIS_DELETE_SIGN__` = 0 partitions=1/1 rollup: store tabletRatio=3/3 tabletList=20773,20777,20781 cardinality=23 avgRowSize=1798.8695 numNodes=3 PLAN FRAGMENT 3 OUTPUT EXPRS: PARTITION: HASH_PARTITIONED: `default_cluster:tpcds`.`household_demographics`.`hd_demo_sk` STREAM DATA SINK EXCHANGE ID: 10 UNPARTITIONED 3:OlapScanNode TABLE: household_demographics PREAGGREGATION: OFF. Reason: null PREDICATES: `household_demographics`.`hd_dep_count` = 5, `default_cluster:tpcds.household_demographics`.`__DORIS_DELETE_SIGN__` = 0 partitions=1/1 rollup: household_demographics tabletRatio=3/3 tabletList=20848,20852,20856 cardinality=14399 avgRowSize=2.8781166 numNodes=3 PLAN FRAGMENT 4 OUTPUT EXPRS: PARTITION: HASH_PARTITIONED: `default_cluster:tpcds`.`time_dim`.`t_time_sk` STREAM DATA SINK EXCHANGE ID: 09 UNPARTITIONED 1:OlapScanNode TABLE: time_dim PREAGGREGATION: OFF. Reason: null PREDICATES: `time_dim`.`t_hour` = 8, `time_dim`.`t_minute` >= 30, `default_cluster:tpcds.time_dim`.`__DORIS_DELETE_SIGN__` = 0 partitions=1/1 rollup: time_dim tabletRatio=3/3 tabletList=20713,20717,20721 cardinality=172799 avgRowSize=11.671202 numNodes=3