Oracle 并行计算 JOIN HINT

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介:

背景

Oracle的并行查询在处理JOIN时,可以通过调整优化器的HINT,指定优化器在处理JOIN时表的分布与关联策略。

例如,两张表JOIN时,如果是HASH或者MERGE JOIN,可以使用HASH分片,然后每个并行的计算单元处理一个数据分片,达到提高效率的目的。

又或者可以将某个较小的表进行broadcast,然后另一张表使用随机的分片策略,每个计算单元可以独立的运算。

这些算法与MPP如Greenplum的JOIN策略比较类似。

PostgreSQL 9.6也支持CPU的并行计算了,学习一下Oracle多年的并行计算是非常有必要的。

Oracle 并行计算 JOIN HINT

https://docs.oracle.com/cd/B12037_01/server.101/b10752/hintsref.htm

PQ_DISTRIBUTE

The PQ_DISTRIBUTE hint improves the performance of parallel join operations.

Do this by specifying how rows of joined tables should be distributed among producer and consumer query servers.

Using this hint overrides decisions the optimizer would normally make.

Use the EXPLAIN PLAN statement to identify the distribution chosen by the optimizer.

The optimizer ignores the distribution hint, if both tables are serial.

pq_distribute_hint::=

/*+ PQ_DISTRIBUTE ( [@queryblock] tablespec outer_distribution inner_distribution ) */  

where:

  • outer_distribution is the distribution for the outer table.
  • inner_distribution is the distribution for the inner table.

For a description of the queryblock syntax, see "Specifying a Query Block in a Hint".

For a description of the tablespec syntax, see "Specifying Global Table Hints".

There are six combinations for table distribution.

Only a subset of distribution method combinations for the joined tables is valid, as explained in Table 17-1.

Table 17-1 Distribution Hint Combinations

Distribution | Interpretation
-- | --
Hash, Hash | Maps the rows of each table to consumer query servers, using a hash function on the join keys. When mapping is complete, each query server performs the join between a pair of resulting partitions. This hint is recommended when the tables are comparable in size and the join operation is implemented by hash-join or sort merge join.
Broadcast, None | All rows of the outer table are broadcast to each query server. The inner table rows are randomly partitioned. This hint is recommended when the outer table is very small compared to the inner table. As a general rule, use the Broadcast/None hint when inner table size * number of query servers > outer table size.
None, Broadcast | All rows of the inner table are broadcast to each consumer query server. The outer table rows are randomly partitioned. This hint is recommended when the inner table is very small compared to the outer table. As a general rule, use the None/Broadcast hint when inner table size * number of query servers < outer table size.
Partition, None | Maps the rows of the outer table, using the partitioning of the inner table. The inner table must be partitioned on the join keys(当内表的JOIN字段正好是它的partition 字段,并且分区个数正好等于或接近并行度时建议使用。外表将依据内表的分区mapping rows。). This hint is recommended when the number of partitions of the outer table is equal to or nearly equal to a multiple of the number of query servers; for example, 14 partitions and 15 query servers. Note: The optimizer ignores this hint if the inner table is not partitioned or not equijoined on the partitioning key.
None, Partition | Maps the rows of the inner table using the partitioning of the outer table. The outer table must be partitioned on the join keys. This hint is recommended when the number of partitions of the outer table is equal to or nearly equal to a multiple of the number of query servers; for example, 14 partitions and 15 query servers. Note: The optimizer ignores this hint if the outer table is not partitioned or not equijoined on the partitioning key.
None, None | Each query server performs the join operation between a pair of matching partitions, one from each table. Both tables must be equipartitioned on the join keys(当内表和外表都在JOIN字段上进行了分区时,则内表和外表直接在匹配的分区上进行JOIN).

For example: Given two tables, r and s, that are joined using a hash-join, the following query contains a hint to use hash distribution:

SELECT /*+ORDERED PQ_DISTRIBUTE(s HASH, HASH) USE_HASH (s)*/ column_list  
  FROM r,s  
  WHERE r.c=s.c;  

To broadcast the outer table r, the query is:

SELECT /*+ORDERED PQ_DISTRIBUTE(s BROADCAST, NONE) USE_HASH (s) */ column_list  
  FROM r,s  
  WHERE r.c=s.c;  

Count

相关文章
|
1月前
|
SQL Oracle 关系型数据库
避坑,Oracle中left join 与 (+) 的区别
避坑,Oracle中left join 与 (+) 的区别
|
8月前
|
SQL Oracle 关系型数据库
解决:Oracle数据库中Left join on 后面为null时匹配不上
解决:Oracle数据库中Left join on 后面为null时匹配不上
117 0
|
3月前
|
SQL Oracle 关系型数据库
Oracle查询优化-left join、right join、inner join、full join和逗号的区别
【1月更文挑战第5天】【1月更文挑战第13篇】实际查询时,多表联查是常规操作,但是连接方式有多种。
70 0
|
8月前
|
Oracle 关系型数据库 数据库
一篇文章带你了解Oracle 数据库中 CROSS JOIN(cross join) 语法的作用
一篇文章带你了解Oracle 数据库中 CROSS JOIN(cross join) 语法的作用
302 0
|
SQL Oracle 关系型数据库
oracle left join ... on 后跟条件不生效
oracle left join ... on 后跟条件,结果集过滤不生效
1013 0
|
Oracle 关系型数据库 Linux
Oracle 左连接(left join) 排序问题
项目环境:linux、tomcat8.5、SSM框架、oracle11g 项目中一个列表查询,使用了左连接(left join),类似这样: select * from A left join B on A.
2898 0
|
SQL Oracle 关系型数据库
Oracle中表连接方式(Nested Loop、Hash join)对于表访问次数的测试
介绍了sql多表连接的几种方式,如有不正确的地方请指正。
4252 0
|
Oracle 关系型数据库 PostgreSQL

相关实验场景

更多