标签
PostgreSQL , CTE , materialized , not materialized , push down
背景
PostgreSQL with 语法,能跑非常复杂的SQL逻辑,包括递归,多语句物化计算等。
在12以前的版本中,WITH中的每一个CTE(common table express),都是直接进行物化的,也就是说外层的条件不会推到CTE(物化节点)里面去。
这么做对于insert,update,delete的CTE以及递归WITH语句,都是稀疏平常的。但是对于select CTE,外面的条件推到CTE里面,可能能够大幅降低扫描。
因此PG 12开始,提供了用户选择
with NOT MATERIALIZED (不使用物化,允许外面条件推进去)
with MATERIALIZED (使用物化)
Allow user control of CTE materialization, and change the default behavior.
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it). This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.
Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query. Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.
Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free. By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED. Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.
Andreas Karlsson, Andrew Gierth, David Fetter
Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
例子
在CTE中使用NOT MATERIALIZED,表示这个CTE不使用物化,外面的条件可以推到CTE中。
In particular, if there's an index on key, it will probably be used to fetch just the rows having key = 123. On the other hand, in
WITH w AS (
SELECT * FROM big_table
)
SELECT * FROM w AS w1 JOIN w AS w2 ON w1.key = w2.ref
WHERE w2.key = 123;
the WITH query will be materialized, producing a temporary copy of big_table that is then joined with itself — without benefit of any index.
This query will be executed much more efficiently if written as:
WITH w AS NOT MATERIALIZED (
SELECT * FROM big_table
)
SELECT * FROM w AS w1 JOIN w AS w2 ON w1.key = w2.ref
WHERE w2.key = 123;
参考
https://www.postgresql.org/docs/devel/queries-with.html