
[Query optimization] PG partitioned table, index usage in date-range queries: now() vs. date string

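PG version 12.5. I created time-based (monthly) partitions using INHERITS and built an index on the time column of every child table; there are currently 12 months of child tables. Requirement and problem: I want to fetch today's data through a view, but a condition that uses now() scans the index of every child table, whereas a date string only scans the table and index for the matching date range. How should I change the query to get that behavior? Thanks.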

Query

SELECT *
   FROM table w
  WHERE ((w."time" > ((now())::date + '00:00:00'::time without time zone)) AND (w."time" < ((now())::date + '23:59:59'::time without time zone)))

Execution plan

Append  (cost=0.00..114.54 rows=1123 width=24)
  ->  Seq Scan on table w  (cost=0.00..0.00 rows=1 width=24)
        Filter: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202102_index_time on table_202102 w_1  (cost=0.29..2.51 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202103_index_time on table_202103 w_2  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202009_index_time on table_202009 w_3  (cost=0.16..2.38 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202010_index_time on table_202010 w_4  (cost=0.29..2.51 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202011_index_time on table_202011 w_5  (cost=0.17..2.39 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202012_index_time on table_202012 w_6  (cost=0.29..2.51 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202101_index_time on table_202101 w_7  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202104_index_time on table_202104 w_8  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202105_index_time on table_202105 w_9  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202106_index_time on table_202106 w_10  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202107_index_time on table_202107 w_11  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202108_index_time on table_202108 w_12  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202109_index_time on table_202109 w_13  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202110_index_time on table_202110 w_14  (cost=0.30..2.52 rows=1 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))
  ->  Index Scan using table_202111_index_time on table_202111 w_15  (cost=0.30..73.91 rows=1108 width=24)
        Index Cond: (("time" > ((now())::date + '00:00:00'::time without time zone)) AND ("time" < ((now())::date + '23:59:59'::time without time zone)))

Query

explain SELECT *
   FROM table w
  WHERE w.time>= '2021-11-17';

Execution plan

Append  (cost=0.00..74.46 rows=1154 width=24)
  ->  Seq Scan on table w  (cost=0.00..0.00 rows=1 width=24)
        Filter: ("time" >= '2021-11-17 00:00:00'::timestamp without time zone)
  ->  Index Scan using table_202111_index_time on table_202111 w_1  (cost=0.29..68.69 rows=1153 width=24)
        Index Cond: ("time" >= '2021-11-17 00:00:00'::timestamp without time zone)

After changing the query to the following, the index of every child table is still scanned:

explain ANALYZE  SELECT *
   FROM table w
  WHERE w.time >= CURRENT_DATE;

How can I optimize the indexes and the query? Thanks!

Posted by 1584944066300731 on 2021-11-17 12:59:55
Answer from the Alibaba Cloud AI assistant

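You want to keep the query from scanning the index of every child table. With INHERITS-based partitioning, the planner removes child tables through constraint exclusion, which only works when the WHERE clause compares the column against constants (or externally supplied parameters) known at plan time. now() and CURRENT_DATE are STABLE rather than IMMUTABLE, so the planner does not evaluate them while planning, cannot tell which child table's CHECK constraint the value falls under, and therefore plans a scan of every child table.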
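As a quick way to see the volatility labels involved, the system catalog can be queried directly. This is only an illustrative sketch; it reads pg_proc and does not touch the tables from the question.

```sql
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile.
-- Only immutable expressions are folded into constants at plan time.
SELECT p.oid::regprocedure AS function_signature,
       p.provolatile
FROM pg_proc AS p
WHERE p.proname IN ('now', 'date_trunc')
ORDER BY p.proname, p.oid;
```

now() is labeled stable, which is why a condition built on it is not treated as a plan-time constant.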

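One commonly suggested approach is to create a function that returns the start of the current day (midnight today) and use it in the query. Be aware that a STABLE function is still not evaluated at plan time, so on its own it is unlikely to enable constraint exclusion for INHERITS-based tables; check the result with EXPLAIN. An example: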

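  1. Create the function: define a function that returns the timestamp at the start of the current day.
CREATE OR REPLACE FUNCTION current_day_start() RETURNS timestamp without time zone AS $$
  -- date_trunc on now() yields timestamptz; cast explicitly to match the declared return type
  SELECT date_trunc('day', now())::timestamp without time zone;
$$ LANGUAGE SQL STABLE;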

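STABLE means that the function returns the same result for the same arguments for all rows within a single statement. The planner can use such a value in an index scan condition, but it does not fold it into a constant at plan time; only IMMUTABLE functions are pre-evaluated, and declaring this function IMMUTABLE would be incorrect because its result changes from day to day.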

  2. Modify the query: use the new function in place of now() or CURRENT_DATE.
EXPLAIN ANALYZE
SELECT *
FROM table w
WHERE w.time >= current_day_start();

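If constraint exclusion takes effect, the plan contains only the parent table and the child table that holds the current date, as in the date-string plan above. With INHERITS-based partitioning on PG 12 this requires the comparison value to be a plan-time constant. If EXPLAIN still lists every child table, the usual alternatives are to supply the date as a literal (from the application, or through dynamic SQL in a PL/pgSQL function) or to migrate to declarative partitioning (PARTITION BY RANGE), which since PG 11 can also prune partitions at execution time for stable expressions such as now().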
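One way to get a literal date into the query without involving the application is dynamic SQL. The following is a minimal sketch, not a tested solution for this schema: the parent table name readings and the function name readings_today are hypothetical stand-ins for the anonymized names in the question, and it assumes each child table carries a CHECK constraint on "time", as the date-string plan suggests.

```sql
-- Hypothetical names: "readings" is the parent table, "readings_today" the helper.
CREATE OR REPLACE FUNCTION readings_today()
RETURNS SETOF readings
LANGUAGE plpgsql
STABLE
AS $$
BEGIN
  -- format() with %L embeds today's and tomorrow's dates as quoted literals,
  -- so the inner query is planned with constants and constraint exclusion
  -- can discard child tables whose CHECK constraints do not match.
  RETURN QUERY EXECUTE format(
    'SELECT * FROM readings WHERE "time" >= %L::timestamp AND "time" < %L::timestamp',
    CURRENT_DATE,
    CURRENT_DATE + 1
  );
END;
$$;

-- Usage: the function can stand in for the view.
SELECT * FROM readings_today();
```

The half-open range (>= today, < tomorrow) covers the whole day, including rows stamped exactly at midnight or in the last second. Because the inner statement is planned on every call, the literal is always the current date.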

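Note that, for good query performance, you should also make sure the index on the partition key is appropriate and analyze the tables regularly to keep statistics up to date, either by running the ANALYZE command or by configuring autovacuum and autoanalyze.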
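These checks might look like the sketch below. The names readings and readings_202111 are hypothetical placeholders for the parent table and one child table; the catalog queries themselves use only standard system catalogs. Checking the CHECK constraints is included because constraint exclusion depends on them.

```sql
-- Constraint exclusion must be enabled; 'partition' is the default setting.
SHOW constraint_exclusion;

-- List the CHECK constraints on each child of the (hypothetical) parent table.
SELECT c.conrelid::regclass AS child_table,
       pg_get_constraintdef(c.oid) AS check_definition
FROM pg_constraint AS c
JOIN pg_inherits AS i ON i.inhrelid = c.conrelid
WHERE i.inhparent = 'readings'::regclass
  AND c.contype = 'c'
ORDER BY c.conrelid::regclass::text;

-- Refresh planner statistics for one child table (placeholder name).
ANALYZE readings_202111;
```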

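In addition, if you use Alibaba Cloud RDS for PostgreSQL, its management features, such as automatic maintenance, monitoring, and alerting, can help you manage and tune database performance.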
