PostgreSQL 空间类型统计信息(Statistics, Cardinality, Selectivity, Estimate)不准确导致SQL执行计划不准(包含、相交查询)的优化实践-阿里云开发者社区

create extension postgis;  
create table test (id int , info text, pos geometry);  
insert into test select generate_series(1,10000000), 'test', st_setsrid(st_makepoint(110+random()*20-10, 70+random()*10-5), 4326);  
create index idx_test_1 on test using gist(pos);

2、收集统计信息

vacuum analyze test;

你会发现空间类型的统计信息非常有限(没有高频词，柱状图，相关性等内容)：

postgres=# select * from pg_stats where tablename='test' and attname='pos';  
 schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation | most_common_elems | most_common_elem_freqs | elem_count_histogram   
------------+-----------+---------+-----------+-----------+-----------+------------+------------------+-------------------+------------------+-------------+-------------------+------------------------+----------------------  
 public     | test      | pos     | f         |         0 |        32 |         -1 |                  |                   |                  |             |                   |                        |   
(1 row)

3、查看执行计划，查看评估出来的记录数。

空间类型选择性使用如下硬编码0.001。

src/backend/utils/adt/geo_selfuncs.c

/*
 *      contsel -- How likely is a box to contain (be contained by) a given box?
 *
 * This is a tighter constraint than "overlap", so produce a smaller
 * estimate than areasel does.
 */

Datum
contsel(PG_FUNCTION_ARGS)
{
        PG_RETURN_FLOAT8(0.001);
}

不管输入什么空间条件，评估得到的记录数都是千分之一：10000条

条件1：
postgres=# explain select * from test where st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326) , pos);  
                                                  QUERY PLAN                                                                                                                    
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on test  (cost=149.95..12834.93 rows=3333 width=41)  
   Recheck Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
   Filter: _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos)  
   ->  Bitmap Index Scan on idx_test_1  (cost=0.00..149.12 rows=10000 width=0)  
         Index Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
(5 rows)  
  
条件2：
postgres=# explain select * from test where st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326) , pos);  
                                                 QUERY PLAN                                                                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on test  (cost=149.95..12834.93 rows=3333 width=41)  
   Recheck Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
   Filter: _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos)  
   ->  Bitmap Index Scan on idx_test_1  (cost=0.00..149.12 rows=10000 width=0)  
         Index Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
(5 rows)

4、那么实际上的记录数是多少呢？

条件1：0条。

postgres=# explain analyze select count(*) from test where st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326) , pos);  
                               QUERY PLAN    
-------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12843.26..12843.27 rows=1 width=8) (actual time=0.015..0.015 rows=1 loops=1)  
   ->  Bitmap Heap Scan on test  (cost=149.95..12834.93 rows=3333 width=0) (actual time=0.012..0.012 rows=0 loops=1)  
         Recheck Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
         Filter: _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos)  
         ->  Bitmap Index Scan on idx_test_1  (cost=0.00..149.12 rows=10000 width=0) (actual time=0.010..0.010 rows=0 loops=1)  
               Index Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry ~   
pos)  
 Planning time: 0.170 ms  
 Execution time: 0.061 ms  
(8 rows)

条件2：5001107条。

postgres=# explain analyze select count(*) from test where st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326) , pos);  
                                QUERY PLAN    
----------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=12843.26..12843.27 rows=1 width=8) (actual time=5057.822..5057.822 rows=1 loops=1)  
   ->  Bitmap Heap Scan on test  (cost=149.95..12834.93 rows=3333 width=0) (actual time=507.744..4537.564 rows=5001107 loops=1)  
         Recheck Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
         Rows Removed by Index Recheck: 1981986  
         Filter: _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry,  
 pos)  
         Heap Blocks: exact=50167 lossy=33167  
         ->  Bitmap Index Scan on idx_test_1  (cost=0.00..149.12 rows=10000 width=0) (actual time=499.627..499.627 rows=5001107 loops=1)  
               Index Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry ~   
pos)  
 Planning time: 0.179 ms  
 Execution time: 5057.882 ms  
(10 rows)

5、空间类型的选择性不准确，导致的问题，不管什么情况都走空间索引。

5.1 大范围搜索，使用索引反而更慢，如下。

postgres=# explain analyze select count(*) from test where st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326) , pos);  
                                        QUERY PLAN  
-------------------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=10539.13..10539.14 rows=1 width=8) (actual time=1079.698..1079.699 rows=1 loops=1)  
   ->  Gather  (cost=10539.10..10539.11 rows=8 width=8) (actual time=1079.655..1079.693 rows=9 loops=1)  
         Workers Planned: 8  
         Workers Launched: 8  
         ->  Partial Aggregate  (cost=10539.10..10539.11 rows=1 width=8) (actual time=1049.984..1049.984 rows=1 loops=9)  
               ->  Parallel Bitmap Heap Scan on test  (cost=149.95..10538.06 rows=417 width=0) (actual time=483.733..991.774 rows=555679 loops=9)  
                     Recheck Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geo  metry ~ pos)  
                     Rows Removed by Index Recheck: 220221  
                     Filter: _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos)  
                     Heap Blocks: exact=5908 lossy=3936  
                     ->  Bitmap Index Scan on idx_test_1  (cost=0.00..149.12 rows=10000 width=0) (actual time=502.790..502.790 rows=5001107 loops=1)  
                           Index Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
 Planning time: 0.171 ms  
 Execution time: 1087.890 ms  
(14 rows)  
  
  
postgres=# explain analyze select count(*) from test where st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326) , pos);  
                                  QUERY PLAN           
-------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=411456.73..411456.74 rows=1 width=8) (actual time=674.853..674.853 rows=1 loops=1)  
   ->  Gather  (cost=411456.70..411456.71 rows=8 width=8) (actual time=674.793..674.848 rows=9 loops=1)  
         Workers Planned: 8  
         Workers Launched: 8  
         ->  Partial Aggregate  (cost=411456.70..411456.71 rows=1 width=8) (actual time=644.627..644.627 rows=1 loops=9)  
               ->  Parallel Seq Scan on test  (cost=0.00..411455.65 rows=417 width=0) (actual time=0.045..586.545 rows=555679 loops=9)  
                     Filter: (('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry   ~ pos) AND _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos)  )  
                     Rows Removed by Filter: 555433  
 Planning time: 0.130 ms  
 Execution time: 683.011 ms  
(10 rows)

优化方法

首先我们知道空间包含，实际上是通过类似r-tree来从大到小进行收敛的，索引结构如下：

《PostgreSQL 黑科技 - 空间聚集存储, 内窥GIN, GiST, SP-GiST索引》

《通过空间思想理解GiST索引的构造》

因此，对于“平面或3D空间点”类型，我们可以利用x,y,z来收集BOUND BOX的边界统计信息，使用表达式索引即可得到。

下面的方法适合POINT geometry类型。

1、对point类型，创建x,y表达式索引

create index idx_test_2 on test (st_x(pos));  
create index idx_test_3 on test (st_y(pos));

2、收集统计信息

vacuum analyze test;

3、现在，表达式索引的统计信息有了（边界，柱状图，高频词等）.

postgres=# select * from pg_stats where tablename='idx_test_2' ;  
 schemaname | tablename  | attname | inherited | null_frac | avg_width | n_distinct |  most_common_vals  | most_common_freqs |                                                                                                                 
                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                               
                        histogram_bounds                                                                                                                                                                                                       
                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                               
                                                                                                                                                                               | correlation | most_common_elems | most_common_elem_freqs | e  
lem_count_histogram   
------------+------------+---------+-----------+-----------+-----------+------------+--------------------+-------------------+---------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-------------------+------------------------+--  
--------------------  
 public     | idx_test_2 | st_x    | f         |         0 |         8 |  -0.978292 | {100.210125753656} | {6.66667e-05}     | {100.001683048904,100.198292508721,100.394482128322,100.614011939615,100.833150474355,101.039264379069,101.233  
114106581,101.457025809214,101.663421457633,101.865276163444,102.060301089659,102.248072810471,102.453578868881,102.656150059775,102.860928429291,103.057968365029,103.260449869558,103.457426037639,103.632223745808,103.85178135708,104.049  
411769956,104.26340050064,104.474113518372,104.658453594893,104.85337683931,105.054327072576,105.277718435973,105.484521845356,105.703673372045,105.907790735364,106.115896645933,106.314396839589,106.519666947424,106.724081514403,106.9380  
68913296,107.144253859296,107.326071513817,107.518919138238,107.721077715978,107.916374253109,108.123016441241,108.360300129279,108.551258808002,108.714984534308,108.903268156573,109.091228945181,109.277053307742,109.48098176159,109.6832  
09387586,109.853478074074,110.044501451775,110.241627292708,110.45505293645,110.652585113421,110.851763477549,111.036027139053,111.231312695891,111.423934968188,111.620921371505,111.847442938015,112.031054906547,112.234003236517,112.4332  
3944509,112.63793184422,112.834744984284,113.041293732822,113.23037719354,113.432592023164,113.619993180037,113.809020379558,114.014157289639,114.208164261654,114.409972019494,114.612494371831,114.827286116779,115.010247323662,115.208249  
371499,115.397510435432,115.607831152156,115.814645215869,116.007154844701,116.198594048619,116.39090036042,116.600214680657,116.819634130225,117.008039858192,117.199446037412,117.389995325357,117.60263632983,117.81475706026,118.02150100  
4696,118.219726895913,118.425922412425,118.623819025233,118.801201758906,118.997462140396,119.18357164599,119.394188923761,119.607428628951,119.789487998933,119.999485854059} |   0.0060246 |                   |                        |   
(1 row)  
  
postgres=# select * from pg_stats where tablename='idx_test_3' ;  
 schemaname | tablename  | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs |                                                                                                                   
                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                               
                   histogram_bounds                                                                                                                                                                                                            
                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                               
                                                                                                                                                                        | correlation | most_common_elems | most_common_elem_freqs | elem_cou  
nt_histogram   
------------+------------+---------+-----------+-----------+-----------+------------+------------------+-------------------+-----------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+-------------------+------------------------+---------  
-------------  
 public     | idx_test_3 | st_y    | f         |         0 |         8 |         -1 |                  |                   | {65.0000545242801,65.0990079157054,65.2072919439524,65.301208673045,65.3929488640279,65.4877291806042,65.5846159  
020439,65.6880339607596,65.7856895681471,65.8818093780428,65.9888388821855,66.0896951518953,66.1984446132556,66.304198782891,66.3969474285841,66.496685273014,66.6056828852743,66.7012783605605,66.7941406555474,66.8967378046364,66.98895718  
90235,67.0855156239122,67.191197341308,67.2994636883959,67.3969272850081,67.4947946099564,67.5993846775964,67.7099830284715,67.8120812214911,67.9088703868911,67.9981751134619,68.099527056329,68.1973828841001,68.2960116351023,68.398535256  
274,68.4888928988948,68.5882083000615,68.6907122470438,68.800939405337,68.9010692993179,69.0081828692928,69.11032628268,69.19522870332,69.2957087606192,69.3950105085969,69.5086302934214,69.6040614042431,69.7045981185511,69.8040427314118,  
69.906362099573,70.0155263161287,70.1146272942424,70.2022811910138,70.2994426619262,70.4040612280369,70.5112449126318,70.6194959674031,70.722281485796,70.822925157845,70.918908463791,71.0138623649254,71.1134587181732,71.2168908445165,71.  
3034919975325,71.4052765490487,71.5016098646447,71.6014923620969,71.6971560986713,71.8042277451605,71.8970008520409,71.9877171749249,72.0928315026686,72.1963384188712,72.2967635607347,72.3947894759476,72.4977402808145,72.5967703945935,72  
.6985206454992,72.7883456554264,72.885032473132,72.9875696543604,73.0857636081055,73.1845043040812,73.280454329215,73.3783990284428,73.4819683339447,73.5874625109136,73.6941803013906,73.7924668611959,73.8872256595641,73.9888873603195,74.  
0806347271428,74.1819418873638,74.2877806117758,74.3848745618016,74.4918680656701,74.5988712925464,74.7014277381822,74.8038458405063,74.9038722459227,74.9999207118526} | -0.00307583 |                   |                        |   
(1 row)

现在我们改一下SQL，把boundbox放进SQL

select count(*) from test   
where   
st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326) , pos);

改成:

select count(*) from test   
where   
st_x(pos)   
  between  
    st_xmin(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
  and  
    st_xmax(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
and  
st_y(pos)   
  between  
    st_ymin(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
  and  
    st_ymax(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
and  
st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326) , pos);

执行计划如下：

在大范围输入时，通过表达式索引条件评估得到的行数比空间编码大于空间评估（硬编码千分之一）得到的记录数，所以依旧使用了空间索引。

postgres=# explain select count(*) from test   
where   
st_x(pos)   
  between  
    st_xmin(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
  and  
    st_xmax(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
and  
st_y(pos)   
  between  
    st_ymin(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
  and  
    st_ymax(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326)))  
and  
st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(120,100)), 4326) , pos);  
                       QUERY PLAN                                                                                                                                        
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12380.60..12380.61 rows=1 width=8)  
   ->  Index Scan using idx_test_1 on test  (cost=0.42..12376.42 rows=1674 width=0)  
         Index Cond: ('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos)  
         Filter: ((st_x(pos) >= '110'::double precision) AND (st_x(pos) <= '120'::double precision) AND (st_y(pos) >= '1'::double precision) AND (st_y(pos) <= '100'::double precision) AND _st_contains('0103000020E610000001000000050000000  000000000805B40000000000000F03F0000000000805B4000000000000059400000000000005E4000000000000059400000000000005E40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos))  
(4 rows)

或

在小范围输入时，评估得到的行数比空间编码更少(千分之一)，所以使用了非空间索引。

explain select count(*) from test   
where   
st_x(pos)   
  between  
    st_xmin(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326)))  
  and  
    st_xmax(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326)))  
and  
st_y(pos)   
  between  
    st_ymin(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326)))  
  and  
    st_ymax(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326)))  
and  
st_contains(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326) , pos);  
  
   QUERY PLAN                                
----------------------------------------------------------------------------------------------
 Aggregate  (cost=2.92..2.93 rows=1 width=8)  
   ->  Index Scan using idx_test_3 on test  (cost=0.43..2.92 rows=1 width=0)  
         Index Cond: ((st_y(pos) >= '1'::double precision) AND (st_y(pos) <= '2'::double precision))  
         Filter: (('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry ~ pos) AND   
(st_x(pos) >= '110'::double precision) AND (st_x(pos) <= '112'::double precision) AND _st_contains('0103000020E610000001000000050000000000000000805B40000000000000F03F0000000000805B4000000000000000400000000000005C4000000000000000400000000  
000005C40000000000000F03F0000000000805B40000000000000F03F'::geometry, pos))  
(4 rows)

一些函数

查看geometry的bound box

postgres=# select st_astext(box2d(st_setsrid(st_makebox2d(st_makepoint(110,1), st_makepoint(112,2)), 4326) ));  
                st_astext                   
------------------------------------------  
 POLYGON((110 1,110 2,112 2,112 1,110 1))  
(1 row)

postgres=# select st_astext(box2d(ST_GeomFromEWKT('CIRCULARSTRING(220268 150415 1,220227 150505 2,220227 150406 3)')));  
                                                                       st_astext                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------  
 POLYGON((220186.995121892 150406,220186.995121892 150506.126829327,220288.248780547 150506.126829327,220288.248780547 150406,220186.995121892 150406))  
(1 row)

2、查看BOUND BOX的边界点x,y.

postgres=# select st_xmin(box2d(ST_GeomFromEWKT('CIRCULARSTRING(220268 150415 1,220227 150505 2,220227 150406 3)')));  
     st_xmin        
------------------  
 220186.995121892  
(1 row)  
  
postgres=# select st_xmax(box2d(ST_GeomFromEWKT('CIRCULARSTRING(220268 150415 1,220227 150505 2,220227 150406 3)')));  
     st_xmax        
------------------  
 220288.248780547  
(1 row)  
  
postgres=# select st_ymin(box2d(ST_GeomFromEWKT('CIRCULARSTRING(220268 150415 1,220227 150505 2,220227 150406 3)')));  
 st_ymin   
---------  
  150406  
(1 row)  
  
postgres=# select st_ymax(box2d(ST_GeomFromEWKT('CIRCULARSTRING(220268 150415 1,220227 150505 2,220227 150406 3)')));  
     st_ymax        
------------------  
 150506.126829327  
(1 row)

小结

目前空间类型的bound box没有统计信息，所以空间查询时，选择性使用了硬编码写死，固定了选择性百分比，乘以pg_class.reltuples即可得到评估记录数，但是不准确。

使得PostgreSQL的空间搜索（包含搜索），不管范围多大，都使用索引，而实际上大范围可能使用全表扫描性能更好。

本文增加boundbox的条件，使用xyz边界条件，使得行数评估更加的准确，在适当的时候选择适合的索引。

当然，最好的方法，还是PG内核层面优化空间类型的统计信息，以及选择性代码的支持。

参考

建议，PostgreSQL内核层面改进空间类型的统计信息，点类型，分别统计X，Y轴的统计信息。

《PostgreSQL 自定义函数表达式选择性评估算法 - Statistics, Cardinality, Selectivity, Estimate》

PostgreSQL 空间类型统计信息(Statistics, Cardinality, Selectivity, Estimate)不准确导致SQL执行计划不准(包含、相交查询)的优化实践

标签

背景

例子

优化方法

现在我们改一下SQL，把boundbox放进SQL

一些函数

小结

参考

关系型数据库

热门文章

最新文章

相关产品

相关课程

相关电子书

相关实验场景

推荐镜像