PostgreSQL在何处处理 sql查询之二十七-阿里云开发者社区

PostgreSQL在何处处理 sql查询之二十七

2017-12-19 1195

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

云原生数据库 PolarDB MySQL 版，通用型 2核8GB 50GB

云原生数据库 PolarDB PostgreSQL 版，标准版 2核4GB 50GB

简介：

接前面，仔细看这个：这 add_base_rels_to_query 是个递归调用嘛。

想像一下： select * from tst01 where id in (select sid from tst02) or id in (select sid from tst03) 之类的，此函数将层层深入，构造一个二叉树式样的语法树。

void
add_base_rels_to_query(PlannerInfo *root, Node *jtnode)
{
    if (jtnode == NULL)
        return;
    if (IsA(jtnode, RangeTblRef))
    {
        int    varno = ((RangeTblRef *) jtnode)->rtindex;
        (void) build_simple_rel(root, varno, RELOPT_BASEREL);
    }
    else if (IsA(jtnode, FromExpr))
    {
        FromExpr   *f = (FromExpr *) jtnode;
        ListCell   *l;

        foreach(l, f->fromlist)
            add_base_rels_to_query(root, lfirst(l));
    }
    else if (IsA(jtnode, JoinExpr))
    {
        JoinExpr   *j = (JoinExpr *) jtnode;

        add_base_rels_to_query(root, j->larg);
        add_base_rels_to_query(root, j->rarg);
    }
    else
        elog(ERROR, "unrecognized node type: %d",
             (int) nodeTag(jtnode));
}

这个时候，我想到一个问题，在PostgreSQL中，有没有并行查询的可能呢？

那么，再上溯到更高的层面，再次梳理一下：

exec_simple_query中调用了 plan_tree_list，从语法分析树得到了计划树...执行 PortalRun。

而上面的 add_base_rels_to_query 是plan_tree_list 里的一部分。

调用关系如下：

plan_tree_list -->pg_plan_queries-->pg_plan_query-->planner

planner-->subquery_planner-->groupingplanner

groupingplanner-->query_planner-->add_base_rels_to_query

/*
  * exec_simple_query
  *
  * Execute a "simple Query" protocol message.
  */
 static void
 exec_simple_query(const char *query_string)
 {
     ...
     isTopLevel = (list_length(parsetree_list) == 1);
     ...
     foreach(parsetree_item, parsetree_list)
     {
         ...
         querytree_list = pg_analyze_and_rewrite(parsetree, query_string,
                                                 NULL, 0);
 
         plantree_list = pg_plan_queries(querytree_list, 0, NULL);
         ...
 
         /*
          * Create unnamed portal to run the query or queries in. If there
          * already is one, silently drop it.
          */
         portal = CreatePortal("", true, true);
         /* Don't display the portal in pg_cursors */
         portal->visible = false;
 
         /*
          * We don't have to copy anything into the portal, because everything
          * we are passing here is in MessageContext, which will outlive the
          * portal anyway.
          */
         PortalDefineQuery(portal,
                           NULL,
                           query_string,
                           commandTag,
                           plantree_list,
                           NULL);
 
         /*
          * Start the portal.
          *
          * If we took a snapshot for parsing/planning, the portal may be able
          * to reuse it for the execution phase.  Currently, this will only
          * happen in PORTAL_ONE_SELECT mode.  But even if PortalStart doesn't
          * end up being able to do this, keeping the parse/plan snapshot
          * around until after we start the portal doesn't cost much.
          */
         PortalStart(portal, NULL, 0, snapshot_set);
         ...
 
         /*
          * Run the portal to completion, and then drop it (and the receiver).
          */
         (void) PortalRun(portal,
                          FETCH_ALL,
                          isTopLevel,
                          receiver,
                          receiver,
                          completionTag);
 
         (*receiver->rDestroy) (receiver);
 
         PortalDrop(portal, false);
 
         ...
     }                            /* end loop over parsetrees */
 
     /*
      * Close down transaction statement, if one is open.
      */
     finish_xact_command();
     ...
 }

加入调试信息，看看一个带子查询的SQL，能否有并行查询的可能。

[postgres@lex pgsql]$ ./bin/psql
psql (9.2.1)
Type "help" for help.

postgres=# select * from tst01 where id IN (select sid from tst02) or id IN (select sid from tst03);
 id  
-----
 100
 200
(2 rows)

postgres=#

看后台的信息，知道planner调用了三次，PortalRun只调用了一次。

[postgres@lex pgsql]$ ./bin/pg_ctl -D ./data start
server starting
[postgres@lex pgsql]$ LOG:  database system was shut down at 2013-05-29 10:12:02 CST
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
In query_planner.... by process 11796
In get_relation_info ...oid of table is: 16387 ...by process 11796
In query_planner.... by process 11796
In query_planner.... by process 11796
In get_relation_info ...oid of table is: 16390 ...by process 11796
In query_planner.... by process 11796
In query_planner.... by process 11796
In get_relation_info ...oid of table is: 16384 ...by process 11796
In query_planner.... by process 11796
In PortalRun ... by 11796

也就是说，PostgreSQL 没有子查询的并行查询能力了？！

至少目前，它还非常地不完整：

http://postgresql.1045698.n5.nabble.com/Parallel-query-execution-td5740431.html

http://wiki.postgresql.org/wiki/Parallel_Query_Execution

各种集群方案也作不到这点：

http://www.postgresql.org/docs/current/static/different-replication-solutions.html

Many of the above solutions allow multiple servers to handle multiple queries, but none allow a single query to use multiple servers to complete faster.

对比下Oracle的：

http://miracle.blog.51cto.com/255044/147058/

本文转自健哥的数据花园博客园博客，原文链接：http://www.cnblogs.com/gaojian/archive/2013/05/29/3105388.html，如需转载请自行联系原作者

PostgreSQL在何处处理 sql查询之二十七

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

PostgreSQL在何处处理 sql查询之二十七

热门文章

最新文章

相关课程

相关电子书

相关实验场景

推荐镜像