Where Does PostgreSQL Process SQL Queries? (Part 43)


Tracing further up the call chain: we can see that, by the time ExecutePlan is entered, the scan's start block (rs_startblock) is already 0.

/* ----------------------------------------------------------------
 *        ExecutePlan
 *
 *        Processes the query plan until we have processed 'numberTuples' tuples,
 *        moving in the specified direction.
 *
 *        Runs to completion if numberTuples is 0
 *
 * Note: the ctid attribute is a 'junk' attribute that is removed before the
 * user can see it
 * ----------------------------------------------------------------
 */
static void
ExecutePlan(EState *estate,
            PlanState *planstate,
            CmdType operation,
            bool sendTuples,
            long numberTuples,
            ScanDirection direction,
            DestReceiver *dest)
{
    TupleTableSlot *slot;
    long        current_tuple_count;

    /*
     * initialize local variables
     */
    current_tuple_count = 0;

    /*
     * Set the direction.
     */
    estate->es_direction = direction;

    /*
     * Loop until we've processed the proper number of tuples from the plan.
     */
    for (;;)
    {
        /* Reset the per-output-tuple exprcontext */
        ResetPerTupleExprContext(estate);

        /* added by gaojian -- start: print the scan's start block, so we can
         * see what rs_startblock already is by the time ExecutePlan runs.
         * This assumes the top plan node is a plain SeqScan; getpid() needs
         * <unistd.h>. */
        SeqScanState *seq_state = (SeqScanState *) planstate;
        HeapScanDesc heapdesc = seq_state->ss_currentScanDesc;

        if (heapdesc != NULL)
        {
            BlockNumber bnum = heapdesc->rs_startblock;

            fprintf(stderr, "ExecutePlan: rs_startblock is %u by process %d\n",
                    bnum, getpid());
        }
        else
            fprintf(stderr, "ExecutePlan: heapdesc is null\n");
        /* added by gaojian -- end */


        /*
         * Execute the plan and obtain a tuple
         */
        slot = ExecProcNode(planstate);

        /*
         * if the tuple is null, then we assume there is nothing more to
         * process so we just end the loop...
         */
        if (TupIsNull(slot))
            break;

        /*
         * If we have a junk filter, then project a new tuple with the junk
         * removed.
         *
         * Store this new "clean" tuple in the junkfilter's resultSlot.
         * (Formerly, we stored it back over the "dirty" tuple, which is WRONG
         * because that tuple slot has the wrong descriptor.)
         */
        if (estate->es_junkFilter != NULL)
            slot = ExecFilterJunk(estate->es_junkFilter, slot);

        /*
         * If we are supposed to send the tuple somewhere, do so. (In
         * practice, this is probably always the case at this point.)
         */
        if (sendTuples)
            (*dest->receiveSlot) (slot, dest);

        /*
         * Count tuples processed, if this is a SELECT.  (For other operation
         * types, the ModifyTable plan node must count the appropriate
         * events.)
         */
        if (operation == CMD_SELECT)
            (estate->es_processed)++;

        /*
         * check our tuple count.. if we've processed the proper number then
         * quit, else loop again and process more tuples.  Zero numberTuples
         * means no limit.
         */
        current_tuple_count++;
        if (numberTuples && numberTuples == current_tuple_count)
            break;
    }
}
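
One caveat about the instrumentation above: it casts planstate to SeqScanState unconditionally, which is only valid when the top node of the plan tree really is a sequential scan (as it is for the simple SELECT used throughout this series). A slightly more defensive variant — just a sketch, not something from the PostgreSQL source — would check the node tag first and report through elog() instead of writing to stderr:

        /* Hypothetical, more defensive version of the probe above: only
         * dereference ss_currentScanDesc when the top node is a SeqScan. */
        if (IsA(planstate, SeqScanState))
        {
            HeapScanDesc desc = ((SeqScanState *) planstate)->ss_currentScanDesc;

            if (desc != NULL)
                elog(DEBUG1, "ExecutePlan: rs_startblock = %u (pid %d)",
                     desc->rs_startblock, MyProcPid);
        }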

Tracing further up:

The planstate here comes from queryDesc->planstate.

So we can take it that queryDesc->planstate already had its start block (rs_startblock) initialized well before this point.

void
ExecutorRun(QueryDesc *queryDesc,
            ScanDirection direction, long count)
{
    if (ExecutorRun_hook)
        (*ExecutorRun_hook) (queryDesc, direction, count);
    else
        standard_ExecutorRun(queryDesc, direction, count);
}

void
standard_ExecutorRun(QueryDesc *queryDesc,
                     ScanDirection direction, long count)
{
    EState       *estate;
    CmdType        operation;
    DestReceiver *dest;
    bool        sendTuples;
    MemoryContext oldcontext;

    /* sanity checks */
    Assert(queryDesc != NULL);

    estate = queryDesc->estate;

    Assert(estate != NULL);
    Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));

    /*
     * Switch into per-query memory context
     */
    oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);

    /* Allow instrumentation of Executor overall runtime */
    if (queryDesc->totaltime)
        InstrStartNode(queryDesc->totaltime);

    /*
     * extract information from the query descriptor and the query feature.
     */
    operation = queryDesc->operation;
    dest = queryDesc->dest;

    /*
     * startup tuple receiver, if we will be emitting tuples
     */
    estate->es_processed = 0;
    estate->es_lastoid = InvalidOid;

    sendTuples = (operation == CMD_SELECT ||
                  queryDesc->plannedstmt->hasReturning);

    if (sendTuples)
        (*dest->rStartup) (dest, operation, queryDesc->tupDesc);

    /*
     * run plan
     */
    if (!ScanDirectionIsNoMovement(direction))
        ExecutePlan(estate,
                    queryDesc->planstate,
                    operation,
                    sendTuples,
                    count,
                    direction,
                    dest);

    /*
     * shutdown tuple receiver, if we started it
     */
    if (sendTuples)
        (*dest->rShutdown) (dest);

    if (queryDesc->totaltime)
        InstrStopNode(queryDesc->totaltime, estate->es_processed);

    MemoryContextSwitchTo(oldcontext);
}
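
So standard_ExecutorRun merely hands queryDesc->planstate through to ExecutePlan; the scan descriptor itself was built earlier, during ExecutorStart (ExecInitNode → ExecInitSeqScan → heap_beginscan). The place where rs_startblock actually receives its value is initscan() in heapam.c. Paraphrased and heavily simplified, the relevant logic looks roughly like this:

    /* Paraphrased from initscan() in backend/access/heap/heapam.c
     * (PostgreSQL 9.x era); surrounding details omitted. */
    if (allow_sync && synchronize_seqscans)
    {
        scan->rs_syncscan = true;
        scan->rs_startblock = ss_get_location(scan->rs_rd, scan->rs_nblocks);
    }
    else
    {
        scan->rs_syncscan = false;
        scan->rs_startblock = 0;        /* the common case: start at block 0 */
    }

Unless synchronized sequential scans kick in (a sufficiently large table already being scanned by another backend), rs_startblock is simply 0 — which matches what the probe in ExecutePlan observes.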

Tracing up one more level:

static long
PortalRunSelect(Portal portal,
                bool forward,
                long count,
                DestReceiver *dest)
{
    QueryDesc  *queryDesc;
    ScanDirection direction;
    uint32        nprocessed;

    /*
     * NB: queryDesc will be NULL if we are fetching from a held cursor or a
     * completed utility query; can't use it in that path.
     */
    queryDesc = PortalGetQueryDesc(portal);

       ...

    if (forward)
    {
        if (portal->atEnd || count <= 0)
            direction = NoMovementScanDirection;
        else
            direction = ForwardScanDirection;

        /* In the executor, zero count processes all rows */
        if (count == FETCH_ALL)
            count = 0;

        if (portal->holdStore)
            nprocessed = RunFromStore(portal, direction, count, dest);
        else
        {
            PushActiveSnapshot(queryDesc->snapshot);
            ExecutorRun(queryDesc, direction, count);
            nprocessed = queryDesc->estate->es_processed;
            PopActiveSnapshot();
        }
        ...
    }
    ...
}

So what, then, is the relationship between a Portal and a QueryDesc?

As the definition below shows, a Portal (a pointer to PortalData) carries a pointer to a QueryDesc.

typedef struct PortalData *Portal;

typedef struct PortalData
{
    /* Bookkeeping data */
    const char *name;            /* portal's name */
    const char *prepStmtName;    /* source prepared statement (NULL if none) */
    MemoryContext heap;            /* subsidiary memory for portal */
    ResourceOwner resowner;        /* resources owned by portal */
    void        (*cleanup) (Portal portal);        /* cleanup hook */
    SubTransactionId createSubid;        /* the ID of the creating subxact */

    /*
     * if createSubid is InvalidSubTransactionId, the portal is held over from
     * a previous transaction
     */

    /* The query or queries the portal will execute */
    const char *sourceText;        /* text of query (as of 8.4, never NULL) */
    const char *commandTag;        /* command tag for original query */
    List       *stmts;            /* PlannedStmts and/or utility statements */
    CachedPlan *cplan;            /* CachedPlan, if stmts are from one */

    ParamListInfo portalParams; /* params to pass to query */

    /* Features/options */
    PortalStrategy strategy;    /* see above */
    int            cursorOptions;    /* DECLARE CURSOR option bits */

    /* Status data */
    PortalStatus status;        /* see above */
    bool        portalPinned;    /* a pinned portal can't be dropped */

    /* If not NULL, Executor is active; call ExecutorEnd eventually: */
    QueryDesc  *queryDesc;        /* info needed for executor invocation */

    /* If portal returns tuples, this is their tupdesc: */
    TupleDesc    tupDesc;        /* descriptor for result tuples */
    /* and these are the format codes to use for the columns: */
    int16       *formats;        /* a format code for each column */

    /*
     * Where we store tuples for a held cursor or a PORTAL_ONE_RETURNING or
     * PORTAL_UTIL_SELECT query.  (A cursor held past the end of its
     * transaction no longer has any active executor state.)
     */
    Tuplestorestate *holdStore; /* store for holdable cursors */
    MemoryContext holdContext;    /* memory containing holdStore */

    /*
     * atStart, atEnd and portalPos indicate the current cursor position.
     * portalPos is zero before the first row, N after fetching N'th row of
     * query.  After we run off the end, portalPos = # of rows in query, and
     * atEnd is true.  If portalPos overflows, set posOverflow (this causes us
     * to stop relying on its value for navigation).  Note that atStart
     * implies portalPos == 0, but not the reverse (portalPos could have
     * overflowed).
     */
    bool        atStart;
    bool        atEnd;
    bool        posOverflow;
    long        portalPos;

    /* Presentation data, primarily used by the pg_cursors system view */
    TimestampTz creation_time;    /* time at which this portal was defined */
    bool        visible;        /* include this portal in pg_cursors? */
}    PortalData;
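
Where does portal->queryDesc get filled in? For an ordinary SELECT (the PORTAL_ONE_SELECT strategy), PortalStart() in pquery.c creates the QueryDesc, starts the executor on it, and then stores the pointer into the portal. Paraphrased and abbreviated:

    /* Paraphrased from PortalStart() in backend/tcop/pquery.c,
     * PORTAL_ONE_SELECT case; error handling and other cases omitted. */
    queryDesc = CreateQueryDesc((PlannedStmt *) linitial(portal->stmts),
                                portal->sourceText,
                                GetActiveSnapshot(),
                                InvalidSnapshot,
                                None_Receiver,
                                params,
                                0);
    ExecutorStart(queryDesc, eflags);

    portal->queryDesc = queryDesc;      /* the link PortalRunSelect later follows */
    portal->tupDesc = queryDesc->tupDesc;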

Now look at QueryDesc:

typedef struct QueryDesc
{
    /* These fields are provided by CreateQueryDesc */
    CmdType        operation;        /* CMD_SELECT, CMD_UPDATE, etc. */
    PlannedStmt *plannedstmt;    /* planner's output, or null if utility */
    Node       *utilitystmt;    /* utility statement, or null */
    const char *sourceText;        /* source text of the query */
    Snapshot    snapshot;        /* snapshot to use for query */
    Snapshot    crosscheck_snapshot;    /* crosscheck for RI update/delete */
    DestReceiver *dest;            /* the destination for tuple output */
    ParamListInfo params;        /* param values being passed in */
    int            instrument_options;        /* OR of InstrumentOption flags */

    /* These fields are set by ExecutorStart */
    TupleDesc    tupDesc;        /* descriptor for result tuples */
    EState       *estate;            /* executor's query-wide state */
    PlanState  *planstate;        /* tree of per-plan-node state */

    /* This is always set NULL by the core system, but plugins can change it */
    struct Instrumentation *totaltime;    /* total time spent in ExecutorRun */
} QueryDesc;

Within QueryDesc there is a pointer to the PlanState tree: the planstate field.
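
That field is itself filled in during ExecutorStart: standard_ExecutorStart calls InitPlan(), which builds the per-node state tree and stores it back into the QueryDesc. Paraphrased from InitPlan() in execMain.c, with most of the function omitted:

    planstate = ExecInitNode(plan, estate, eflags);     /* build the state tree */
    ...
    tupType = ExecGetResultType(planstate);
    ...
    queryDesc->tupDesc = tupType;
    queryDesc->planstate = planstate;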

Now look at PlanState:

typedef struct PlanState
{
    NodeTag        type;

    Plan       *plan;            /* associated Plan node */

    EState       *state;            /* at execution time, states of individual
                                 * nodes point to one EState for the whole
                                 * top-level plan */

    Instrumentation *instrument;    /* Optional runtime stats for this node */

    /*
     * Common structural data for all Plan types.  These links to subsidiary
     * state trees parallel links in the associated plan tree (except for the
     * subPlan list, which does not exist in the plan tree).
     */
    List       *targetlist;        /* target list to be computed at this node */
    List       *qual;            /* implicitly-ANDed qual conditions */
    struct PlanState *lefttree; /* input plan tree(s) */
    struct PlanState *righttree;
    List       *initPlan;        /* Init SubPlanState nodes (un-correlated expr
                                 * subselects) */
    List       *subPlan;        /* SubPlanState nodes in my expressions */

    /*
     * State for management of parameter-change-driven rescanning
     */
    Bitmapset  *chgParam;        /* set of IDs of changed Params */

    /*
     * Other run-time state needed by most if not all node types.
     */
    TupleTableSlot *ps_ResultTupleSlot; /* slot for my result tuples */
    ExprContext *ps_ExprContext;    /* node's expression-evaluation context */
    ProjectionInfo *ps_ProjInfo;    /* info for doing tuple projection */
    bool        ps_TupFromTlist;/* state flag for processing set-valued
                                 * functions in targetlist */
} PlanState;

Here, PlanState effectively plays the role of a base class: the scan-state structures below embed it as their first field (see the sketch after their definitions).

typedef struct ScanState
{
    PlanState    ps;                /* its first field is NodeTag */
    Relation    ss_currentRelation;
    HeapScanDesc ss_currentScanDesc;
    TupleTableSlot *ss_ScanTupleSlot;
} ScanState;

/*
 * SeqScan uses a bare ScanState as its state node, since it needs
 * no additional fields.
 */
typedef ScanState SeqScanState;
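
Because PlanState is the first member of ScanState (and SeqScanState is just an alias for ScanState), a pointer to the derived state structure can be treated as a pointer to PlanState and vice versa — exactly what the (SeqScanState *) planstate cast in the instrumented ExecutePlan relies on. A minimal illustration of this struct-embedding style of "inheritance" (hypothetical code, not from the source tree):

/* ps is assumed to be an executor state node, e.g. queryDesc->planstate. */
static void
show_downcast(PlanState *ps)
{
    if (IsA(ps, SeqScanState))          /* i.e. nodeTag(ps) == T_SeqScanState */
    {
        SeqScanState *sss = (SeqScanState *) ps;        /* "downcast" */

        Assert(&sss->ps == ps);     /* same address: ps is the first field of ScanState */
    }
}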

...

Now go one level deeper, to the HeapScanDesc structure:

typedef struct HeapScanDescData *HeapScanDesc;

typedef struct HeapScanDescData
{
    /* scan parameters */
    Relation    rs_rd;            /* heap relation descriptor */
    Snapshot    rs_snapshot;    /* snapshot to see */
    int            rs_nkeys;        /* number of scan keys */
    ScanKey        rs_key;            /* array of scan key descriptors */
    bool        rs_bitmapscan;    /* true if this is really a bitmap scan */
    bool        rs_pageatatime; /* verify visibility page-at-a-time? */
    bool        rs_allow_strat; /* allow or disallow use of access strategy */
    bool        rs_allow_sync;    /* allow or disallow use of syncscan */

    /* state set up at initscan time */
    BlockNumber rs_nblocks;        /* number of blocks to scan */
    BlockNumber rs_startblock;    /* block # to start at */
    BufferAccessStrategy rs_strategy;    /* access strategy for reads */
    bool        rs_syncscan;    /* report location to syncscan logic? */

    /* scan current state */
    bool        rs_inited;        /* false = scan not init'd yet */
    HeapTupleData rs_ctup;        /* current tuple in scan, if any */
    BlockNumber rs_cblock;        /* current block # in scan, if any */
    Buffer        rs_cbuf;        /* current buffer in scan, if any */
    /* NB: if rs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
    ItemPointerData rs_mctid;    /* marked scan position, if any */

    /* these fields only used in page-at-a-time mode and for bitmap scans */
    int            rs_cindex;        /* current tuple's index in vistuples */
    int            rs_mindex;        /* marked tuple's saved index */
    int            rs_ntuples;        /* number of visible tuples on page */
    OffsetNumber rs_vistuples[MaxHeapTuplesPerPage];    /* their offsets */
}    HeapScanDescData;

At least by the time PortalRunSelect runs, all of this "boxing" has already been completed; afterwards, inside ExecutorRun and the routines beneath it, the PlanState is in effect "unboxed" again. The sketch below walks the whole chain.
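
Put together, the chain can be walked from the top like this. This is a hypothetical helper, not part of PostgreSQL — it assumes the portal's top plan node is a plain sequential scan, and returns InvalidBlockNumber when any link in the chain is not (yet) set up:

/* Portal -> QueryDesc -> PlanState (SeqScanState) -> HeapScanDesc -> rs_startblock */
static BlockNumber
portal_seqscan_startblock(Portal portal)
{
    QueryDesc  *queryDesc = PortalGetQueryDesc(portal);

    if (queryDesc == NULL || queryDesc->planstate == NULL ||
        !IsA(queryDesc->planstate, SeqScanState))
        return InvalidBlockNumber;

    {
        SeqScanState *sss = (SeqScanState *) queryDesc->planstate;
        HeapScanDesc scan = sss->ss_currentScanDesc;

        return (scan != NULL) ? scan->rs_startblock : InvalidBlockNumber;
    }
}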








This article was reposted from the 健哥的数据花园 blog on 博客园. Original link: http://www.cnblogs.com/gaojian/archive/2013/06/03/3115150.html. For reprinting, please contact the original author.
