PostgreSQL Streaming Replication: Sending xlog Asynchronously



Author

digoal

Date

2016-11-07

Tags

PostgreSQL, synchronous streaming replication, asynchronous send


Background

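PostgreSQL streaming replication is probably familiar to most readers. At present, however, PostgreSQL protects the primacy of the master: everything defers to the master. This applies even to sending WAL, where the master must have flushed a WAL record before that record may be sent to the standby.

This does introduce a small delay. For now the delay is negligible, but as hardware evolves this model may stop being a good fit.

So could the master be allowed to send a WAL record to the standby as soon as write has been called on it, or even as soon as it has been placed in the WAL buffer, giving an asynchronous WAL send?

Of course it can. Let's take a look.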

Source code

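The GetFlushRecPtr() calls in the walsender can be changed to use the write position, or the insert position, to achieve an asynchronous send.

Related article: "PostgreSQL xlog positions"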
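
To make the three candidate positions concrete, here is a minimal, self-contained sketch. It is a toy model and not PostgreSQL code: the WalPositions struct, the SendMode enum, send_limit() and extra_sendable() are hypothetical names introduced only for illustration. It assumes the usual ordering flush <= write <= insert and shows how far a sender could ship WAL under each policy.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for XLogRecPtr: a byte offset into the WAL stream. */
typedef uint64_t WalPtr;

/* Hypothetical snapshot of the three positions discussed in the text. */
typedef struct WalPositions
{
    WalPtr insert;  /* copied into the WAL buffers */
    WalPtr write;   /* handed to the OS with write() */
    WalPtr flush;   /* fsync'd to durable storage */
} WalPositions;

/* Hypothetical send policies; only the first matches stock behaviour. */
typedef enum SendMode
{
    SEND_AFTER_FLUSH,   /* only flushed WAL is sent */
    SEND_AFTER_WRITE,   /* written, but possibly not yet fsync'd */
    SEND_AFTER_INSERT   /* possibly still only in the WAL buffers */
} SendMode;

/* Return the highest position a sender may ship under the given mode. */
static WalPtr
send_limit(const WalPositions *pos, SendMode mode)
{
    switch (mode)
    {
        case SEND_AFTER_WRITE:
            return pos->write;
        case SEND_AFTER_INSERT:
            return pos->insert;
        case SEND_AFTER_FLUSH:
        default:
            return pos->flush;
    }
}

/* Bytes that become sendable earlier than under the flush-only policy. */
static WalPtr
extra_sendable(const WalPositions *pos, SendMode mode)
{
    return send_limit(pos, mode) - pos->flush;
}
```

With insert = 300, write = 200 and flush = 100, the write policy makes 100 additional bytes sendable and the insert policy 200. Note that, according to the comment in XLogSendPhysical() quoted below, the current XLogRead() cannot go past what has been written out, so an insert-based policy would need more than a one-line change.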

src/backend/replication/walsender.c

/*
 * Wait till WAL < loc is flushed to disk so it can be safely read.
 */
static XLogRecPtr
WalSndWaitForWal(XLogRecPtr loc)
{
        int                     wakeEvents;
        static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;


        /*
         * Fast path to avoid acquiring the spinlock in the we already know we
         * have enough WAL available. This is particularly interesting if we're
         * far behind.
         */
        if (RecentFlushPtr != InvalidXLogRecPtr &&
                loc <= RecentFlushPtr)
                return RecentFlushPtr;

        /* Get a more recent flush pointer. */
        if (!RecoveryInProgress())
                RecentFlushPtr = GetFlushRecPtr();  // get the flushed position
        else
                RecentFlushPtr = GetXLogReplayRecPtr(NULL);

        for (;;)
        {
                long            sleeptime;
                TimestampTz now;

                /*
                 * Emergency bailout if postmaster has died.  This is to avoid the
                 * necessity for manual cleanup of all postmaster children.
                 */
                if (!PostmasterIsAlive())
                        exit(1);

                /* Clear any already-pending wakeups */
                ResetLatch(MyLatch);

                CHECK_FOR_INTERRUPTS();

                /* Process any requests or signals received recently */
                if (got_SIGHUP)
                {
                        got_SIGHUP = false;
                        ProcessConfigFile(PGC_SIGHUP);
                        SyncRepInitConfig();
                }

                /* Check for input from the client */
                ProcessRepliesIfAny();

                /* Update our idea of the currently flushed position. */
                if (!RecoveryInProgress())
                        RecentFlushPtr = GetFlushRecPtr();  // get the flushed position
                else
                        RecentFlushPtr = GetXLogReplayRecPtr(NULL);

                /*
                 * If postmaster asked us to stop, don't wait here anymore. This will
                 * cause the xlogreader to return without reading a full record, which
                 * is the fastest way to reach the mainloop which then can quit.
                 *
                 * It's important to do this check after the recomputation of
                 * RecentFlushPtr, so we can send all remaining data before shutting
                 * down.
                 */
                if (walsender_ready_to_stop)
                        break;

                /*
                 * We only send regular messages to the client for full decoded
                 * transactions, but a synchronous replication and walsender shutdown
                 * possibly are waiting for a later location. So we send pings
                 * containing the flush location every now and then.
                 */
                if (MyWalSnd->flush < sentPtr &&
                        MyWalSnd->write < sentPtr &&
                        !waiting_for_ping_response)
                {
                        WalSndKeepalive(false);
                        waiting_for_ping_response = true;
                }

                /* check whether we're done */
                if (loc <= RecentFlushPtr)
                        break;

                /* Waiting for new WAL. Since we need to wait, we're now caught up. */
                WalSndCaughtUp = true;

                /*
                 * Try to flush pending output to the client. Also wait for the socket
                 * becoming writable, if there's still pending output after an attempt
                 * to flush. Otherwise we might just sit on output data while waiting
                 * for new WAL being generated.
                 */
                if (pq_flush_if_writable() != 0)
                        WalSndShutdown();

                now = GetCurrentTimestamp();

                /* die if timeout was reached */
                WalSndCheckTimeOut(now);

                /* Send keepalive if the time has come */
                WalSndKeepaliveIfNecessary(now);
                sleeptime = WalSndComputeSleeptime(now);

                wakeEvents = WL_LATCH_SET | WL_POSTMASTER_DEATH |
                        WL_SOCKET_READABLE | WL_TIMEOUT;

                if (pq_is_send_pending())
                        wakeEvents |= WL_SOCKET_WRITEABLE;

                /* Sleep until something happens or we time out */
                WaitLatchOrSocket(MyLatch, wakeEvents,
                                                  MyProcPort->sock, sleeptime);
        }

        /* reactivate latch so WalSndLoop knows to continue */
        SetLatch(MyLatch);
        return RecentFlushPtr;
}

static void
XLogSendPhysical(void)
{
......
        /* Figure out how far we can safely send the WAL. */
        if (sendTimeLineIsHistoric)
        {
......
        }
        else if (am_cascading_walsender)
        {
......
        }
        else
        {
                /*
                 * Streaming the current timeline on a master.
                 *
                 * Attempt to send all data that's already been written out and
                 * fsync'd to disk.  We cannot go further than what's been written out
                 * given the current implementation of XLogRead().  And in any case
                 * it's unsafe to send WAL that is not securely down to disk on the
                 * master: if the master subsequently crashes and restarts, slaves
                 * must not have applied any WAL that gets lost on the master.
                 */
                SendRqstPtr = GetFlushRecPtr(); 
        }
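
The comment in XLogSendPhysical() explains why stock PostgreSQL sends only flushed WAL: after a crash the master must not have lost WAL that a standby has already received. The hazard can be illustrated with a small simulation. This is a sketch under stated assumptions, not PostgreSQL code: ToyMaster, ToyStandby and the toy_* functions are hypothetical, and the crash is assumed to lose everything that was written but not fsync'd (the worst case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t WalPtr;

/* Hypothetical master state: two of the positions from the article. */
typedef struct ToyMaster
{
    WalPtr write;   /* written to the OS, may be lost on crash */
    WalPtr flush;   /* durable, survives a crash */
} ToyMaster;

/* Hypothetical standby state. */
typedef struct ToyStandby
{
    WalPtr received;    /* highest position received from the master */
} ToyStandby;

/* Ship WAL up to `limit` to the standby. */
static void
toy_send(ToyStandby *standby, WalPtr limit)
{
    if (limit > standby->received)
        standby->received = limit;
}

/* Simulate a crash in which all un-fsync'd WAL is lost (assumed worst case). */
static void
toy_crash_and_restart(ToyMaster *master)
{
    master->write = master->flush;
}

/* True if the standby holds WAL that the restarted master no longer has. */
static bool
toy_standby_diverged(const ToyMaster *master, const ToyStandby *standby)
{
    return standby->received > master->flush;
}
```

In this model, a standby fed up to the flush position stays consistent with the restarted master, while a standby fed up to the write position ends up ahead of it. Any asynchronous-send change therefore has to decide what such a standby does next.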

src/backend/access/transam/xlog.c

/*
 * Return the current Redo pointer from shared memory.
 *
 * As a side-effect, the local RedoRecPtr copy is updated.
 */
XLogRecPtr
GetRedoRecPtr(void)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    XLogRecPtr  ptr;

    /*
     * The possibly not up-to-date copy in XlogCtl is enough. Even if we
     * grabbed a WAL insertion lock to read the master copy, someone might
     * update it just after we've released the lock.
     */
    SpinLockAcquire(&xlogctl->info_lck);
    ptr = xlogctl->RedoRecPtr;
    SpinLockRelease(&xlogctl->info_lck);

    if (RedoRecPtr < ptr)
        RedoRecPtr = ptr;

    return RedoRecPtr;
}

/*
 * GetInsertRecPtr -- Returns the current insert position.
 *
 * NOTE: The value *actually* returned is the position of the last full
 * xlog page. It lags behind the real insert position by at most 1 page.
 * For that, we don't need to scan through WAL insertion locks, and an
 * approximation is enough for the current usage of this function.
 */
XLogRecPtr
GetInsertRecPtr(void)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    XLogRecPtr  recptr;

    SpinLockAcquire(&xlogctl->info_lck);
    recptr = xlogctl->LogwrtRqst.Write;
    SpinLockRelease(&xlogctl->info_lck);

    return recptr;
}

/*
 * GetFlushRecPtr -- Returns the current flush position, ie, the last WAL
 * position known to be fsync'd to disk.
 */
XLogRecPtr
GetFlushRecPtr(void)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    XLogRecPtr  recptr;

    SpinLockAcquire(&xlogctl->info_lck);
    recptr = xlogctl->LogwrtResult.Flush;
    SpinLockRelease(&xlogctl->info_lck);

    return recptr;
}
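
The accessors above all follow the same pattern: take the info_lck spinlock, copy one field, release the lock. The sketch below reproduces that pattern in a standalone toy so the difference between reading the flush position and reading the write position is visible in isolation. Everything in it is an assumption made for illustration: ToyXLogCtl, the C11 atomic_flag spinlock and the toy_* functions are hypothetical and are not the PostgreSQL implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t WalPtr;

/* Toy counterpart of the shared control struct (hypothetical). */
typedef struct ToyXLogCtl
{
    atomic_flag info_lck;   /* minimal spinlock */
    WalPtr      write;      /* plays the role of LogwrtResult.Write */
    WalPtr      flush;      /* plays the role of LogwrtResult.Flush */
} ToyXLogCtl;

static void
toy_lock(ToyXLogCtl *ctl)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&ctl->info_lck,
                                             memory_order_acquire))
        ;
}

static void
toy_unlock(ToyXLogCtl *ctl)
{
    atomic_flag_clear_explicit(&ctl->info_lck, memory_order_release);
}

/* Publish new write and flush positions under the lock. */
static void
toy_advance(ToyXLogCtl *ctl, WalPtr write, WalPtr flush)
{
    toy_lock(ctl);
    ctl->write = write;
    ctl->flush = flush;
    toy_unlock(ctl);
}

/* Same shape as GetFlushRecPtr(): read the flushed position. */
static WalPtr
toy_get_flush_ptr(ToyXLogCtl *ctl)
{
    WalPtr recptr;

    toy_lock(ctl);
    recptr = ctl->flush;
    toy_unlock(ctl);
    return recptr;
}

/* The variant an asynchronous send would call: read the written position. */
static WalPtr
toy_get_write_ptr(ToyXLogCtl *ctl)
{
    WalPtr recptr;

    toy_lock(ctl);
    recptr = ctl->write;
    toy_unlock(ctl);
    return recptr;
}
```

The two readers differ only in the field they copy, which is why the article describes the change as swapping the position that the walsender asks for.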