hint bits 源码讲解, 为什么PostgreSQL查询语句也可能产生 xlog, 并且可能对buffer有write操作 ?-阿里云开发者社区

开发者社区> 德哥> 正文

hint bits 源码讲解, 为什么PostgreSQL查询语句也可能产生 xlog, 并且可能对buffer有write操作 ?

简介: 本文还可以回答另一个问题,为什么数据库开启wal_hint_log后或者使用initdb -k参数开启checksum后,产生的XLOG变多了。 PostgreSQL 查询是可能产生XLOG的,这需要从PostgreSQL tuple上的hint bits说起。什么是hint bits?你可以
+关注继续查看

本文还可以回答另外2个问题,
.1. 为什么数据库开启wal_hint_log后或者使用initdb -k参数开启checksum后,产生的XLOG变多了。
.2. 为什么使用pg_rewind修复时间线分歧,需要开启wal_hint_log或者checksum?

pg_rewind.c

static void
sanityChecks(void)
{
...
        /*
         * Target cluster need to use checksums or hint bit wal-logging, this to
         * prevent from data corruption that could occur because of hint bits.
         */
        if (ControlFile_target.data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
                !ControlFile_target.wal_log_hints)
        {
                pg_fatal("target server needs to use either data checksums or \"wal_log_hints = on\"\n");
        }
...

PostgreSQL 查询是可能产生XLOG的,这需要从PostgreSQL tuple上的hint bits说起。
什么是hint bits?你可以参考这个页面:
https://wiki.postgresql.org/wiki/Hint_Bits
hint bits是tuple头部的infomask里的2个BIT。用来表示该tuple的事务状态。

src/include/access/htup_details.h  
#define HEAP_XMIN_COMMITTED             0x0100  /* t_xmin committed 256 */  
#define HEAP_XMIN_INVALID               0x0200  /* t_xmin invalid/aborted 512  */  
#define HEAP_XMAX_COMMITTED             0x0400  /* t_xmax committed 1024  */  
#define HEAP_XMAX_INVALID               0x0800  /* t_xmax invalid/aborted 2048 */  

hint bits含义:
如果XMIN对应的BIT都没有设置,有两种可能,事务未结束,或者事务已结束,但是BIT还未被第一个检查该TUPLE HINT BIT的会话设置。
If neither of the XMIN bits is set, then either:
.1. The creating transaction is still in progress, which you can check by examining the list of running transactions in shared memory;
.2. You are the first one to check since it ended, in which case you need to consult pg_clog to know the transaction's status, and you can update the hint bits if you find out its final state.
XMAX表示被删除的记录,原理一样。
If the tuple has been marked deleted, then similar remarks apply to the XMAX bits.
假设没有hint bits,我们只能从pg_clog中或者PGXACT内存结构中(未结束的或未清除的事务信息内存)得知该tuple对应的事务提交状态,显然如果每条tuple都要查询pg_clog的话,性能一定会很差。
所以为了提升性能,PostgreSQL在tuple的头部t_infomask中通过4个比特位来存储事务的提交状态。从而我们不需要查询pg_clog来获得事务信息。
但是请注意,并不是在事务结束时设置t_infomask的hint bits。而是在后面的DML或者DQL,VACUUM等SQL扫描到对应的TUPLE时,触发SET BITS的操作。
设置hint bits的代码如下:

SetHintBits@src/backend/utils/time/tqual.c  
src/backend/utils/time/tqual.c  
 * tqual.c  
 *        POSTGRES "time qualification" code, ie, tuple visibility rules.  
 *  
 * NOTE: all the HeapTupleSatisfies routines will update the tuple's  
 * "hint" status bits if we see that the inserting or deleting transaction  
 * has now committed or aborted (and it is safe to set the hint bits).  
 * If the hint bits are changed, MarkBufferDirtyHint is called on  
 * the passed-in buffer.  The caller must hold not only a pin, but at least  
 * shared buffer content lock on the buffer containing the tuple.  
 *  
 * NOTE: must check TransactionIdIsInProgress (which looks in PGXACT array)  
 * before TransactionIdDidCommit/TransactionIdDidAbort (which look in  
 * pg_clog).  Otherwise we have a race condition: we might decide that a  
 * just-committed transaction crashed, because none of the tests succeed.  
 * xact.c is careful to record commit/abort in pg_clog before it unsets  
 * MyPgXact->xid in PGXACT array.  That fixes that problem, but it also  
 * means there is a window where TransactionIdIsInProgress and  
 * TransactionIdDidCommit will both return true.  If we check only  
 * TransactionIdDidCommit, we could consider a tuple committed when a  
 * later GetSnapshotData call will still think the originating transaction  
 * is in progress, which leads to application-level inconsistency.  The  
 * upshot is that we gotta check TransactionIdIsInProgress first in all  
 * code paths, except for a few cases where we are looking at  
 * subtransactions of our own main transaction and so there can't be any  
 * race condition.  
 *  
 * Summary of visibility functions:  
 *  
 *       HeapTupleSatisfiesMVCC()  
 *                visible to supplied snapshot, excludes current command  
 *       HeapTupleSatisfiesUpdate()  
 *                visible to instant snapshot, with user-supplied command  
 *                counter and more complex result  
 *       HeapTupleSatisfiesSelf()  
 *                visible to instant snapshot and current command  
 *       HeapTupleSatisfiesDirty()  
 *                like HeapTupleSatisfiesSelf(), but includes open transactions  
 *       HeapTupleSatisfiesVacuum()  
 *                visible to any running transaction, used by VACUUM  
 *       HeapTupleSatisfiesToast()  
 *                visible unless part of interrupted vacuum, used for TOAST  
 *       HeapTupleSatisfiesAny()  
 *                all tuples are visible  
 *  
 * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group  
 * Portions Copyright (c) 1994, Regents of the University of California  
 *  
 * IDENTIFICATION  
 *        src/backend/utils/time/tqual.c  
......  
/*  
 * SetHintBits()  
 *  
 * Set commit/abort hint bits on a tuple, if appropriate at this time.  
 *  
 * It is only safe to set a transaction-committed hint bit if we know the  
 * transaction's commit record has been flushed to disk, or if the table is  
 * temporary or unlogged and will be obliterated by a crash anyway.  We  
 * cannot change the LSN of the page here because we may hold only a share  
 * lock on the buffer, so we can't use the LSN to interlock this; we have to  
 * just refrain from setting the hint bit until some future re-examination  
 * of the tuple.  
 *  
 * We can always set hint bits when marking a transaction aborted.  (Some  
 * code in heapam.c relies on that!)  
 *  
 * Also, if we are cleaning up HEAP_MOVED_IN or HEAP_MOVED_OFF entries, then  
 * we can always set the hint bits, since pre-9.0 VACUUM FULL always used  
 * synchronous commits and didn't move tuples that weren't previously  
 * hinted.  (This is not known by this subroutine, but is applied by its  
 * callers.)  Note: old-style VACUUM FULL is gone, but we have to keep this  
 * module's support for MOVED_OFF/MOVED_IN flag bits for as long as we  
 * support in-place update from pre-9.0 databases.  
 *  
 * Normal commits may be asynchronous, so for those we need to get the LSN  
 * of the transaction and then check whether this is flushed.  
 *  
 * The caller should pass xid as the XID of the transaction to check, or  
 * InvalidTransactionId if no check is needed.  
 */  
static inline void  
SetHintBits(HeapTupleHeader tuple, Buffer buffer,  
                        uint16 infomask, TransactionId xid)  
{  
        if (TransactionIdIsValid(xid))  
        {  
                /* NB: xid must be known committed here! */  
                XLogRecPtr      commitLSN = TransactionIdGetCommitLSN(xid);  // 获取事务对应的commitLSN  
  
                if (XLogNeedsFlush(commitLSN) && BufferIsPermanent(buffer))  // 在设置hint bits前,必须确保事务对应的xlog 已经flush到磁盘,否则可能出现不一致的情况。例如数据恢复时xlog没有,但是CLOG显示已提交。  
                        return;                         /* not flushed yet, so don't set hint */  
        }  
  
        tuple->t_infomask |= infomask;  // 设置hint bits  
        MarkBufferDirtyHint(buffer, true);  // 将buffer标记为dirty,当initdb 打开了checksum或者使用了wal_log_hints时,如果它刚好是checkpoint后的第一个脏页,则写full page到WAL。  
}  

我们可以来验证一下:

> truncate t;  
postgres=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
           5497  
(1 row)  

验证用的stap跟踪脚本 :

[root@digoal ~]# cat trc.stp   
global f_start[999999]  
  
probe process("/opt/pgsql/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c").call {   
   f_start[execname(), pid(), tid(), cpu()] = gettimeofday_ms()  
   printf("%s <- time:%d, pp:%s, par:%s\n", thread_indent(-1), gettimeofday_ms(), pp(), $$parms$$)  
   # printf("%s -> time:%d, pp:%s\n", thread_indent(1), f_start[execname(), pid(), tid(), cpu()], pp() )  
}  
  
probe process("/opt/pgsql/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c").return {   
  t=gettimeofday_ms()  
  a=execname()  
  b=cpu()  
  c=pid()  
  d=pp()  
  e=tid()  
  if (f_start[a,c,e,b]) {  
  #  printf("%s <- time:%d, pp:%s, par:%s\n", thread_indent(-1), t - f_start[a,c,e,b], d, $$params$$)  
  printf("%s <- time:%d, pp:%s\n", thread_indent(-1), t - f_start[a,c,e,b], d)  
  }  
}  
  
[root@digoal ~]# stap -vp 5 -DMAXSKIPPED=9999999 -DSTP_NO_OVERLOAD -DMAXTRYLOCK=100 ./trc.stp -x 5497  
  
postgres=# insert into t values (1);  
INSERT 0 1  

insert事务结束后,没有跟踪到SetHintBits,因为事务结束时不设置hint bits。

postgres=# select * from t;  
 id   
----  
  1  
(1 row)  

执行select时跟踪到了sethintbits,参数infomask=256,即HEAP_XMIN_COMMITTED 。

71259406 postgres(5497): <- time:1441448520839, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").call, par:tuple={.t_choice={.t_heap={.t_xmin=390734170, .t_xmax=0, .t_field3={.t_cid=0, .t_xvac=0}}, .t_datum={.datum_len_=390734170, .datum_typmod=0, .datum_typeid=0}}, .t_ctid={.ip_blkid={.bi_hi=0, .bi_lo=0}, .ip_posid=1}, .t_infomask2=1, .t_infomask=2048, .t_hoff='\030', .t_bits=""} buffer=3657 infomask=256 xid=390734170  
71259458 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").return  

再次插入一条记录

postgres=# insert into t values (2);  
INSERT 0 1  

无SetHintBits
更新2条记录,你会发现有一次SetHintBits

postgres=# update t set id=3;  
UPDATE 2  

这次SetHintBits实际上对应的是values(2)的那条记录,而values(1)那条已经在前面的select *被调用了SetHintBits。

5356459357 postgres(5497): <- time:1441453806039, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").call, par:tuple={.t_choice={.t_heap={.t_xmin=390734178, .t_xmax=0, .t_field3={.t_cid=0, .t_xvac=0}}, .t_datum={.datum_len_=390734178, .datum_typmod=0, .datum_typeid=0}}, .t_ctid={.ip_blkid={.bi_hi=0, .bi_lo=0}, .ip_posid=2}, .t_infomask2=1, .t_infomask=2048, .t_hoff='\030', .t_bits=""} buffer=3657 infomask=256 xid=390734178  
5356459410 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").return  

那么现在实际上有4条记录,2条是老的版本,2条是新的版本。
再次执行select,全表扫描,会扫到所有的tuple,而且这4条TUPLE都是在最后一次事务后没有被设置SetHintBits的。所以会产生4次SetHintBits

postgres=# select * from t;  
 id   
----  
  3  
  3  
(2 rows)  
5464475078 postgres(5497): <- time:1441453914055, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").call, par:tuple={.t_choice={.t_heap={.t_xmin=390734177, .t_xmax=390734179, .t_field3={.t_cid=0, .t_xvac=0}}, .t_datum={.datum_len_=390734177, .datum_typmod=390734179, .datum_typeid=0}}, .t_ctid={.ip_blkid={.bi_hi=0, .bi_lo=0}, .ip_posid=3}, .t_infomask2=16385, .t_infomask=256, .t_hoff='\030', .t_bits=""} buffer=3657 infomask=1024 xid=390734179  
5464475132 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").return  
5464475156 postgres(5497): <- time:1441453914055, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").call, par:tuple={.t_choice={.t_heap={.t_xmin=390734178, .t_xmax=390734179, .t_field3={.t_cid=0, .t_xvac=0}}, .t_datum={.datum_len_=390734178, .datum_typmod=390734179, .datum_typeid=0}}, .t_ctid={.ip_blkid={.bi_hi=0, .bi_lo=0}, .ip_posid=4}, .t_infomask2=16385, .t_infomask=256, .t_hoff='\030', .t_bits=""} buffer=3657 infomask=1024 xid=390734179  
5464475190 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").return  

前2条SetHintBits的infomask参数=1024对应HEAP_XMAX_COMMITTED,对应的是老的版本。

5464475210 postgres(5497): <- time:1441453914055, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").call, par:tuple={.t_choice={.t_heap={.t_xmin=390734179, .t_xmax=0, .t_field3={.t_cid=0, .t_xvac=0}}, .t_datum={.datum_len_=390734179, .datum_typmod=0, .datum_typeid=0}}, .t_ctid={.ip_blkid={.bi_hi=0, .bi_lo=0}, .ip_posid=3}, .t_infomask2=32769, .t_infomask=10240, .t_hoff='\030', .t_bits=""} buffer=3657 infomask=256 xid=390734179  
5464475243 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").return  
5464475263 postgres(5497): <- time:1441453914055, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").call, par:tuple={.t_choice={.t_heap={.t_xmin=390734179, .t_xmax=0, .t_field3={.t_cid=0, .t_xvac=0}}, .t_datum={.datum_len_=390734179, .datum_typmod=0, .datum_typeid=0}}, .t_ctid={.ip_blkid={.bi_hi=0, .bi_lo=0}, .ip_posid=4}, .t_infomask2=32769, .t_infomask=10240, .t_hoff='\030', .t_bits=""} buffer=3657 infomask=256 xid=390734179  
5464475294 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("SetHintBits@/opt/soft_bak/postgresql-9.4.4/src/backend/utils/time/tqual.c:110").return  

后2条SetHintBits的infomask参数=256对应HEAP_XMIN_COMMITTED,对应的是新的版本。

接下来我们看看SetHintBits是不是产生了xlog,需要关注SetHintBits里的这个调用:

MarkBufferDirtyHint(buffer, true)@src/backend/storage/buffer/bufmgr.c  
/*  
 * MarkBufferDirtyHint  
 *  
 *    Mark a buffer dirty for non-critical changes.  
 *  
 * This is essentially the same as MarkBufferDirty, except:  
 *  
 * 1. The caller does not write WAL; so if checksums are enabled, we may need  
 *      to write an XLOG_HINT WAL record to protect against torn pages.  
 * 2. The caller might have only share-lock instead of exclusive-lock on the  
 *      buffer's content lock.  
 * 3. This function does not guarantee that the buffer is always marked dirty  
 *      (due to a race condition), so it cannot be used for important changes.  
 */  
void  
MarkBufferDirtyHint(Buffer buffer, bool buffer_std)  
......  

仅仅当设置了wal_log_hints或初始化数据库initdb -k打开了checksum时,
它会调用XLogSaveBufferForHint来写WAL,当然不是所有的set hint都会写WAL,只有当checkpoint后,如果这个数据块是第一次被修改才需要写。

        if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT))  
        {  
...  
            MyPgXact->delayChkpt = delayChkpt = true;  
            lsn = XLogSaveBufferForHint(buffer, buffer_std);  
......  

什么情况下的set hint bits操作需要写WAL?

src/include/access/xlog.h  
#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)  
src/backend/access/transam/xlog.c  
/*  
 * Are checksums enabled for data pages?  
 */  
bool  
DataChecksumsEnabled(void)  
{  
    Assert(ControlFile != NULL);  
    return (ControlFile->data_checksum_version > 0);  
}  
......  
XLogSaveBufferForHint(Buffer buffer, bool buffer_std)@src/backend/access/transam/xlog.c  
/*  
 * Write a backup block if needed when we are setting a hint. Note that  
 * this may be called for a variety of page types, not just heaps.  
 *  
 * Callable while holding just share lock on the buffer content.  
 *  
 * We can't use the plain backup block mechanism since that relies on the  
 * Buffer being exclusively locked. Since some modifications (setting LSN, hint  
 * bits) are allowed in a sharelocked buffer that can lead to wal checksum  
 * failures. So instead we copy the page and insert the copied data as normal  
 * record data.  
 *  
 * We only need to do something if page has not yet been full page written in  
 * this checkpoint round. The LSN of the inserted wal record is returned if we  
 * had to write, InvalidXLogRecPtr otherwise.  
 *  
 * It is possible that multiple concurrent backends could attempt to write WAL  
 * records. In that case, multiple copies of the same block would be recorded  
 * in separate WAL records by different backends, though that is still OK from  
 * a correctness perspective.  // 可能写多次哦  
 */  
XLogRecPtr  
XLogSaveBufferForHint(Buffer buffer, bool buffer_std)  
{  

通过XLogCheckBuffer来判断是否需要写XLOG。

/*  
 * Determine whether the buffer referenced by an XLogRecData item has to  
 * be backed up, and if so fill a BkpBlock struct for it.  In any case  
 * save the buffer's LSN at *lsn.  
 */  
static bool  
XLogCheckBuffer(XLogRecData *rdata, bool holdsExclusiveLock,  
                                XLogRecPtr *lsn, BkpBlock *bkpb)  
{  

验证:

[root@digoal ~]# cat trc.stp   
global f_start[999999]  
  
probe process("/opt/pgsql/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c").call {   
   f_start[execname(), pid(), tid(), cpu()] = gettimeofday_ms()  
   # printf("%s <- time:%d, pp:%s, par:%s\n", thread_indent(-1), gettimeofday_ms(), pp(), $$parms$$)  
   printf("%s -> time:%d, pp:%s\n", thread_indent(1), f_start[execname(), pid(), tid(), cpu()], pp() )  
}  
  
  
  
probe process("/opt/pgsql/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c").return {   
  t=gettimeofday_ms()  
  a=execname()  
  b=cpu()  
  c=pid()  
  d=pp()  
  e=tid()  
  if (f_start[a,c,e,b]) {  
  printf("%s <- time:%d, pp:%s, par:%s\n", thread_indent(-1), t - f_start[a,c,e,b], d, $return$$)  
  # printf("%s <- time:%d, pp:%s\n", thread_indent(-1), t - f_start[a,c,e,b], d)  
  }  
}  

观察检查点后的没有发生full page write的数据块上的sethintbits操作:

postgres=# update t set id=4;  
UPDATE 2  
postgres=# checkpoint;  
CHECKPOINT  
postgres=# select * from t;  
 id   
----  
  4  
  4  
(2 rows)  

跟踪到XLogCheckBuffer的返回结果为true,因为这个块是在checkpoint后第一次被修改,所以还没有FPW,因此这次SETHINTBITs需要写XLOG。

     0 postgres(5497): -> time:1441457600685, pp:process("/opt/pgsql9.4.4/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c:2031").call  
    30 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c:2031").return, par:'\001'  

再次更新

postgres=# update t set id=5;  
UPDATE 2  

跟踪到XLogCheckBuffer返回FALSE,因为这个数据块在CHECKPOINT后,已经调用了FPW,所以这次SETHINTBIT不需要写XLOG。

     0 postgres(5497): -> time:1441457627431, pp:process("/opt/pgsql9.4.4/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c:2031").call  
    27 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c:2031").return, par:'\000'  
     0 postgres(5497): -> time:1441457627431, pp:process("/opt/pgsql9.4.4/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c:2031").call  
    20 postgres(5497): <- time:0, pp:process("/opt/pgsql9.4.4/bin/postgres").function("XLogCheckBuffer@/opt/soft_bak/postgresql-9.4.4/src/backend/access/transam/xlog.c:2031").return, par:'\000'  

因此我们可以认为,当发生SetHintBits时,并不是每次都需要写XLOG,仅仅当开启了wal_log_hints或者initdb时开启了checksum,并且这个被sethintbits的数据块是在checkpoint后第一次被修改(改 bit也算一种修改。),这种情况下才需要写full page到WAL。其他情况不会写WAL。
同时,还有一种可能性,因为SetHintBits是针对单条TUPLE的,所以当有并行的会话在对一个PAGE的多个TUPLE进行SetHintBits操作时,可能导致这个PAGE在多次checkpoint时被写多次到WAL。或者在2个checkpoint之间,多次被bgwriter刷到OS dirty page,可能造成多次OS IO。

用gdb可以跟踪到XLogSaveBufferForHint 调用recptr = XLogInsert(RM_XLOG_ID, XLOG_FPI, rdata)来写XLOG。

其他例子,当发生行级别锁冲突时,等待锁的会话在获得锁后,也会对修改的TUPLE设置HINT BITS。
例如两个update语句更新同一条记录时,后获得锁的事务会在先获得锁的事务提交后,对它所修改的记录设置hint bits。

/*  
 * UpdateXmaxHintBits - update tuple hint bits after xmax transaction ends  
 *  
 * This is called after we have waited for the XMAX transaction to terminate.  
 * If the transaction aborted, we guarantee the XMAX_INVALID hint bit will  
 * be set on exit.  If the transaction committed, we set the XMAX_COMMITTED  
 * hint bit if possible --- but beware that that may not yet be possible,  
 * if the transaction committed asynchronously.  
 *  
 * Note that if the transaction was a locker only, we set HEAP_XMAX_INVALID  
 * even if it commits.  
 *  
 * Hence callers should look only at XMAX_INVALID.  
 *  
 * Note this is not allowed for tuples whose xmax is a multixact.  
 */  
static void  
UpdateXmaxHintBits(HeapTupleHeader tuple, Buffer buffer, TransactionId xid)  
{  
        Assert(TransactionIdEquals(HeapTupleHeaderGetRawXmax(tuple), xid));  
        Assert(!(tuple->t_infomask & HEAP_XMAX_IS_MULTI));  
  
        if (!(tuple->t_infomask & (HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID)))  
        {  
                if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&  
                        TransactionIdDidCommit(xid))  
                        HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,  
                                                                 xid);  
                else  
                        HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,  
                                                                 InvalidTransactionId);  
        }  
}  

[参考]
https://wiki.postgresql.org/wiki/Hint_Bits
src/include/access/htup_details.h
src/backend/utils/time/tqual.c
src/backend/storage/buffer/bufmgr.c
src/include/access/xlog.h
src/backend/access/transam/xlog.c

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
手写SpringMVC实战,一切从Spring底层源码分析开始
Spring框架对于Java后端程序员来说再熟悉不过了,以前只知道它用的反射实现的,但了解之后才知道有很多巧妙的设计在里面。
1184 0
hbase源码系列(十五)终结篇&Scan续集-->如何查询出来下一个KeyValue
这是这个系列的最后一篇了,实在没精力写了,本来还想写一下hbck的,这个东西很常用,当hbase的Meta表出现错误的时候,它能够帮助我们进行修复,无奈看到3000多行的代码时,退却了,原谅我这点自私的想法吧。
3112 0
怎么设置阿里云服务器安全组?阿里云安全组规则详细解说
阿里云服务器安全组设置规则分享,阿里云服务器安全组如何放行端口设置教程
7438 0
Mysql中查询系统时间
  方法一 select current_date mysql&gt; select current_date as Systemtime; +------------+ | Systemtime | +------------+ | 2009-07-29 | +------------+     方法二 select now() mysql&gt; select n
790 0
一条SQL完成跨数据库实例Join查询
本文以MySQL及Redis为例,介绍如何通过一条SQL轻松完成跨数据库实例的查询。
3102 0
Microsoft Visual Studio与Firefly 一直提示加载项目,更新源码状态问题
        笔记本一开始安装的是vs2010,由于近期开发要用vs2008与vs2005于是今天又把2008、2005安装上了,但在打开项目的时候,先是提示加载项目文件,然后一直提示更新源码状态,很慢很慢的,之前只有vs2010的时候,打开是很快的,现在不管是用2008、2005、2010就没有一个快的,源码管理用的是firefly,有人知道为什么会出现这种情况吗?        
991 0
+关注
德哥
公益是一辈子的事, I&#39;m digoal, just do it.
2153
文章
245
问答
来源圈子
更多
阿里云数据库:帮用户承担一切数据库风险,给您何止是安心!支持关系型数据库:MySQL、SQL Server、PostgreSQL、PPAS(完美兼容Oracle)、自研PB级数据存储的分布式数据库Petadata、自研金融级云数据库OceanBase支持NoSQL数据库:MongoDB、Redis、Memcache更有褚霸、丁奇、德哥、彭立勋、玄惭、叶翔等顶尖数据库专家服务。
+ 订阅
文章排行榜
最热
最新
相关电子书
更多
文娱运维技术
立即下载
《SaaS模式云原生数据仓库应用场景实践》
立即下载
《看见新力量:二》电子书
立即下载