对PostgreSQL xmax的理解

本文涉及的产品
云原生数据库 PolarDB MySQL 版,通用型 2核8GB 50GB
云原生数据库 PolarDB PostgreSQL 版,标准版 2核4GB 50GB
简介:

xmax

The identity (transaction ID) of the deleting transaction, or zero for an undeleted row version. It is possible for this column to be nonzero in a visible row version. That usually indicates that the deleting transaction hasn't committed yet, or that an attempted deletion was rolled back.

http://www.postgresql.org/docs/9.1/static/ddl-system-columns.html

作一个实验:

我数据库关闭前,最后一条transaction的id是 1874。

我开一个终端A,此时终端A的当前transactionId为 1875。

复制代码
[pgsql@localhost bin]$ ./psql
psql (9.1.2)
Type "help" for help.

pgsql=# begin;
BEGIN

pgsql=# select xmin,xmax,* from tab01;
 xmin | xmax | id | cd 
------+------+----+----
 1866 |    0 |  3 | 3
 1867 |    0 |  4 | 4
 1868 |    0 |  5 | 5
 1870 |    0 |  6 | 6
 1872 |    0 |  7 | 7
 1873 |    0 |  8 | 8
 1874 |    0 |  9 | 9
(7 rows)

pgsql=# 
复制代码

我再开一个终端B,此时,终端B的transactionId为:1876。

复制代码
[pgsql@localhost bin]$ ./psql
psql (9.1.2)
Type "help" for help.

pgsql=# begin;
BEGIN
pgsql=# select xmin,xmax,* from tab01;
 xmin | xmax | id | cd 
------+------+----+----
 1866 |    0 |  3 | 3
 1867 |    0 |  4 | 4
 1868 |    0 |  5 | 5
 1870 |    0 |  6 | 6
 1872 |    0 |  7 | 7
 1873 |    0 |  8 | 8
 1874 |    0 |  9 | 9
(7 rows)

pgsql=# 
复制代码

回到终端A,执行 delete 操作:

pgsql=# delete from tab01 where id=9;
DELETE 1
pgsql=# 

此时,在终端A中,已经看不到删除后的数据:

复制代码
pgsql=# select xmin,xmax,* from tab01;
 xmin | xmax | id | cd 
------+------+----+----
 1866 |    0 |  3 | 3
 1867 |    0 |  4 | 4
 1868 |    0 |  5 | 5
 1870 |    0 |  6 | 6
 1872 |    0 |  7 | 7
 1873 |    0 |  8 | 8
(6 rows)

pgsql=# 
复制代码

此时,由于终端A尚未提交,所以,可以在终端B中看到如下的情形:

复制代码
pgsql=# select xmin,xmax,* from tab01;
 xmin | xmax | id | cd 
------+------+----+----
 1866 |    0 |  3 | 3
 1867 |    0 |  4 | 4
 1868 |    0 |  5 | 5
 1870 |    0 |  6 | 6
 1872 |    0 |  7 | 7
 1873 |    0 |  8 | 8
 1874 | 1875 |  9 | 9
(7 rows)

pgsql=# 
复制代码

也就是说,id为9的那条记录,其xmax为1875,表明其为 transactionid为 1875的事务所删除。

回到终端A,进行提交:

复制代码
pgsql=# commit;
COMMIT
pgsql=# select xmin,xmax,* from tab01;
 xmin | xmax | id | cd 
------+------+----+----
 1866 |    0 |  3 | 3
 1867 |    0 |  4 | 4
 1868 |    0 |  5 | 5
 1870 |    0 |  6 | 6
 1872 |    0 |  7 | 7
 1873 |    0 |  8 | 8
(6 rows)

pgsql=# 
复制代码

再回到终端B,查看:

复制代码
pgsql=# select xmin,xmax,* from tab01;
 xmin | xmax | id | cd 
------+------+----+----
 1866 |    0 |  3 | 3
 1867 |    0 |  4 | 4
 1868 |    0 |  5 | 5
 1870 |    0 |  6 | 6
 1872 |    0 |  7 | 7
 1873 |    0 |  8 | 8
(6 rows)

pgsql=# 
复制代码

让我来再进一步,看一看:

重新开两个终端:

终端A和终端B。

在终端A中:

复制代码
[pgsql@localhost bin]$ ./psql
psql (9.1.2)
Type "help" for help.

pgsql=# select xmin,xmax,cmin,cmax,* from tab01;
 xmin | xmax | cmin | cmax | id | cd 
------+------+------+------+----+----
 1878 |    0 |    0 |    0 |  1 | 1
 1879 |    0 |    0 |    0 |  2 | 2
(2 rows)

pgsql=# \q
复制代码

可以看到两条由不同的事务提交所形成的记录。

然后再次使用psql: 经过update 后,自己所看到的是 xmin的变化,这时尚未提交,别的终端看到就不一样了。

终端A:

复制代码
[pgsql@localhost bin]$ ./psql
psql (9.1.2)
Type "help" for help.

pgsql=# begin;
BEGIN
pgsql=# update tab01 set id=3 where cd='2';
UPDATE 1
pgsql=# select xmin,xmax,cmin,cmax,* from tab01;
 xmin | xmax | cmin | cmax | id | cd 
------+------+------+------+----+----
 1878 |    0 |    0 |    0 |  1 | 1
 1880 |    0 |    0 |    0 |  3 | 2
(2 rows)

pgsql=# 
复制代码

此时,进入终端B:

复制代码
[pgsql@localhost bin]$ ./psql
psql (9.1.2)
Type "help" for help.

pgsql=# select xmin,xmax,cmin,cmax,* from tab01;
 xmin | xmax | cmin | cmax | id | cd 
------+------+------+------+----+----
 1878 |    0 |    0 |    0 |  1 | 1
 1879 | 1880 |    0 |    0 |  2 | 2
(2 rows)

pgsql=# 
复制代码

我推测,update的时候,自身事务所看到的是内存中的影像。同时它也已经提交到了物理文件上。而别的事务是从物理文件来读取的。

在A终端提交以后:

pgsql=# commit;
COMMIT
pgsql=# 

B终端所看到的:

复制代码
pgsql=# select xmin,xmax,cmin,cmax,* from tab01;
 xmin | xmax | cmin | cmax | id | cd 
------+------+------+------+----+----
 1878 |    0 |    0 |    0 |  1 | 1
 1880 |    0 |    0 |    0 |  3 | 2
(2 rows)

pgsql=# 
复制代码

 继续从代码上进行分析:

 

复制代码
/*
 *    heap_update - replace a tuple
 *
 * NB: do not call this directly unless you are prepared to deal with
 * concurrent-update conditions.  Use simple_heap_update instead.
 *
 *    relation - table to be modified (caller must hold suitable lock)
 *    otid - TID of old tuple to be replaced
 *    newtup - newly constructed tuple data to store
 *    ctid - output parameter, used only for failure case (see below)
 *    update_xmax - output parameter, used only for failure case (see below)
 *    cid - update command ID (used for visibility test, and stored into
 *        cmax/cmin if successful)
 *    crosscheck - if not InvalidSnapshot, also check old tuple against this
 *    wait - true if should wait for any conflicting update to commit/abort
 *
 * Normal, successful return value is HeapTupleMayBeUpdated, which
 * actually means we *did* update it.  Failure return codes are
 * HeapTupleSelfUpdated, HeapTupleUpdated, or HeapTupleBeingUpdated
 * (the last only possible if wait == false).
 *
 * On success, the header fields of *newtup are updated to match the new
 * stored tuple; in particular, newtup->t_self is set to the TID where the
 * new tuple was inserted, and its HEAP_ONLY_TUPLE flag is set iff a HOT
 * update was done.  However, any TOAST changes in the new tuple's
 * data are not reflected into *newtup.
 *
 * In the failure cases, the routine returns the tuple's t_ctid and t_xmax.
 * If t_ctid is the same as otid, the tuple was deleted; if different, the
 * tuple was updated, and t_ctid is the location of the replacement tuple.
 * (t_xmax is needed to verify that the replacement tuple matches.)
 */
HTSU_Result
heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
            ItemPointer ctid, TransactionId *update_xmax,
            CommandId cid, Snapshot crosscheck, bool wait)
{
    HTSU_Result result;
    TransactionId xid = GetCurrentTransactionId();
    Bitmapset  *hot_attrs;
    ItemId        lp;
    HeapTupleData oldtup;
    HeapTuple    heaptup;
    Page        page;
    Buffer        buffer,
                newbuf;
    bool        need_toast,
                already_marked;
    Size        newtupsize,
                pagefree;
    bool        have_tuple_lock = false;
    bool        iscombo;
    bool        use_hot_update = false;
    bool        all_visible_cleared = false;
    bool        all_visible_cleared_new = false;

    ...

    ////////////////First Phase marked by gaojian

    newtup->t_data->t_infomask &= ~(HEAP_XACT_MASK);
    newtup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
    newtup->t_data->t_infomask |= (HEAP_XMAX_INVALID | HEAP_UPDATED);
    HeapTupleHeaderSetXmin(newtup->t_data, xid);
    HeapTupleHeaderSetCmin(newtup->t_data, cid);
    HeapTupleHeaderSetXmax(newtup->t_data, 0);    /* for cleanliness */
    newtup->t_tableOid = RelationGetRelid(relation);

    ...
if (!already_marked) { /* Clear obsolete visibility flags ... */ oldtup.t_data->t_infomask &= ~(HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID | HEAP_XMAX_IS_MULTI | HEAP_IS_LOCKED | HEAP_MOVED); /* ... and store info about transaction updating this tuple */ ///HeapTupleHeaderSetXmax(oldtup.t_data, xid); /////>>>>>added by gaojian for testing. ////xid = (TransactionId)8888; fprintf(stderr,"x-----1,xid is :%d \n",(int)xid); HeapTupleHeaderSetXmax(oldtup.t_data, xid); HeapTupleHeaderSetCmax(oldtup.t_data, cid, iscombo); sleep(60); } ... return HeapTupleMayBeUpdated; }
复制代码

 

可以看到,第一段的 :

复制代码
    newtup->t_data->t_infomask &= ~(HEAP_XACT_MASK);
    newtup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
    newtup->t_data->t_infomask |= (HEAP_XMAX_INVALID | HEAP_UPDATED);
    HeapTupleHeaderSetXmin(newtup->t_data, xid);
    HeapTupleHeaderSetCmin(newtup->t_data, cid);
    HeapTupleHeaderSetXmax(newtup->t_data, 0);    /* for cleanliness */
    newtup->t_tableOid = RelationGetRelid(relation);
复制代码

写的很明白, HeapTupleHeaderSetXmin(newtup->t_data, xid);

而第二段的 :

复制代码
    if (!already_marked)
    {
        /* Clear obsolete visibility flags ... */
        oldtup.t_data->t_infomask &= ~(HEAP_XMAX_COMMITTED |
                                       HEAP_XMAX_INVALID |
                                       HEAP_XMAX_IS_MULTI |
                                       HEAP_IS_LOCKED |
                                       HEAP_MOVED);
        /* ... and store info about transaction updating this tuple */

        HeapTupleHeaderSetXmax(oldtup.t_data, xid);
        HeapTupleHeaderSetCmax(oldtup.t_data, cid, iscombo);
    }
复制代码

然后,我再来验证一下,加点调试代码,这样,我执行sql时会出错:

复制代码
    if (!already_marked)
    {
        /* Clear obsolete visibility flags ... */
        oldtup.t_data->t_infomask &= ~(HEAP_XMAX_COMMITTED |
                                       HEAP_XMAX_INVALID |
                                       HEAP_XMAX_IS_MULTI |
                                       HEAP_IS_LOCKED |
                                       HEAP_MOVED);
        /* ... and store info about transaction updating this tuple */

        ///HeapTupleHeaderSetXmax(oldtup.t_data, xid);
        /////>>>>>added by gaojian for testing.
        ////xid = (TransactionId)8888;

        fprintf(stderr,"x-----1,xid is :%d \n",(int)xid);

        HeapTupleHeaderSetXmax(oldtup.t_data, xid);
        HeapTupleHeaderSetCmax(oldtup.t_data, cid, iscombo);

    }
复制代码

执行结果:

复制代码
pgsql=# begin;
BEGIN
pgsql=# update tab01 set id=2 where cd = '2';
ERROR:  could not access status of transaction 8888
DETAIL:  Could not read from file "pg_subtrans/0000" at offset 32768: Success.
pgsql=# \q
[pgsql@loca
复制代码

再次登陆进来看看,哈,这证明我所看到的代码部分,就是实现写入xmax的部分。

总结来说,就是,在update(可能delete也是),用向oldtup写入transaction id的方式,来记录哪个transaction改动了记录。

复制代码
[pgsql@localhost bin]$ ./psql
psql (9.1.2)
Type "help" for help.

pgsql=# select xmin,xmax,cmin,cmax, * from tab01;
 xmin | xmax | cmin | cmax |    id     | cd 
------+------+------+------+-----------+----
 1878 |    0 |    0 |    0 |         1 | 1
 1884 | 8888 |    0 |    0 | 999888777 | 2
 1885 |    0 |    0 |    0 |         2 | 2
(3 rows)

pgsql=# 
复制代码
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍如何基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
测试技术 开发工具 iOS开发
iOS自动化测试方案(三):WDA+iOS自动化测试解决方案
这篇文章是iOS自动化测试方案的第三部分,介绍了在没有MacOS系统条件下,如何使用WDA(WebDriverAgent)结合Python客户端库facebook-wda和tidevice工具,在Windows系统上实现iOS应用的自动化测试,包括环境准备、问题解决和扩展应用的详细步骤。
1786 1
iOS自动化测试方案(三):WDA+iOS自动化测试解决方案
|
11月前
|
消息中间件 存储 Prometheus
Kafka集群如何配置高可用性
Kafka集群如何配置高可用性
305 1
|
12月前
|
机器学习/深度学习 人工智能 架构师
Python学习圣经:从0到1,精通Python使用
尼恩架构团队的大模型《LLM大模型学习圣经》是一个系统化的学习系列,初步规划包括以下内容: 1. **《Python学习圣经:从0到1精通Python,打好AI基础》** 2. **《LLM大模型学习圣经:从0到1吃透Transformer技术底座》**
Python学习圣经:从0到1,精通Python使用
|
SQL 容灾 关系型数据库
[版本更新] PolarDB-X V2.4 列存引擎开源正式发布
[版本更新] PolarDB-X V2.4 列存引擎开源正式发布!
[版本更新] PolarDB-X V2.4 列存引擎开源正式发布
|
监控 安全 项目管理
员工上网行为监控丨管理员工上网要这样做
在互联网时代,有效管理员工上网行为至关重要。WorkWin监控系统支持移动部署和全球云监控,实现权限精细控制,确保工作安全高效。实时监控与时间统计功能优化资源利用,远程控制提升管理效率。Time Doctor通过视频截图和键盘活动分析员工效率,可与项目管理工具集成。ActivTrak提供行为分析报告,适合团队监控。这些工具助力企业保障信息安全,提高工作效率。
279 2
|
缓存 前端开发 JavaScript
React 必学SSR框架——next.js
​ 首先我们就回顾一下,我们到底是怎么告别了使用 php/jsp 做服务器端渲染,进入前后端分离的客户端渲染时代,又为什么重新回到了服务端渲染。
2928 0
|
JavaScript
Vue 学习记录,从难受到真香
Vue 学习记录,从难受到真香
83 2
|
安全 网络安全 数据安全/隐私保护
数据流通价值显现!瓴羊携手上海社科院发布业内首份数据要素流通研究报告
数据流通价值显现!瓴羊携手上海社科院发布业内首份数据要素流通研究报告
149 0
|
存储 NoSQL 关系型数据库
PostgreSQL中的变长数据结构Varlena
在本文是PostgreSQL学习系列---基础数据类型的第二篇,对PG中的Varlena类型进行了介绍,内容还会随着学习进行不断补充~
1781 0