PostgreSQL 在线修改数据类型 - online ddl 方法之一-阿里云开发者社区

PostgreSQL 在线修改数据类型 - online ddl 方法之一

2019-04-27 4017

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

RDS SQL Server Serverless，2-4RCU 50GB 3个月

云数据库 RDS MySQL，集群系列 2核4GB

云原生数据库 PolarDB 分布式版，标准版 2核8GB

简介： 标签PostgreSQL , online ddl , trigger , ddl 事务背景有张表的主键id是serial，但现在不够了，需要升级成bigserial，有什么优雅的方法吗？我看下来好像会锁表很久（因为数据量挺大）如果直接alter table，由于数据类型从4字节改成了8字节，而tuple结构是在METADATA里面的，不是每行都有，所以DEFORM需要依赖METADATA，目前来说，这种操作需要rewrite table。

背景

有张表的主键id是serial，但现在不够了，需要升级成bigserial，有什么优雅的方法吗？我看下来好像会锁表很久（因为数据量挺大）

如果直接alter table，由于数据类型从4字节改成了8字节，而tuple结构是在METADATA里面的，不是每行都有，所以DEFORM需要依赖METADATA，目前来说，这种操作需要rewrite table。

那么如何做到改bigserial，整个过程不堵塞dml呢？

有一张父表 bigtable，两个字段 id int primary key, x bigint，以及两张子表 child1 和 child2，数据量在一亿以上，通过脚本，可以改成 bigint 。整个过程不会锁住表，能正常CRUD。

alter sequence bigtable_id_seq as bigint;  
  
alter table bigtable add column new_id bigint;  
create unique index concurrently on bigtable (new_id);  
create unique index concurrently on child1 (new_id);  
create unique index concurrently on child2 (new_id);  
  
create or replace function id_handler() returns trigger as $$  
begin  
  new.new_id = new.id;  -- id 是int4类型 , newid是int8类型   
  return new;  
end;  
$$ language plpgsql;  
create trigger id_trigger before insert on bigtable for each row execute procedure id_handler();  
create trigger id_trigger before insert on child1 for each row execute procedure id_handler();  
create trigger id_trigger before insert on child2 for each row execute procedure id_handler();  
  
-- 更新历史数据  
update bigtable set new_id = id where 分段; -- 分解成多条  
-- update bigtable set new_id = id where id between 1 and 10000 and new_id<>id;  
-- update bigtable set new_id = id where id between 10001 and 20000 and new_id<>id;  
-- ...  
  
-- 历史数据更新完成后，切换ID字段  
  
begin;  
alter table bigtable  
  alter column id drop default,  
  alter column new_id set default nextval('bigtable_id_seq'::regclass);  
  
create or replace function id_handler() returns trigger as $$  
begin  
  new.id = new.new_id;   -- 业务上不会直接去写ID的值。  
  return new;  
end;  
$$ language plpgsql;  
end;  
  
update pg_catalog.pg_attribute set attnotnull = true where attrelid in (select oid from pg_catalog.pg_class where relname in ('bigtable', 'child1', 'child2')) and attname = 'new_id';  
  
alter table bigtable  
  drop constraint bigtable_pkey,  
  add constraint bigtable_pkey primary key using index bigtable_new_id_idx;  
alter table child1  
  drop constraint child1_pkey,  
  add constraint child1_pkey primary key using index child1_new_id_idx;  
alter table child2  
  drop constraint child2_pkey,  
  add constraint child2_pkey primary key using index child2_new_id_idx;  
  
begin;  
alter table bigtable rename id to old_id;  
alter table bigtable rename new_id to id;  
alter table bigtable alter column old_id drop not null;  
drop trigger id_trigger on bigtable;  
drop trigger id_trigger on child1;  
drop trigger id_trigger on child2;  
commit;  
  
alter sequence bigtable_id_seq owned by bigtable.id;  
  
alter table bigtable drop column old_id;

之前唯一卡住的是要把新的列设置成主键时，会自动设置 set not null，这一步会锁表，然后检查是否有 null 值；但通过之前的步骤，其实没有 null 值了，所以我就手工修改了 attributes 这张表。

update pg_catalog.pg_attribute set attnotnull = true where attrelid in (select oid from pg_catalog.pg_class where relname in ('bigtable', 'child1', 'child2')) and attname = 'new_id';

明天我前同事会在线上用 17亿数据测试一下效果，如果也能顺利完成的话，那就可以考虑做成一个通用的脚本了

注意：

直接修改pg_catalog.pg_attribute元数据，注意一下prepared statement，看看PS缓存是否会更新，直接改元数据的方式的情况下，可能约束检查不一定会实时传播给其他会话，比如业务上直接写NULL值。

因为前面已经设置了 new_id 的默认值，本身是不会出现 null 值的。除非用户在SQL中可能直接写null，否则这个值是不会为NULL的。

PostgreSQL 在线修改数据类型 - online ddl 方法之一

标签

背景

关系型数据库

热门文章

最新文章

相关产品

相关课程

相关电子书

相关实验场景

推荐镜像