标签
PostgreSQL , online ddl , trigger , ddl 事务
背景
有张表的主键id是serial,但现在不够了,需要升级成bigserial,有什么优雅的方法吗?我看下来好像会锁表很久(因为数据量挺大)
如果直接alter table,由于数据类型从4字节改成了8字节,而tuple结构是在METADATA里面的,不是每行都有,所以DEFORM需要依赖METADATA,目前来说,这种操作需要rewrite table。
那么如何做到改bigserial,整个过程不堵塞dml呢?
有一张父表 bigtable,两个字段 id int primary key, x bigint,以及两张子表 child1 和 child2,数据量在一亿以上,通过脚本,可以改成 bigint 。整个过程不会锁住表,能正常CRUD。
alter sequence bigtable_id_seq as bigint;
alter table bigtable add column new_id bigint;
create unique index concurrently on bigtable (new_id);
create unique index concurrently on child1 (new_id);
create unique index concurrently on child2 (new_id);
create or replace function id_handler() returns trigger as $$
begin
new.new_id = new.id; -- id 是int4类型 , newid是int8类型
return new;
end;
$$ language plpgsql;
create trigger id_trigger before insert on bigtable for each row execute procedure id_handler();
create trigger id_trigger before insert on child1 for each row execute procedure id_handler();
create trigger id_trigger before insert on child2 for each row execute procedure id_handler();
-- 更新历史数据
update bigtable set new_id = id where 分段; -- 分解成多条
-- update bigtable set new_id = id where id between 1 and 10000 and new_id<>id;
-- update bigtable set new_id = id where id between 10001 and 20000 and new_id<>id;
-- ...
-- 历史数据更新完成后,切换ID字段
begin;
alter table bigtable
alter column id drop default,
alter column new_id set default nextval('bigtable_id_seq'::regclass);
create or replace function id_handler() returns trigger as $$
begin
new.id = new.new_id; -- 业务上不会直接去写ID的值。
return new;
end;
$$ language plpgsql;
end;
update pg_catalog.pg_attribute set attnotnull = true where attrelid in (select oid from pg_catalog.pg_class where relname in ('bigtable', 'child1', 'child2')) and attname = 'new_id';
alter table bigtable
drop constraint bigtable_pkey,
add constraint bigtable_pkey primary key using index bigtable_new_id_idx;
alter table child1
drop constraint child1_pkey,
add constraint child1_pkey primary key using index child1_new_id_idx;
alter table child2
drop constraint child2_pkey,
add constraint child2_pkey primary key using index child2_new_id_idx;
begin;
alter table bigtable rename id to old_id;
alter table bigtable rename new_id to id;
alter table bigtable alter column old_id drop not null;
drop trigger id_trigger on bigtable;
drop trigger id_trigger on child1;
drop trigger id_trigger on child2;
commit;
alter sequence bigtable_id_seq owned by bigtable.id;
alter table bigtable drop column old_id;
之前唯一卡住的是要把新的列设置成主键时,会自动设置 set not null,这一步会锁表,然后检查是否有 null 值;但通过之前的步骤,其实没有 null 值了,所以我就手工修改了 attributes 这张表。
update pg_catalog.pg_attribute set attnotnull = true where attrelid in (select oid from pg_catalog.pg_class where relname in ('bigtable', 'child1', 'child2')) and attname = 'new_id';
明天我前同事会在线上用 17亿数据测试一下效果,如果也能顺利完成的话,那就可以考虑做成一个通用的脚本了
注意:
直接修改pg_catalog.pg_attribute元数据,注意一下prepared statement,看看PS缓存是否会更新,直接改元数据的方式的情况下,可能约束检查不一定会实时传播给其他会话,比如业务上直接写NULL值。
因为前面已经设置了 new_id 的默认值,本身是不会出现 null 值的。除非用户在SQL中可能直接写null,否则这个值是不会为NULL的。