Continued from the previous part: https://developer.aliyun.com/article/1623005?spm=a2c6h.13148508.setting.18.49764f0eF83epA
How ReplicatedMergeTree Works
Data structure
[zk: localhost:2181(CONNECTED) 7] ls /clickhouse/tables/01/replicated_sales_5
[alter_partition_version, block_numbers, blocks, columns, leader_election, log, metadata, mutations, nonincrement_block_numbers, part_moves_shard, pinned_part_uuids, quorum, replicas, table_shared_id, temp, zero_copy_hdfs, zero_copy_s3]
[zk: localhost:2181(CONNECTED) 8]
Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. Reproductions must include the original source link and this notice. Original link: https://blog.csdn.net/w776341482/article/details/142374328
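Metadata:
metadata: table metadata, including the primary key, sampling expression, and partition key
columns: column names and their data types
replicas: the names of the replicas
Flags:
leader_election: the path used to elect the leader replica
blocks: block hash values (used to detect and skip duplicate inserts) together with the partition_id
max_insert_block_size: 1048576 rows (the setting that bounds how many rows go into one block)
block_numbers: the order of blocks within the same partition
quorum: the number of replicas that must confirm a write
Operations:
log: regular operation log entries, such as log-000000
mutations: DELETE and UPDATE operations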
Creating a new table (1)
Create the new table on the current machine:
CREATE TABLE a1(
    id String,
    price Float64,
    create_time DateTime
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/a1', 'h121.wzk.icu')
PARTITION BY toYYYYMM(create_time)
ORDER BY id;
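- Initialize all the ZooKeeper nodes under zk_path
- Register this replica instance, h121.wzk.icu, under the replicas node
- Start a watch task that monitors the log node
- Join the replica election to choose the leader replica. A replica takes part by inserting a child node under leader_election, and the first replica whose insert succeeds becomes the leader
The result is shown in the figure below: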
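Creating a new table (2)
Create the second replica instance (note that this time we need to connect to the h122 node):
clickhouse-client -m --host h122.wzk.icu --port 9001 --user default --password clickhouse@wzk.icu
Run the corresponding SQL:
CREATE TABLE a1(
    id String,
    price Float64,
    create_time DateTime
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/a1', 'h122.wzk.icu')
PARTITION BY toYYYYMM(create_time)
ORDER BY id;
The result is shown in the figure below:
This replica also joins the election at this point. Because h121.wzk.icu registered first, it is the leader replica.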
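The election rule described earlier (the first replica to insert a child under leader_election wins) can be modeled with a small in-memory sketch. This is only a toy illustration of that rule, not ClickHouse's implementation; the class `LeaderElectionNode` and its methods are hypothetical names introduced here.

```python
# Toy in-memory model of "the first child inserted under leader_election wins".
# LeaderElectionNode is a hypothetical stand-in, not a ClickHouse or ZooKeeper API.


class LeaderElectionNode:
    """Stands in for the leader_election znode with ordered children."""

    def __init__(self):
        self._children = []  # list of (sequence_number, replica_name)
        self._next_seq = 0

    def register(self, replica_name):
        """Insert a child for the replica and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._children.append((seq, replica_name))
        return seq

    def leader(self):
        """Return the replica with the smallest sequence number, if any."""
        if not self._children:
            return None
        return min(self._children)[1]


election = LeaderElectionNode()
election.register("h121.wzk.icu")  # first CREATE TABLE, run on h121
election.register("h122.wzk.icu")  # second CREATE TABLE, run on h122
print(election.leader())
```

In this model the replica that registers second never displaces the first, which matches the outcome reported above for h121.wzk.icu.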
Inserting data (1)
Now insert a row on h121.wzk.icu:
insert into table a1 values('A001',100,'2024-08-20 08:00:00');
The result of running the statement is:
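After the insert statement runs, the partition directory is first written locally, and then the block_id for that partition is written to the blocks node, as the following output shows: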
[zk: localhost:2181(CONNECTED) 6] ls /clickhouse/tables/01/a1/blocks
[202408_16261221490105862188_1058020630609096934]
[zk: localhost:2181(CONNECTED) 7]
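To make the role of the blocks node more concrete, here is a rough sketch of how a block_id of the form partition_id_hash_hash could be built, split apart, and used to skip a repeated insert. It rests on several assumptions: `hashlib.md5` is only a stand-in for ClickHouse's own 128-bit hash, and `make_block_id`, `split_block_id`, and `BlocksNode` are hypothetical names, not real APIs.

```python
import hashlib


def make_block_id(partition_id, payload):
    """Build '<partition_id>_<hash_hi>_<hash_lo>' from the block's bytes.

    hashlib.md5 is a stand-in here; ClickHouse computes its own 128-bit hash.
    """
    digest = hashlib.md5(payload).digest()
    hi = int.from_bytes(digest[:8], "big")
    lo = int.from_bytes(digest[8:], "big")
    return f"{partition_id}_{hi}_{lo}"


def split_block_id(block_id):
    """Split a block_id into (partition_id, hash_hi, hash_lo)."""
    partition_id, hi, lo = block_id.rsplit("_", 2)
    return partition_id, int(hi), int(lo)


class BlocksNode:
    """Hypothetical in-memory stand-in for the blocks znode."""

    def __init__(self):
        self._seen = set()

    def try_insert(self, block_id):
        """Return True if the block is new, False if it is a duplicate."""
        if block_id in self._seen:
            return False
        self._seen.add(block_id)
        return True


blocks = BlocksNode()
row = b"A001,100,2024-08-20 08:00:00"
block_id = make_block_id("202408", row)
print(blocks.try_insert(block_id))  # first insert of this block
print(blocks.try_insert(block_id))  # identical block inserted again
print(split_block_id("202408_16261221490105862188_1058020630609096934"))
```

Because the identifier depends only on the partition and the data, inserting the same block twice produces the same block_id, which is what lets the second attempt be recognized and skipped.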
Viewing the log
Next, the h121.wzk.icu replica pushes an operation log entry to the log node:
[zk: localhost:2181(CONNECTED) 7] ls /clickhouse/tables/01/a1/log
[log-0000000000]
[zk: localhost:2181(CONNECTED) 8]
Insert one more row, then inspect the log:
ls /clickhouse/tables/01/a1/log
get /clickhouse/tables/01/a1/log/log-0000000000
get /clickhouse/tables/01/a1/log/log-0000000001
The output is as follows:
[zk: localhost:2181(CONNECTED) 14] ls /clickhouse/tables/01/a1/log
[log-0000000000, log-0000000001]
[zk: localhost:2181(CONNECTED) 13] get /clickhouse/tables/01/a1/log/log-0000000000
format version: 4
create_time: 2024-08-01 17:10:35
source replica: h121.wzk.icu
block_id: 202408_16261221490105862188_1058020630609096934
get
202408_0_0_0
part_type: Compact
[zk: localhost:2181(CONNECTED) 16] get /clickhouse/tables/01/a1/log/log-0000000001
format version: 4
create_time: 2024-08-01 17:16:37
source replica: h121.wzk.icu
block_id: 202408_3260633639629896920_11326802927295833243
get
202408_1_1_0
part_type: Compact
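The log entries above are plain text, so they are easy to pick apart. The following sketch parses one entry into a dictionary and splits the part name into its components. It assumes the entry layout shown above (key: value lines, plus a bare get line followed by the part name) and the four-field part name partition_min_max_level; `parse_log_entry` and `parse_part_name` are hypothetical helper names.

```python
def parse_part_name(part_name):
    """Split a four-field part name such as '202408_0_0_0'.

    Only the form shown in the log output above is handled.
    """
    partition_id, min_block, max_block, level = part_name.rsplit("_", 3)
    return {
        "partition_id": partition_id,
        "min_block": int(min_block),
        "max_block": int(max_block),
        "level": int(level),
    }


def parse_log_entry(text):
    """Parse a 'get' log entry as printed by zkCli into a dict."""
    entry = {}
    lines = iter(line.strip() for line in text.strip().splitlines())
    for line in lines:
        if line == "get":
            # The operation type sits on its own line; the part name follows.
            entry["type"] = "get"
            entry["part_name"] = next(lines, None)
        elif ": " in line:
            key, value = line.split(": ", 1)
            entry[key] = value
    return entry


raw = """
format version: 4
create_time: 2024-08-01 17:10:35
source replica: h121.wzk.icu
block_id: 202408_16261221490105862188_1058020630609096934
get
202408_0_0_0
part_type: Compact
"""

entry = parse_log_entry(raw)
print(entry["source replica"], entry["type"], entry["part_name"])
print(parse_part_name(entry["part_name"]))
```

Read this way, the first entry tells other replicas to fetch part 202408_0_0_0, which was produced on h121.wzk.icu.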
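Pulling the log
Next, the second replica pulls the log entries:
The h122.wzk.icu node continuously watches the /log node for changes. Once h121.wzk.icu has pushed /log/log-0000000000 and /log/log-0000000001, h122.wzk.icu triggers its log-pull task and updates its log_pointer.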
[zk: localhost:2181(CONNECTED) 18] ls /clickhouse/tables/01/a1/replicas
[h121.wzk.icu, h122.wzk.icu]
[zk: localhost:2181(CONNECTED) 19] ls /clickhouse/tables/01/a1/replicas/h122.wzk.icu
[columns, flags, host, is_active, is_lost, log_pointer, max_processed_insert_time, metadata, metadata_version, min_unprocessed_insert_time, mutation_pointer, parts, queue]
[zk: localhost:2181(CONNECTED) 20] ls /clickhouse/tables/01/a1/replicas/h122.wzk.icu/log_pointer
[]
[zk: localhost:2181(CONNECTED) 21] get /clickhouse/tables/01/a1/replicas/h122.wzk.icu/log_pointer
2
[zk: localhost:2181(CONNECTED) 22] get /clickhouse/tables/01/a1/replicas/h121.wzk.icu/log_pointer
2
[zk: localhost:2181(CONNECTED) 23]
The result is shown in the figure below:
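The pull step can be pictured as each replica keeping a cursor into the shared log. The sketch below is a simplified in-memory model under that assumption: the shared log is a plain Python list, log_pointer is an index into it, and `Replica` is a hypothetical class. The real mechanism works through ZooKeeper watches and is more involved.

```python
# Simplified model of log pulling: log_pointer is treated as an index
# into an in-memory list. Replica is a hypothetical class for illustration.


class Replica:
    def __init__(self, name):
        self.name = name
        self.log_pointer = 0  # index of the next log entry to pull
        self.queue = []       # entries pulled but not yet executed

    def pull(self, shared_log):
        """Copy unseen entries into the local queue and advance the pointer."""
        new_entries = shared_log[self.log_pointer:]
        self.queue.extend(new_entries)
        self.log_pointer += len(new_entries)
        return new_entries


shared_log = ["log-0000000000", "log-0000000001"]  # pushed by h121.wzk.icu
h122 = Replica("h122.wzk.icu")

print(h122.pull(shared_log))  # both entries are new to h122
print(h122.log_pointer)       # pointer now sits past the last entry
print(h122.pull(shared_log))  # nothing new on a second pull
```

After both entries have been pulled, the pointer in this model is 2, the same value that zkCli reports for log_pointer on both replicas above.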