HBase snapshot source code analysis


On-disk layout of a snapshot operation:

/hbase/.snapshots
       /.tmp                <---- working directory
       /[snapshot name]     <----- completed snapshot

Layout once a snapshot has completed:

     /hbase/.snapshots/[snapshot name]
                .snapshotinfo          <--- Description of the snapshot
                .tableinfo             <--- Copy of the tableinfo
                /.logs
                    /[server_name]
                        /... [log files]
                     ...
                /[region name]           <---- All the region's information
                    .regioninfo              <---- Copy of the HRegionInfo
                    /[column family name]
                        /[hfile name]     <--- name of the hfile in the real region
                        ...
                    ...
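
The layout above can be expressed as a handful of path helpers. The following is a minimal sketch built on `java.nio.file` rather than on HBase's own filesystem classes. The class `SnapshotLayoutSketch`, its method names, and the assumption that each in-progress snapshot gets its own subdirectory under `.tmp` are illustrative and are not taken from the HBase source.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper mirroring the directory layout shown above. */
class SnapshotLayoutSketch {
    static final String SNAPSHOT_DIR = ".snapshots";
    static final String TMP_DIR = ".tmp";

    /** Directory of a completed snapshot: [root]/.snapshots/[snapshot name]. */
    static Path completedDir(Path root, String snapshotName) {
        return root.resolve(SNAPSHOT_DIR).resolve(snapshotName);
    }

    /** Working directory (assumed here: one subdirectory per snapshot under .tmp). */
    static Path workingDir(Path root, String snapshotName) {
        return root.resolve(SNAPSHOT_DIR).resolve(TMP_DIR).resolve(snapshotName);
    }

    /** Location of the entry for one HFile inside a completed snapshot. */
    static Path hfileReference(Path root, String snapshotName, String region,
                               String family, String hfile) {
        return completedDir(root, snapshotName).resolve(region).resolve(family).resolve(hfile);
    }

    public static void main(String[] args) {
        Path root = Paths.get("/hbase");
        System.out.println(workingDir(root, "snap1"));
        System.out.println(completedDir(root, "snap1"));
        System.out.println(hfileReference(root, "snap1", "region-a", "cf", "hfile-1"));
    }
}
```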

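Basic steps of a snapshot:

1. Take a lock before the snapshot runs, so that nothing can be deleted or added in the meantime;

2. Create the designated directory on HDFS and write the relevant information into it;

3. Flush the data in the memstore to HFiles;

4. Create reference pointers to the HFiles.

The overall code flow is as follows.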

HBaseAdmin initiates the snapshot:

    public void snapshot(final String snapshotName, final TableName tableName, SnapshotDescription.Type type)
            throws IOException, SnapshotCreationException, IllegalArgumentException {
        SnapshotDescription.Builder builder = SnapshotDescription.newBuilder();
        builder.setTable(tableName.getNameAsString());
        builder.setName(snapshotName);
        builder.setType(type);
        snapshot(builder.build());
    }

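Take a snapshot and wait for the server to complete it (blocking). Only one snapshot should be taken at a time on an HBase instance, otherwise the results may be undefined (you can tell multiple HBase clusters to snapshot at the same time, but a single cluster should run only one at a time).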

    public void snapshot(SnapshotDescription snapshot) throws IOException, SnapshotCreationException, IllegalArgumentException {
        // actually take the snapshot
        SnapshotResponse response = takeSnapshotAsync(snapshot);

MasterRpcServices: triggers the snapshot asynchronously:

    master.snapshotManager.takeSnapshot(snapshot);

SnapshotManager class: how the snapshot is taken depends on whether the table is disabled or enabled:

    if (assignmentMgr.getTableStateManager().isTableState(snapshotTable, ZooKeeperProtos.Table.State.ENABLED)) {
        LOG.debug("Table enabled, starting distributed snapshot.");
        snapshotEnabledTable(snapshot);
        LOG.debug("Started snapshot: " + ClientSnapshotDescriptionUtils.toString(snapshot));
    }
    // For disabled table, snapshot is created by the master
    else if (assignmentMgr.getTableStateManager().isTableState(snapshotTable, ZooKeeperProtos.Table.State.DISABLED)) {
        LOG.debug("Table is disabled, running snapshot entirely on master.");
        snapshotDisabledTable(snapshot);
        LOG.debug("Started snapshot: " + ClientSnapshotDescriptionUtils.toString(snapshot));
    }
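
The branch on table state can be illustrated without any HBase classes. The sketch below uses a hypothetical `TableState` enum and returns a label for the chosen path. The excerpt above does not show what happens for a table that is neither enabled nor disabled, so rejecting that case with an exception is an assumption of this sketch.

```java
/** Hypothetical stand-in for the dispatch shown above; none of these types come from HBase. */
class SnapshotDispatchSketch {
    enum TableState { ENABLED, DISABLED, ENABLING, DISABLING }

    /** Returns a label for the chosen path so the decision is easy to inspect. */
    static String chooseSnapshotPath(TableState state) {
        switch (state) {
            case ENABLED:
                // regions are online, so the region servers take part
                return "distributed";
            case DISABLED:
                // nothing is online, so the master can do all the work itself
                return "master-only";
            default:
                // assumption: a table in transition is rejected
                throw new IllegalStateException(
                    "Table is neither fully enabled nor fully disabled: " + state);
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseSnapshotPath(TableState.ENABLED));
        System.out.println(chooseSnapshotPath(TableState.DISABLED));
    }
}
```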

Taking a snapshot of a table in the enabled state:

    private synchronized void snapshotEnabledTable(SnapshotDescription snapshot) throws HBaseSnapshotException {
        // setup the snapshot (preparation work)
        prepareToTakeSnapshot(snapshot);

        // Take the snapshot of the enabled table
        EnabledTableSnapshotHandler handler = new EnabledTableSnapshotHandler(snapshot, master, this);
        // start executing the snapshot
        snapshotTable(snapshot, handler);
    }

Setup before the snapshot starts: check whether a snapshot is already running and whether a restore request is in progress.

        // make sure we aren't already running a snapshot
        if (isTakingSnapshot(snapshot)) {
            SnapshotSentinel handler = this.snapshotHandlers.get(snapshotTable);
            throw new SnapshotCreationException("Rejected taking "
                + ClientSnapshotDescriptionUtils.toString(snapshot)
                + " because we are already running another snapshot "
                + (handler != null
                    ? ("on the same table " + ClientSnapshotDescriptionUtils.toString(handler.getSnapshot()))
                    : "with the same name"), snapshot);
        }

        // make sure we aren't running a restore on the same table
        if (isRestoringTable(snapshotTable)) {
            SnapshotSentinel handler = restoreHandlers.get(snapshotTable);
            throw new SnapshotCreationException("Rejected taking "
                + ClientSnapshotDescriptionUtils.toString(snapshot)
                + " because we are already have a restore in progress on the same snapshot "
                + ClientSnapshotDescriptionUtils.toString(handler.getSnapshot()), snapshot);
        }

        try {
            // delete the working directory, since we aren't running the snapshot. Likely leftovers
            // from a failed attempt.
            fs.delete(workingDir, true);

            // recreate the working directory for the snapshot
            if (!fs.mkdirs(workingDir)) {
                throw new SnapshotCreationException("Couldn't create working directory (" + workingDir + ") for snapshot", snapshot);
            }
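
The preparation logic above boils down to two guards followed by a reset of the working directory. Here is a minimal sketch of that pattern on a local filesystem with `java.nio.file`. The class `PrepareSnapshotSketch`, its two sets, and the exception types are hypothetical stand-ins, not HBase's `SnapshotManager`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for the checks shown above; not HBase code. */
class PrepareSnapshotSketch {
    // tables that currently have a snapshot or a restore in progress
    final Set<String> snapshotting = ConcurrentHashMap.newKeySet();
    final Set<String> restoring = ConcurrentHashMap.newKeySet();

    void prepare(String table, Path workingDir) {
        if (snapshotting.contains(table)) {
            throw new IllegalStateException("already running a snapshot on " + table);
        }
        if (restoring.contains(table)) {
            throw new IllegalStateException("restore in progress on " + table);
        }
        try {
            deleteRecursively(workingDir);        // leftovers from a failed attempt
            Files.createDirectories(workingDir);  // fresh working directory
        } catch (IOException e) {
            throw new UncheckedIOException("Couldn't create working directory " + workingDir, e);
        }
    }

    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // reverse order lists children before their parent directories
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Self-check: returns true if a stale file is gone after prepare(). */
    static boolean runDemo() {
        try {
            Path root = Files.createTempDirectory("snapshot-sketch");
            Path workingDir = root.resolve(".tmp").resolve("snap1");
            Files.createDirectories(workingDir);
            Path stale = Files.createFile(workingDir.resolve("stale-file"));
            new PrepareSnapshotSketch().prepare("t1", workingDir);
            boolean ok = Files.isDirectory(workingDir) && !Files.exists(stale);
            deleteRecursively(root);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("working directory reset: " + runDemo());
    }
}
```

Deleting before recreating means a crashed earlier attempt cannot leak partial files into the new snapshot.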

Once setup is complete, the snapshot is carried out by the given handler:

            handler.prepare();
            this.executorService.submit(handler);
            this.snapshotHandlers.put(TableName.valueOf(snapshot.getTable()), handler);
            ...

TakeSnapshotHandler is where the snapshot is actually processed:

1. Write the snapshot description to the .snapshotinfo file:

    FsPermission perms = FSUtils.getFilePermissions(fs, fs.getConf(), HConstants.DATA_FILE_UMASK_KEY);
    Path snapshotInfo = new Path(workingDir, SnapshotDescriptionUtils.SNAPSHOTINFO_FILE);
    try {
        FSDataOutputStream out = FSUtils.create(fs, snapshotInfo, perms, true);
        try {
            snapshot.writeTo(out);
        } finally {
            out.close();
        }
    }
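
The same write-then-close pattern can be sketched on a local filesystem, with try-with-resources replacing the explicit `finally`. The class `SnapshotInfoWriterSketch` and the plain-text payload are hypothetical, and removing a half-written file on failure is an assumption of this sketch. File permissions, which the HBase code handles through `FsPermission`, are left out.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical local-filesystem analogue of writing the snapshot description. */
class SnapshotInfoWriterSketch {
    static final String SNAPSHOTINFO_FILE = ".snapshotinfo";

    static Path writeSnapshotInfo(Path workingDir, byte[] description) {
        Path snapshotInfo = workingDir.resolve(SNAPSHOTINFO_FILE);
        // try-with-resources closes the stream even if write() fails;
        // newOutputStream with default options overwrites an existing file
        try (OutputStream out = Files.newOutputStream(snapshotInfo)) {
            out.write(description);
        } catch (IOException e) {
            // assumption of this sketch: do not leave a partial file behind
            try {
                Files.deleteIfExists(snapshotInfo);
            } catch (IOException suppressed) {
                e.addSuppressed(suppressed);
            }
            throw new UncheckedIOException(e);
        }
        return snapshotInfo;
    }

    /** Writes a description into a temp directory and returns what was read back. */
    static String runDemo() {
        try {
            Path workingDir = Files.createTempDirectory("snapshot-info-sketch");
            byte[] payload = "name=snap1,table=t1".getBytes(StandardCharsets.UTF_8);
            Path info = writeSnapshotInfo(workingDir, payload);
            String readBack = new String(Files.readAllBytes(info), StandardCharsets.UTF_8);
            Files.delete(info);
            Files.delete(workingDir);
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```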

2. Copy the table information:

    snapshotManifest.addTableDescriptor(this.htd);

3. Get the regions and their locations on the region servers:

    List<Pair<HRegionInfo, ServerName>> regionsAndLocations;
    if (TableName.META_TABLE_NAME.equals(snapshotTable)) {
        regionsAndLocations = new MetaTableLocator().getMetaRegionsAndLocations(server.getZooKeeper());
    } else {
        regionsAndLocations = MetaTableAccessor.getTableRegionsAndLocations(server.getZooKeeper(), server.getConnection(), snapshotTable, false);
    }

4. Run the snapshot using the region and location information obtained above:

    // run the snapshot
    snapshotRegions(regionsAndLocations);

Starting the snapshot procedure: the snapshot is started on the region servers, which carry out all the details of the operation:

    // start the snapshot on the RS
    Procedure proc = coordinator.startProcedure(this.monitor, this.snapshot.getName(),
        this.snapshot.toByteArray(), Lists.newArrayList(regionServers));
    if (proc == null) {
        String msg = "Failed to submit distributed procedure for snapshot '" + snapshot.getName() + "'";
        LOG.error(msg);
        throw new HBaseSnapshotException(msg);
    }

Wait for the snapshot to complete:

    proc.waitForCompleted();
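
The start-then-wait shape of the distributed procedure can be imitated with `CompletableFuture`. This is only a rough sketch: one task per region server is submitted, the master side blocks until all have finished, and a `null` return stands in for a failed submission. The class `DistributedProcedureSketch` and the use of an executor are assumptions for illustration. It does not model HBase's coordinator, which the excerpt above shows only through `startProcedure` and `waitForCompleted`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical imitation of "start on every region server, then wait for all". */
class DistributedProcedureSketch {

    /** Returns a future covering all members, or null if a task could not be submitted. */
    static CompletableFuture<Void> startProcedure(ExecutorService pool,
                                                  List<String> regionServers,
                                                  List<String> done) {
        List<CompletableFuture<Void>> parts = new ArrayList<>();
        try {
            for (String rs : regionServers) {
                // each task stands in for the work done on one region server
                parts.add(CompletableFuture.runAsync(() -> done.add(rs), pool));
            }
        } catch (RejectedExecutionException e) {
            return null; // the executor refused the task, e.g. because it is shut down
        }
        return CompletableFuture.allOf(parts.toArray(new CompletableFuture<?>[0]));
    }

    /** Runs three members and returns their names, sorted, once all have completed. */
    static List<String> runDemo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<String> done = new CopyOnWriteArrayList<>();
            CompletableFuture<Void> proc =
                startProcedure(pool, Arrays.asList("rs1", "rs2", "rs3"), done);
            if (proc == null) {
                throw new IllegalStateException("Failed to submit distributed procedure");
            }
            proc.join(); // blocks until every member has finished
            List<String> sorted = new ArrayList<>(done);
            Collections.sort(sorted);
            return sorted;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```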

Treat offline regions as disabled:

    // Take the offline regions as disabled
    for (Pair<HRegionInfo, ServerName> region : regions) {
        HRegionInfo regionInfo = region.getFirst();
        if (regionInfo.isOffline() && (regionInfo.isSplit() || regionInfo.isSplitParent())) {
            LOG.info("Take disabled snapshot of offline region=" + regionInfo);
            snapshotDisabledRegion(regionInfo);
        }
    }

5. Collect the relevant region information and server names, which are used to verify the snapshot:

    // extract each pair to separate lists
    Set<String> serverNames = new HashSet<String>();
    for (Pair<HRegionInfo, ServerName> p : regionsAndLocations) {
        if (p != null && p.getFirst() != null && p.getSecond() != null) {
            HRegionInfo hri = p.getFirst();
            if (hri.isOffline() && (hri.isSplit() || hri.isSplitParent()))
                continue;
            serverNames.add(p.getSecond().toString());
        }
    }
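
The filtering rule in this loop is easy to check in isolation. The sketch below reproduces it with a hypothetical `RegionLocation` class in place of `Pair<HRegionInfo, ServerName>`. The field `splitOrSplitParent` collapses the two split checks of the original into one flag, which is a simplification made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for collecting the server names used during verification. */
class ServerNameCollectorSketch {

    static final class RegionLocation {
        final String regionName;
        final String serverName;          // null if the region has no known location
        final boolean offline;
        final boolean splitOrSplitParent; // simplification of isSplit() || isSplitParent()

        RegionLocation(String regionName, String serverName,
                       boolean offline, boolean splitOrSplitParent) {
            this.regionName = regionName;
            this.serverName = serverName;
            this.offline = offline;
            this.splitOrSplitParent = splitOrSplitParent;
        }
    }

    static Set<String> collectServerNames(List<RegionLocation> locations) {
        Set<String> serverNames = new HashSet<>();
        for (RegionLocation loc : locations) {
            if (loc == null || loc.regionName == null || loc.serverName == null) {
                continue; // incomplete entry
            }
            if (loc.offline && loc.splitOrSplitParent) {
                continue; // offline split regions are handled as disabled regions
            }
            serverNames.add(loc.serverName);
        }
        return serverNames;
    }

    public static void main(String[] args) {
        List<RegionLocation> locations = Arrays.asList(
            new RegionLocation("region-a", "rs1", false, false),
            new RegionLocation("region-b", "rs2", true, true),
            new RegionLocation("region-c", null, false, false));
        System.out.println(collectServerNames(locations));
    }
}
```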

6. Flush the in-memory state and write the snapshot manifest to the directory:

    // flush the in-memory state, and write the single manifest
    status.setStatus("Consolidate snapshot: " + snapshot.getName());
    snapshotManifest.consolidate();

7. Verify that the snapshot is valid:

    // verify the snapshot is valid
    status.setStatus("Verifying snapshot: " + snapshot.getName());
    verifier.verifySnapshot(this.workingDir, serverNames);

8. Complete the snapshot by moving the directory into place:

    // complete the snapshot, atomically moving from tmp to .snapshot dir.
    completeSnapshot(this.snapshotDir, this.workingDir, this.fs);
    msg = "Snapshot " + snapshot.getName() + " of table " + snapshotTable + " completed";
    status.markComplete(msg);
    LOG.info(msg);
    metricsSnapshot.addSnapshot(status.getCompletionTimestamp() - status.getStartTime());
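
The final step relies on a directory move being atomic, so readers either see the whole snapshot or nothing. On a local filesystem the same idea can be sketched with `Files.move` and `ATOMIC_MOVE`. The class `CompleteSnapshotSketch` and the directory names used in the demo are hypothetical, and this is a local analogue, not the HDFS call that `completeSnapshot` makes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical local-filesystem analogue of publishing a finished snapshot. */
class CompleteSnapshotSketch {

    static void completeSnapshot(Path snapshotDir, Path workingDir) {
        try {
            // ATOMIC_MOVE fails rather than falling back to a copy if the
            // filesystem cannot rename atomically
            Files.move(workingDir, snapshotDir, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(
                "Failed to move " + workingDir + " to " + snapshotDir, e);
        }
    }

    /** Returns true if the working directory's content shows up under the final name. */
    static boolean runDemo() {
        try {
            Path root = Files.createTempDirectory("snapshot-complete-sketch");
            Path workingDir = Files.createDirectories(root.resolve(".tmp").resolve("snap1"));
            Files.createFile(workingDir.resolve(".snapshotinfo"));
            Path snapshotDir = root.resolve("snap1");

            completeSnapshot(snapshotDir, workingDir);

            boolean ok = Files.exists(snapshotDir.resolve(".snapshotinfo"))
                && !Files.exists(workingDir);
            // clean up the temporary tree
            Files.delete(snapshotDir.resolve(".snapshotinfo"));
            Files.delete(snapshotDir);
            Files.delete(root.resolve(".tmp"));
            Files.delete(root);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("snapshot published: " + runDemo());
    }
}
```

Building everything under the working directory and renaming only at the end is what keeps a failed or half-finished snapshot from ever appearing among the completed ones.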