
2021SC@SDUSC HBase (12) Project Code Analysis: snapshot_Xyxxi


2021SC@SDUSC

Contents: 1. Overview 2. Implementation (RestoreSnapshotHandler) 3. Summary

1. Overview

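The previous post covered how snapshots are used, including their basic principles, how they are implemented (for both enabled and disabled tables), and what they can do. This post covers how to restore a table from a snapshot.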

2. Implementation

The method that restores a table is restoreSnapshot, which lives in HMaster.

    if (MetaReader.tableExists(master.getCatalogTracker(), tableName)) {
        // The table still exists: it must be disabled before it can be restored in place.
        if (master.getAssignmentManager().getZKTable().isEnabledTable(
                TableName.valueOf(fsSnapshot.getTable()))) {
            throw new UnsupportedOperationException("Table '"
                + TableName.valueOf(fsSnapshot.getTable()) + "' must be disabled in order to "
                + "perform a restore operation" + ".");
        }
        restoreSnapshot(fsSnapshot, snapshotTableDesc);
    } else {
        // The table no longer exists: clone the snapshot into a new table.
        HTableDescriptor htd = RestoreSnapshotHelper.cloneTableSchema(snapshotTableDesc, tableName);
        cloneSnapshot(fsSnapshot, htd);
    }

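This method calls the restoreSnapshot method of SnapshotManager. It first checks whether the table exists in the meta table, and it refuses to restore a table that is still online (enabled). Restoring a table from a snapshot is done by submitting a RestoreSnapshotHandler. Before restoring, the code checks whether the table still exists, because it may already have been deleted, and the two cases are handled separately. As the code shows, each case is simply handed to its own handler, and the handler is submitted to a thread pool. So we go straight to the handleTableOperation methods of RestoreSnapshotHandler and CloneSnapshotHandler.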
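The branching described above can be sketched without any HBase classes. This is only a minimal illustration: RestoreDecision, its Action enum, and the two sets standing in for the meta table and the enabled-table state are hypothetical names, not part of HBase.

```java
import java.util.Set;

// Hypothetical stand-in for the master-side check; not an HBase class.
final class RestoreDecision {
    enum Action { RESTORE_IN_PLACE, CLONE_AS_NEW }

    // existingTables plays the role of the meta table lookup,
    // enabledTables the role of the enabled-table state.
    static Action decide(String table, Set<String> existingTables, Set<String> enabledTables) {
        if (existingTables.contains(table)) {
            if (enabledTables.contains(table)) {
                // Mirrors the exception thrown for an online table.
                throw new UnsupportedOperationException(
                    "Table '" + table + "' must be disabled in order to perform a restore operation.");
            }
            return Action.RESTORE_IN_PLACE; // corresponds to the restoreSnapshot branch
        }
        return Action.CLONE_AS_NEW;         // corresponds to the cloneSnapshot branch
    }
}
```

A disabled existing table yields RESTORE_IN_PLACE, a missing table yields CLONE_AS_NEW, and an enabled table is rejected with an exception.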

RestoreSnapshotHandler

    protected void handleTableOperation(List<HRegionInfo> hris) throws IOException {
        MasterFileSystem fileSystemManager = masterServices.getMasterFileSystem();
        CatalogTracker catalogTracker = masterServices.getCatalogTracker();
        FileSystem fs = fileSystemManager.getFileSystem();
        Path rootDir = fileSystemManager.getRootDir();
        TableName tableName = hTableDescriptor.getTableName();
        try {
            // 1. Update the table descriptor
            this.masterServices.getTableDescriptors().add(hTableDescriptor);

            // 2. Restore the regions on the file system
            Path snapshotDir = SnapshotDescriptionUtils.getCompletedSnapshotDir(snapshot, rootDir);
            RestoreSnapshotHelper restoreHelper = new RestoreSnapshotHelper(
                masterServices.getConfiguration(), fs, snapshot, snapshotDir,
                hTableDescriptor, rootDir, monitor, status);
            RestoreSnapshotHelper.RestoreMetaChanges metaChanges = restoreHelper.restoreHdfsRegions();

            // 3. Force the changed regions offline
            forceRegionsOffline(metaChanges);

            // 4.1 Remove stale region entries from meta
            List<HRegionInfo> hrisToRemove = new LinkedList<HRegionInfo>();
            if (metaChanges.hasRegionsToRemove()) hrisToRemove.addAll(metaChanges.getRegionsToRemove());
            if (metaChanges.hasRegionsToRestore()) hrisToRemove.addAll(metaChanges.getRegionsToRestore());
            MetaEditor.deleteRegions(catalogTracker, hrisToRemove);

            // 4.2 Add new and restored regions to meta
            hris.clear();
            if (metaChanges.hasRegionsToAdd()) hris.addAll(metaChanges.getRegionsToAdd());
            if (metaChanges.hasRegionsToRestore()) hris.addAll(metaChanges.getRegionsToRestore());
            MetaEditor.addRegionsToMeta(catalogTracker, hris);
            metaChanges.updateMetaParentRegions(catalogTracker, hris);
        } catch (IOException e) {
            String msg = "restore snapshot=" + ClientSnapshotDescriptionUtils.toString(snapshot)
                + " failed. Try re-running the restore command.";
            throw new RestoreSnapshotException(msg, e);
        }
    }

The handleTableOperation method of RestoreSnapshotHandler consists of four main steps:

1. Update the table definition: this.masterServices.getTableDescriptors().add(hTableDescriptor); overwrites the current table definition with the one stored in the snapshot.

2. Restore the regions: Path snapshotDir = SnapshotDescriptionUtils.getCompletedSnapshotDir(snapshot, rootDir); locates the snapshot directory, and restoreHelper then performs the restore.

3. Force the changed regions offline in RegionStates. Otherwise a region that was in the split state before the restore could never be assigned again. forceRegionsOffline(metaChanges); sets the RegionStates entry of every changed region to offline.

4. Update the region records in the meta table, handling removals and additions separately:
4.1 List<HRegionInfo> hrisToRemove = new LinkedList<HRegionInfo>(); collects the regions that were removed (together with the regions to be restored, which are re-added in 4.2), and MetaEditor.deleteRegions(catalogTracker, hrisToRemove); deletes them from the meta table.
4.2 Add regions to the META table: hris.clear(); clears the list first, if (metaChanges.hasRegionsToAdd()) hris.addAll(metaChanges.getRegionsToAdd()); then adds the new regions, and if (metaChanges.hasRegionsToRestore()) re-adds the restored regions, whose old entries were deleted in 4.1.

The region restore process:

    public RestoreMetaChanges restoreHdfsRegions() throws IOException {
        LOG.debug("starting restore");
        Set<String> snapshotRegionNames = SnapshotReferenceUtil.getSnapshotRegionNames(fs, snapshotDir);
        RestoreMetaChanges metaChanges = new RestoreMetaChanges(parentsMap);
        List<HRegionInfo> tableRegions = getTableRegions();
        if (tableRegions != null) {
            // Compare every region of the current table with the snapshot
            for (HRegionInfo regionInfo: tableRegions) {
                String regionName = regionInfo.getEncodedName();
                if (snapshotRegionNames.contains(regionName)) {
                    snapshotRegionNames.remove(regionName);
                    metaChanges.addRegionToRestore(regionInfo);
                } else {
                    metaChanges.addRegionToRemove(regionInfo);
                }
            }
            restoreHdfsRegions(metaChanges.getRegionsToRestore());
            removeHdfsRegions(metaChanges.getRegionsToRemove());
        }
        // Whatever is left exists only in the snapshot and must be cloned
        if (snapshotRegionNames.size() > 0) {
            List<HRegionInfo> regionsToAdd = new LinkedList<HRegionInfo>();
            for (String regionName: snapshotRegionNames) {
                Path regionDir = new Path(snapshotDir, regionName);
                regionsToAdd.add(HRegionFileSystem.loadRegionInfoFileContent(fs, regionDir));
            }
            HRegionInfo[] clonedRegions = cloneHdfsRegions(regionsToAdd);
            metaChanges.setNewRegions(clonedRegions);
        }
        restoreWALs();
        return metaChanges;
    }

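This is the restoreHdfsRegions method of RestoreSnapshotHelper. Set<String> snapshotRegionNames = SnapshotReferenceUtil.getSnapshotRegionNames(fs, snapshotDir); lists the regions under the snapshot directory; if there are none, the method exits. The regions in the snapshot are then compared one by one with the regions of the current table, which gives three cases:

(1) The region did not exist at snapshot time but exists now. This region has to be removed.

    else {
        metaChanges.addRegionToRemove(regionInfo);
    }

(2) The region existed at snapshot time and still exists now. This region has to be restored.

    if (snapshotRegionNames.contains(regionName)) {
        snapshotRegionNames.remove(regionName);
        metaChanges.addRegionToRestore(regionInfo);
    }

(3) The region existed at snapshot time but no longer exists. This region also has to be restored, but unlike the previous case a new region directory and region definition must be created.

    if (snapshotRegionNames.size() > 0) {
        List<HRegionInfo> regionsToAdd = new LinkedList<HRegionInfo>();
        // (loop that loads each region's info from the snapshot omitted here)
        HRegionInfo[] clonedRegions = cloneHdfsRegions(regionsToAdd);
        metaChanges.setNewRegions(clonedRegions);
    }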
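The three-way split is essentially a set comparison between the snapshot's region names and the table's current region names. A minimal sketch, using plain strings in place of HRegionInfo; RegionPlan and its field names are hypothetical and only mirror the restore / remove / add buckets described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the region classification; not an HBase class.
final class RegionPlan {
    final List<String> toRestore = new ArrayList<>(); // in snapshot and in table
    final List<String> toRemove = new ArrayList<>();  // in table only
    final List<String> toAdd = new ArrayList<>();     // in snapshot only

    static RegionPlan classify(Set<String> snapshotRegions, List<String> tableRegions) {
        // Work on a copy so the caller's set is left untouched.
        Set<String> remaining = new LinkedHashSet<>(snapshotRegions);
        RegionPlan plan = new RegionPlan();
        for (String region : tableRegions) {
            if (remaining.remove(region)) {
                plan.toRestore.add(region);
            } else {
                plan.toRemove.add(region);
            }
        }
        // Names never matched by a table region exist only in the snapshot.
        plan.toAdd.addAll(remaining);
        return plan;
    }
}
```

Removing matched names from the working set as the loop runs is the same trick the original code uses: whatever is left at the end is exactly the set of regions to clone.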

Next, the regions are restored one by one (restoreHdfsRegions calls restoreRegion for each of them):

    private void restoreRegion(HRegionInfo regionInfo) throws IOException {
        Path snapshotRegionDir = new Path(snapshotDir, regionInfo.getEncodedName());
        Map<String, List<String>> snapshotFiles =
            SnapshotReferenceUtil.getRegionHFileReferences(fs, snapshotRegionDir);
        Path regionDir = new Path(tableDir, regionInfo.getEncodedName());
        String tableName = tableDesc.getTableName().getNameAsString();

        // Families that currently exist in the table
        for (Path familyDir: FSUtils.getFamilyDirs(fs, regionDir)) {
            byte[] family = Bytes.toBytes(familyDir.getName());
            Set<String> familyFiles = getTableRegionFamilyFiles(familyDir);
            List<String> snapshotFamilyFiles = snapshotFiles.remove(familyDir.getName());
            if (snapshotFamilyFiles != null) {
                List<String> hfilesToAdd = new LinkedList<String>();
                for (String hfileName: snapshotFamilyFiles) {
                    if (familyFiles.contains(hfileName)) {
                        // Already present, nothing more to do for this file
                        familyFiles.remove(hfileName);
                    } else {
                        // Missing, must be restored
                        hfilesToAdd.add(hfileName);
                    }
                }
                // Archive the HFiles that are not in the snapshot
                for (String hfileName: familyFiles) {
                    Path hfile = new Path(familyDir, hfileName);
                    HFileArchiver.archiveStoreFile(conf, fs, regionInfo, tableDir, family, hfile);
                }
                // Restore the missing HFiles
                for (String hfileName: hfilesToAdd) {
                    restoreStoreFile(familyDir, regionInfo, hfileName);
                }
            } else {
                // Family not in the snapshot: archive it and delete the directory
                HFileArchiver.archiveFamily(fs, conf, regionInfo, tableDir, family);
                fs.delete(familyDir, true);
            }
        }

        // Families that are in the snapshot but not in the current table
        for (Map.Entry<String, List<String>> familyEntry: snapshotFiles.entrySet()) {
            Path familyDir = new Path(regionDir, familyEntry.getKey());
            if (!fs.mkdirs(familyDir)) {
                throw new IOException("Unable to create familyDir=" + familyDir);
            }
            for (String hfileName: familyEntry.getValue()) {
                restoreStoreFile(familyDir, regionInfo, hfileName);
            }
        }
    }

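First the HFiles are tied to their column families in a Map<String, List<String>> (family name to HFile names), and the families are then restored one at a time. Column families fall into the same three cases as the regions above.

Map<String, List<String>> snapshotFiles holds the list of files to restore. for (Path familyDir: FSUtils.getFamilyDirs(fs, regionDir)) restores the column families that currently exist in the table.

For a family that is also in the snapshot, files that already exist are kept, extra files are removed, and missing files are added. if (familyFiles.contains(hfileName)) finds an HFile that already exists; familyFiles.remove(hfileName); takes it out of the set so that it needs no further handling. hfilesToAdd.add(hfileName); records an HFile that is missing. for (String hfileName: familyFiles) archives the HFiles that are not in the snapshot. Missing files are then added:

    for (String hfileName: hfilesToAdd) {
        restoreStoreFile(familyDir, regionInfo, hfileName);
    }

If the family does not exist in the snapshot, its files are archived and the family directory is deleted:

    HFileArchiver.archiveFamily(fs, conf, regionInfo, tableDir, family);
    fs.delete(familyDir, true);

Families that are not in the current table are added and then restored:

    for (Map.Entry<String, List<String>> familyEntry: snapshotFiles.entrySet())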
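The per-family reconciliation can be illustrated with plain collections. This is a sketch under simplifying assumptions: FamilyPlan, toArchive, toLink and dropFamily are hypothetical names, and file names stand in for real HFiles on HDFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of reconciling one column family; not an HBase class.
final class FamilyPlan {
    final List<String> toArchive; // files in the table but not in the snapshot
    final List<String> toLink;    // files in the snapshot but missing from the table
    final boolean dropFamily;     // true when the snapshot has no such family

    private FamilyPlan(List<String> toArchive, List<String> toLink, boolean dropFamily) {
        this.toArchive = toArchive;
        this.toLink = toLink;
        this.dropFamily = dropFamily;
    }

    // snapshotFiles == null means the family is absent from the snapshot.
    static FamilyPlan reconcile(Set<String> currentFiles, List<String> snapshotFiles) {
        Set<String> extra = new TreeSet<>(currentFiles); // sorted copy for stable output
        List<String> missing = new ArrayList<>();
        if (snapshotFiles == null) {
            return new FamilyPlan(new ArrayList<>(extra), missing, true);
        }
        for (String hfile : snapshotFiles) {
            // A file present on both sides is left alone; otherwise it must be restored.
            if (!extra.remove(hfile)) {
                missing.add(hfile);
            }
        }
        return new FamilyPlan(new ArrayList<>(extra), missing, false);
    }
}
```

With current files {h1, h2, h3} and snapshot files [h2, h4], h2 is left alone, h1 and h3 go to the archive list, and h4 goes to the list of files to restore.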

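Notes:

(1) When a reference to an HFile is created, no file data is actually stored; only an empty file with the same name is created. In the case above, an HFile that already exists under the same name needs no further work, because an HFile is never modified once it has been written and its writer closed. Even a compaction merges several HFiles into a new HFile and deletes the old ones rather than modifying them in place.

(2) Files that were added later and are not in the snapshot are not deleted outright. They are moved to another location, the archive directory, and the move is performed by the HFileArchiver class. In the rare case that the file already exists there, ".<current timestamp>" is appended to the file name.

(3) Missing files go through the restoreStoreFile method, whose code is shown below.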

    private void restoreStoreFile(final Path familyDir, final HRegionInfo regionInfo,
            final String hfileName) throws IOException {
        if (HFileLink.isHFileLink(hfileName)) {
            // The file is an HFileLink
            HFileLink.createFromHFileLink(conf, fs, familyDir, hfileName);
        } else if (StoreFileInfo.isReference(hfileName)) {
            // The file is a Reference
            restoreReferenceFile(familyDir, regionInfo, hfileName);
        } else {
            // The file is a plain HFile
            HFileLink.create(conf, fs, familyDir, regionInfo, hfileName);
        }
    }
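The idea behind note (1), that a link is an empty placeholder whose name identifies the real file, can be sketched with java.nio.file on a local file system. This is not HBase's HFileLink: the class EmptyLinkFile and the name layout "<table>=<region>-<hfile>" are assumptions made only for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a "link by name"; not the real HFileLink.
final class EmptyLinkFile {
    // Creates an empty file whose name encodes where the real data lives.
    // The "<table>=<region>-<hfile>" layout is an assumption for this sketch.
    static Path createLink(Path familyDir, String table, String region, String hfile)
            throws IOException {
        Files.createDirectories(familyDir);
        Path link = familyDir.resolve(table + "=" + region + "-" + hfile);
        if (!Files.exists(link)) {
            Files.createFile(link); // zero bytes: no HFile data is copied
        }
        return link;
    }

    // Recovers {table, region, hfile} from the link name.
    static String[] parse(Path link) {
        String name = link.getFileName().toString();
        int eq = name.indexOf('=');
        int dash = name.indexOf('-', eq + 1);
        return new String[] {
            name.substring(0, eq), name.substring(eq + 1, dash), name.substring(dash + 1)
        };
    }
}
```

Because the placeholder carries no data, creating it costs the same regardless of how large the referenced file is, which is why a restore built on links avoids bulk copying.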

The logs are restored by the restoreWALs method:

    private void restoreWALs() throws IOException {
        final SnapshotLogSplitter logSplitter = new SnapshotLogSplitter(conf, fs, tableDir,
            snapshotTable, regionsMap);
        try {
            // Per-region recovered edits
            SnapshotReferenceUtil.visitRecoveredEdits(fs, snapshotDir,
                new FSVisitor.RecoveredEditsVisitor() {
                    public void recoveredEdits(final String region, final String logfile)
                            throws IOException {
                        Path path = SnapshotReferenceUtil.getRecoveredEdits(snapshotDir, region, logfile);
                        logSplitter.splitRecoveredEdit(path);
                    }
                });
            // Per-RegionServer WAL files
            SnapshotReferenceUtil.visitLogFiles(fs, snapshotDir, new FSVisitor.LogFileVisitor() {
                public void logFile(final String server, final String logfile) throws IOException {
                    logSplitter.splitLog(server, logfile);
                }
            });
        } finally {
            logSplitter.close();
        }
    }

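SnapshotReferenceUtil.visitRecoveredEdits walks the recovered edits logs under the snapshot directory, and SnapshotReferenceUtil.visitLogFiles(fs, snapshotDir, new FSVisitor.LogFileVisitor() ...) walks the log files. The former are per-region logs; the latter are the per-RegionServer WALs. Both logSplitter.splitRecoveredEdit and logSplitter.splitLog end up calling a splitLog method that takes the log path; the difference is that splitLog(server, logfile) passes in an HLogLink.

The splitLog method: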

    public void splitLog(final Path logPath) throws IOException {
        HLog.Reader log = HLogFactory.createReader(fs, logPath, conf);
        try {
            HLog.Entry entry;
            LogWriter writer = null;
            byte[] regionName = null;
            byte[] newRegionName = null;
            while ((entry = log.next()) != null) {
                HLogKey key = entry.getKey();
                // Skip entries that belong to other tables
                if (!key.getTablename().equals(snapshotTableName)) continue;
                // Switch writer whenever the region changes
                if (!Bytes.equals(regionName, key.getEncodedRegionName())) {
                    regionName = key.getEncodedRegionName().clone();
                    newRegionName = regionsMap.get(regionName);
                    if (newRegionName == null) newRegionName = regionName;
                    writer = getOrCreateWriter(newRegionName, key.getLogSeqNum());
                }
                // Rewrite the key with the new region and table names
                key = new HLogKey(newRegionName, tableName, key.getLogSeqNum(),
                    key.getWriteTime(), key.getClusterIds());
                writer.append(new HLog.Entry(key, entry.getEdit()));
            }
        } catch (IOException e) {
            LOG.warn("Something wrong during the log split", e);
        } finally {
            log.close();
        }
    }

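An HLog.Reader is created to read the log file and iterate over it, picking out the entries that belong to the table of the snapshot. A Writer is instantiated for each region, and its append method is called to append the HLog entries.

Finally, the RegionStates of the changed regions are forced to offline and the region records in the meta table are updated. For a table that has been deleted, the restore is handled directly through the restoreHdfsRegions method.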
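The log-splitting loop described above (filter by table, remap the region name, append to a per-region writer) can be sketched in memory. The names LogSplitSketch and Entry are hypothetical, strings replace the byte[] region names of the real code, and a list per region stands in for an actual log writer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of the split; not an HBase class.
final class LogSplitSketch {
    static final class Entry {
        final String table;
        final String region;
        final long seq;
        final String edit;

        Entry(String table, String region, long seq, String edit) {
            this.table = table;
            this.region = region;
            this.seq = seq;
            this.edit = edit;
        }
    }

    // Returns, for each target region, the entries that would be appended to its writer.
    static Map<String, List<Entry>> split(List<Entry> log, String snapshotTable,
            String targetTable, Map<String, String> regionsMap) {
        Map<String, List<Entry>> writers = new LinkedHashMap<>();
        for (Entry e : log) {
            if (!e.table.equals(snapshotTable)) {
                continue; // entry belongs to another table
            }
            // Fall back to the original region name when there is no mapping.
            String newRegion = regionsMap.getOrDefault(e.region, e.region);
            writers.computeIfAbsent(newRegion, k -> new ArrayList<>())
                   .add(new Entry(targetTable, newRegion, e.seq, e.edit));
        }
        return writers;
    }
}
```

The sequence number and the edit are carried over unchanged; only the table and region in the key are rewritten, which matches how the original loop builds a new HLogKey around the existing edit.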

3. Summary

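Apart from the cases where files are archived or deleted, snapshot backup and restore mostly work by creating link files rather than by copying or replacing HFiles in bulk.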

