In the for loop of `repvgg_model_convert()`, we iterate over the modules of the model printed above. On the first iteration, `module` is:
```
RepVGGBlock(
  (nonlinearity): ReLU()
  (se): Identity()
  (rbr_dense): Sequential(
    (conv): Conv2d(3, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (rbr_1x1): Sequential(
    (conv): Conv2d(3, 48, kernel_size=(1, 1), stride=(2, 2), bias=False)
    (bn): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)
```
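For reference, the conversion entry point in the official RepVGG repository is essentially the following loop (paraphrased; the exact signature may differ slightly between versions):

```python
import copy
import torch

def repvgg_model_convert(model, save_path=None, do_copy=True):
    if do_copy:
        model = copy.deepcopy(model)          # keep the training-time model intact
    for module in model.modules():
        if hasattr(module, 'switch_to_deploy'):
            module.switch_to_deploy()         # fuse each RepVGGBlock in place
    if save_path is not None:
        torch.save(model.state_dict(), save_path)
    return model
```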
Since the RepVGGBlock class above defines `switch_to_deploy()`, that method is called on this module, and execution enters `switch_to_deploy()`:
```python
def switch_to_deploy(self):
    if hasattr(self, 'rbr_reparam'):
        return
    kernel, bias = self.get_equivalent_kernel_bias()
    self.rbr_reparam = nn.Conv2d(in_channels=self.rbr_dense.conv.in_channels,
                                 out_channels=self.rbr_dense.conv.out_channels,
                                 kernel_size=self.rbr_dense.conv.kernel_size,
                                 stride=self.rbr_dense.conv.stride,
                                 padding=self.rbr_dense.conv.padding,
                                 dilation=self.rbr_dense.conv.dilation,
                                 groups=self.rbr_dense.conv.groups,
                                 bias=True)
    self.rbr_reparam.weight.data = kernel
    self.rbr_reparam.bias.data = bias
    for para in self.parameters():
        para.detach_()
    self.__delattr__('rbr_dense')
    self.__delattr__('rbr_1x1')
    if hasattr(self, 'rbr_identity'):
        self.__delattr__('rbr_identity')
    if hasattr(self, 'id_tensor'):
        self.__delattr__('id_tensor')
    self.deploy = True
```
The method first checks whether the block already has an `rbr_reparam` attribute and returns early if it does (during the stage0 iteration it does not exist yet). It then calls `get_equivalent_kernel_bias()`, which in turn calls `_fuse_bn_tensor()` to fuse each conv layer with its BN layer:
```python
def get_equivalent_kernel_bias(self):
    kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)   # rbr_dense: the main 3x3 conv branch
    kernel1x1, bias1x1 = self._fuse_bn_tensor(self.rbr_1x1)     # rbr_1x1: the 1x1 conv branch
    kernelid, biasid = self._fuse_bn_tensor(self.rbr_identity)  # rbr_identity: the BN-only identity branch
    return kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid
```
```python
def _fuse_bn_tensor(self, branch):
    if branch is None:
        return 0, 0
    if isinstance(branch, nn.Sequential):
        kernel = branch.conv.weight             # conv weights
        running_mean = branch.bn.running_mean   # BN running mean
        running_var = branch.bn.running_var     # BN running variance
        gamma = branch.bn.weight                # BN scale (gamma)
        beta = branch.bn.bias                   # BN shift (beta)
        eps = branch.bn.eps
    else:
        # the identity branch is a bare BN layer; build an equivalent 3x3 identity kernel
        assert isinstance(branch, nn.BatchNorm2d)
        if not hasattr(self, 'id_tensor'):
            input_dim = self.in_channels // self.groups
            kernel_value = np.zeros((self.in_channels, input_dim, 3, 3), dtype=np.float32)
            for i in range(self.in_channels):
                kernel_value[i, i % input_dim, 1, 1] = 1
            self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)
        kernel = self.id_tensor
        running_mean = branch.running_mean
        running_var = branch.running_var
        gamma = branch.weight
        beta = branch.bias
        eps = branch.eps
    std = (running_var + eps).sqrt()
    t = (gamma / std).reshape(-1, 1, 1, 1)
    return kernel * t, beta - running_mean * gamma / std
```
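The last three lines implement the standard conv-BN fusion. For a branch with conv weight $W$ and BN parameters $\mu$ (running mean), $\sigma^2$ (running variance), $\gamma$ (scale), $\beta$ (shift), and $\varepsilon$, this is just the algebra the code above performs:

$$
\mathrm{BN}(W * x) = \gamma \cdot \frac{W * x - \mu}{\sqrt{\sigma^{2} + \varepsilon}} + \beta = W' * x + b',
\qquad
W' = \frac{\gamma}{\sqrt{\sigma^{2} + \varepsilon}}\, W,
\qquad
b' = \beta - \frac{\mu \gamma}{\sqrt{\sigma^{2} + \varepsilon}}
$$

where the scaling is applied per output channel, which is exactly what `t = (gamma / std).reshape(-1, 1, 1, 1)` does.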
In the first call, `branch` is the 3×3 conv layer together with its BN layer, which can also be confirmed by stepping through with a breakpoint.
Since `branch` is not None, the `nn.Sequential` path executes and retrieves the conv weights and the BN statistics and parameters separately. The same conv-BN fusion is then applied to the 1×1 branch, after which the fused 1×1 kernel is zero-padded to 3×3 and added to the fused 3×3 kernel, as sketched below.
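The padding helper `_pad_1x1_to_3x3_tensor()` is not shown above; in the official repo it is essentially a zero-pad that places the 1×1 weight at the spatial center of a 3×3 kernel:

```python
import torch.nn.functional as F

def _pad_1x1_to_3x3_tensor(self, kernel1x1):
    if kernel1x1 is None:
        return 0
    # pad the last two dims (H, W) by 1 on each side: [out, in, 1, 1] -> [out, in, 3, 3]
    return F.pad(kernel1x1, [1, 1, 1, 1])
```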
Next, back in `switch_to_deploy()`, a new conv layer `rbr_reparam` is defined. Its input channels, output channels, kernel size, stride, padding, and so on are all taken from the original 3×3 conv. (This conv layer exists to receive the fused parameters.)
Since `get_equivalent_kernel_bias()` has already given us the fused weight and bias, we can write them directly into the newly defined conv:
```python
self.rbr_reparam.weight.data = kernel
self.rbr_reparam.bias.data = bias
```
Then every parameter is detached in place, so only forward propagation remains and backpropagation is cut off; the original branches are deleted from the model, leaving only the new conv `rbr_reparam`. Finally, `deploy` is set to True (it starts as False); once it is True, the single conv `rbr_reparam` replaces the original multi-branch structure.
```python
for para in self.parameters():
    para.detach_()
self.__delattr__('rbr_dense')
self.__delattr__('rbr_1x1')
if hasattr(self, 'rbr_identity'):
    self.__delattr__('rbr_identity')
if hasattr(self, 'id_tensor'):
    self.__delattr__('id_tensor')
self.deploy = True
```
As can be seen, the conv structure in stage0 has changed: it is now a single Conv2d(3, 48, 3, 2, 1).
Iterating through the whole model in this way fuses all the conv layers and yields the new network structure. In the final structure, none of the branches from the training-time architecture remain.
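As a sanity check, one can verify numerically that the conversion preserves the forward output. A minimal sketch, assuming `block` is some already-constructed RepVGGBlock (the name and input shape are illustrative):

```python
import torch

block.eval()                       # BN must use running statistics for exact equivalence
x = torch.randn(1, 3, 56, 56)
with torch.no_grad():
    y_train = block(x)             # multi-branch forward
    block.switch_to_deploy()       # fuse branches into rbr_reparam
    y_deploy = block(x)            # single-conv forward
print(torch.max(torch.abs(y_train - y_deploy)))  # should be tiny, i.e. numerical noise
```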