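The code is as follows:
When we call .backward(), the process that takes place can be visualized with the animation shown earlier.
Now that we have computed the gradients, we can visualize and plot them:
Because the network has not been trained yet, the gradients above look like random noise. Once we train the network, however, the gradients become much more informative: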
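As a minimal sketch of the idea above, the snippet below computes the gradient of a single logit with respect to the input pixels. The one-layer model, the random input, and the choice of class 3 are assumptions made for illustration only; they are not the network or data used in this article.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the classifier: one linear layer on flattened 28x28 inputs.
model = torch.nn.Linear(28 * 28, 10)

# A random "image" batch of size 1; requires_grad=True makes autograd track the pixels.
x = torch.rand(1, 1, 28, 28, requires_grad=True)

logits = model(x.view(x.size(0), -1))

# Backpropagate from a single logit (class 3, chosen arbitrarily) to the input.
logits[0, 3].backward()

# x.grad has the same shape as x: one sensitivity value per pixel.
saliency = x.grad[0, 0]
print(saliency.shape)  # torch.Size([28, 28])
```

The resulting 28x28 tensor can be passed to matplotlib's imshow to produce a plot of the kind described here. For this untrained linear layer the gradient is simply the weight row of the chosen class, which is why it looks like noise.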
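Automating with callbacks
This is a very useful tool for shedding light on what is happening while your network trains. Here, we want to automate the process so that it runs by itself during training.
To do this, we will implement our neural network with PyTorch Lightning: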
import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return torch.relu(self.l1(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        result = pl.TrainResult(loss)

        # enable the auto confused logit callback
        self.last_batch = batch
        self.last_logits = y_hat.detach()

        result.log('train_loss', loss, on_epoch=True)
        return result

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        result = pl.EvalResult(checkpoint_on=loss)
        result.log('val_loss', loss)
        return result

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.005)
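The fairly complex code that automatically draws the plots described here can be abstracted into a Callback in Lightning. A callback is a small program that can be invoked at various points during training.
In this example, we want to generate these images while a training batch is being processed, in case the model is confused about some of the inputs.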
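The callback mechanism itself can be sketched in plain Python, without Lightning. The class names TinyTrainer and RecordEveryN below are hypothetical and exist only for this illustration; Lightning's real Trainer offers many more hooks and passes more arguments to them.

```python
class RecordEveryN:
    """Hypothetical callback: remembers every n-th batch index."""

    def __init__(self, n):
        self.n = n
        self.seen = []

    def on_train_batch_end(self, batch_idx):
        # Act only on every n-th batch, mirroring a logging interval.
        if (batch_idx + 1) % self.n == 0:
            self.seen.append(batch_idx)


class TinyTrainer:
    """Toy training loop for illustration, NOT Lightning's Trainer."""

    def __init__(self, callbacks):
        self.callbacks = callbacks

    def fit(self, num_batches):
        for batch_idx in range(num_batches):
            # ... forward pass, backward pass, optimizer step would go here ...
            for cb in self.callbacks:
                cb.on_train_batch_end(batch_idx)


cb = RecordEveryN(n=20)
TinyTrainer(callbacks=[cb]).fit(num_batches=100)
print(cb.seen)  # [19, 39, 59, 79, 99]
```

The training loop knows nothing about what the callback does; it only promises to call the hook at a fixed point. This is what lets plotting logic stay out of the model code.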
import torch
from pytorch_lightning import Callback
from torch import nn


class ConfusedLogitCallback(Callback):

    def __init__(
            self,
            top_k,
            projection_factor=3,
            min_logit_value=5.0,
            logging_batch_interval=20,
            max_logit_difference=0.1
    ):
        super().__init__()
        self.top_k = top_k
        self.projection_factor = projection_factor
        self.max_logit_difference = max_logit_difference
        self.logging_batch_interval = logging_batch_interval
        self.min_logit_value = min_logit_value

    def on_train_batch_end(self, trainer, pl_module, batch, batch_idx, dataloader_idx):
        # show images only every 20 batches
        if (trainer.batch_idx + 1) % self.logging_batch_interval != 0:
            return

        # pick the last batch and logits
        x, y = batch
        try:
            logits = pl_module.last_logits
        except AttributeError as e:
            m = """please track the last_logits in the training_step like so:
                def training_step(...):
                    self.last_logits = your_logits
            """
            raise AttributeError(m)

        # only check when it has opinions (ie: the logit > 5)
        if logits.max() > self.min_logit_value:
            # pick the top two confused probs
            (values, idxs) = torch.topk(logits, k=2, dim=1)

            # care about only the ones that are at most eps close to each other
            eps = self.max_logit_difference
            mask = (values[:, 0] - values[:, 1]).abs() < eps

            if mask.sum() > 0:
                # pull out the ones we care about
                confusing_x = x[mask, ...]
                confusing_y = y[mask]

                mask_idxs = idxs[mask]

                pl_module.eval()
                self._plot(confusing_x, confusing_y, trainer, pl_module, mask_idxs)
                pl_module.train()

    def _plot(self, confusing_x, confusing_y, trainer, model, mask_idxs):
        from matplotlib import pyplot as plt

        confusing_x = confusing_x[:self.top_k]
        confusing_y = confusing_y[:self.top_k]

        x_param_a = nn.Parameter(confusing_x)
        x_param_b = nn.Parameter(confusing_x)

        batch_size, c, w, h = confusing_x.size()
        for logit_i, x_param in enumerate((x_param_a, x_param_b)):
            x_param = x_param.to(model.device)
            logits = model(x_param.view(batch_size, -1))
            logits[:, mask_idxs[:, logit_i]].sum().backward()

        # reshape grads
        grad_a = x_param_a.grad.view(batch_size, w, h)
        grad_b = x_param_b.grad.view(batch_size, w, h)

        for img_i in range(len(confusing_x)):
            x = confusing_x[img_i].squeeze(0).cpu()
            y = confusing_y[img_i].cpu()
            ga = grad_a[img_i].cpu()
            gb = grad_b[img_i].cpu()

            mask_idx = mask_idxs[img_i].cpu()

            fig, axarr = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
            self.__draw_sample(fig, axarr, 0, 0, x, f'True: {y}')
            self.__draw_sample(fig, axarr, 0, 1, ga, f'd{mask_idx[0]}-logit/dx')
            self.__draw_sample(fig, axarr, 0, 2, gb, f'd{mask_idx[1]}-logit/dx')
            self.__draw_sample(fig, axarr, 1, 1, ga * 2 + x, f'd{mask_idx[0]}-logit/dx')
            self.__draw_sample(fig, axarr, 1, 2, gb * 2 + x, f'd{mask_idx[1]}-logit/dx')

            trainer.logger.experiment.add_figure('confusing_imgs', fig, global_step=trainer.global_step)

    # static helper: it takes no self, so it must be declared as a staticmethod
    @staticmethod
    def __draw_sample(fig, axarr, row_idx, col_idx, img, title):
        im = axarr[row_idx, col_idx].imshow(img)
        fig.colorbar(im, ax=axarr[row_idx, col_idx])
        axarr[row_idx, col_idx].set_title(title, fontsize=20)
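The selection step at the heart of the callback, finding samples whose top two logits are nearly tied, can be tried in isolation. The logit values below are made up for illustration; only the torch.topk call and the masking expression mirror the callback code.

```python
import torch

# Hypothetical logits for 3 samples over 4 classes.
logits = torch.tensor([
    [6.00, 5.95, 0.10, 0.20],  # top two differ by 0.05: confused
    [7.00, 1.00, 0.30, 0.10],  # clear winner: not confused
    [0.20, 5.50, 5.45, 0.00],  # top two differ by 0.05: confused
])

max_logit_difference = 0.1  # same default as the callback

# Largest two logits per row, with the class indices they belong to.
values, idxs = torch.topk(logits, k=2, dim=1)

# True where the winner and the runner-up are closer than the threshold.
mask = (values[:, 0] - values[:, 1]).abs() < max_logit_difference

print(mask.tolist())        # [True, False, True]
print(idxs[mask].tolist())  # [[0, 1], [1, 2]]
```

The rows selected by the mask are the "confusing" inputs, and the index pairs say which two classes the model could not decide between. Those are the two logits whose input gradients get plotted.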
However, we have made this even easier: just install pytorch-lightning-bolts.
!pip install pytorch-lightning-bolts

from pytorch_lightning import Trainer
from pl_bolts.callbacks.vision import ConfusedLogitCallback

trainer = Trainer(callbacks=[ConfusedLogitCallback(1)])
Putting it all together
Finally, we can train our model and have images generated automatically whenever the logits are confused.
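import os

from pytorch_lightning import Trainer
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST
from pl_bolts.callbacks.vision import ConfusedLogitCallback

# data
dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
train, val = random_split(dataset, [55000, 5000])

# model
model = LitClassifier()

# attach callback
trainer = Trainer(callbacks=[ConfusedLogitCallback(1)])

# train!
trainer.fit(model, DataLoader(train, batch_size=64), DataLoader(val, batch_size=64))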
TensorBoard will then automatically show images like the following:
Notice how different this one looks now.