Error: FloatingPointError: Loss became infinite or NaN at iteration=88!

Project scenario:


Traceback (most recent call last):
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 255, in <module>
    args=(args,),
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 235, in main
    return trainer.train()
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 118, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 107, in train_loop
    self.run_step()
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 232, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 245, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=88!
loss_dict = {'loss_fcos_cls': tensor(nan, device='cuda:0', grad_fn=<DivBackward0>), 'loss_fcos_loc': tensor(0.5552, device='cuda:0', grad_fn=<DivBackward0>), 'loss_fcos_ctr': tensor(0.7676, device='cuda:0', grad_fn=<DivBackward0>), 'loss_mask': tensor(0.8649, device='cuda:0', grad_fn=<DivBackward0>), 'data_time': 0.0022056670004531043}



Cause analysis:


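The learning rate is the problem: it was too high, so the loss exploded (in the loss_dict above, loss_fcos_cls is already nan while the other loss terms are still finite). Lowering the learning rate resolves it.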
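To make the mechanism concrete, here is a minimal, self-contained sketch of how an oversized learning rate drives a loss to inf and then NaN. It is a toy example in plain Python, not the CondInst training code: the quadratic loss, the function names `run_gradient_descent` and `first_non_finite`, and the learning rates 50.0 and 0.1 are all assumptions chosen for illustration. In detectron2-based projects the base learning rate is normally set through the `SOLVER.BASE_LR` config key, so that is the value to lower.

```python
import math


def run_gradient_descent(lr, steps=200, w0=1.0):
    """Minimise the toy loss f(w) = w**2 with plain gradient descent.

    Returns the loss recorded after every update step.
    """
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * w        # derivative of w**2
        w = w - lr * grad     # equivalent to w <- w * (1 - 2 * lr)
        losses.append(w * w)  # float multiplication overflows to inf, it does not raise
    return losses


def first_non_finite(losses):
    """Return the index of the first inf/NaN loss, or None if all losses are finite."""
    for i, loss in enumerate(losses):
        if not math.isfinite(loss):
            return i
    return None


if __name__ == "__main__":
    # Hypothetical learning rates: one far too large, one small enough to converge.
    for lr in (50.0, 0.1):
        losses = run_gradient_descent(lr)
        print(f"lr={lr}: first non-finite step = {first_non_finite(losses)}, "
              f"final loss = {losses[-1]}")
```

With the large learning rate every update overshoots the minimum by a growing margin, so the loss grows geometrically until it overflows; with the small one the same loop converges. The same reasoning is why reducing the learning rate is the first thing to try when training stops with this error.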
