报错FloatingPointError: Loss became infinite or NaN at iteration=88!

简介: 报错FloatingPointError: Loss became infinite or NaN at iteration=88!

项目场景:


Traceback (most recent call last):
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 255, in <module>
    args=(args,),
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 235, in main
    return trainer.train()
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 118, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 107, in train_loop
    self.run_step()
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 232, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 245, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=88!
loss_dict = {'loss_fcos_cls': tensor(nan, device='cuda:0', grad_fn=<DivBackward0>), 'loss_fcos_loc': tensor(0.5552, device='cuda:0', grad_fn=<DivBackward0>), 'loss_fcos_ctr': tensor(0.7676, device='cuda:0', grad_fn=<DivBackward0>), 'loss_mask': tensor(0.8649, device='cuda:0', grad_fn=<DivBackward0>), 'data_time': 0.0022056670004531043}


20200805075812593.png


原因分析:


学习率的问题,导致损失爆炸了,可以把学习调整一下!

目录
相关文章
|
2月前
|
TensorFlow 算法框架/工具 Python
|
7月前
|
机器学习/深度学习
Epoch、Batch 和 Iteration 的区别详解
【8月更文挑战第23天】
942 0
|
5月前
|
机器学习/深度学习 索引 Python
Numpy学习笔记(二):argmax参数中axis=0,axis=1,axis=-1详解附代码
本文解释了NumPy中`argmax`函数的`axis`参数在不同维度数组中的应用,并通过代码示例展示了如何使用`axis=0`、`axis=1`和`axis=-1`来找到数组中最大值的索引。
326 0
Numpy学习笔记(二):argmax参数中axis=0,axis=1,axis=-1详解附代码
成功解决ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
成功解决ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
|
机器学习/深度学习 算法框架/工具
【问题记录与解决】KeyError: ‘acc‘ plt.plot(N[150:], H.history[“acc“][150:], label=“train_acc“) # KeyError: ‘
【问题记录与解决】KeyError: ‘acc‘ plt.plot(N[150:], H.history[“acc“][150:], label=“train_acc“) # KeyError: ‘
【问题记录与解决】KeyError: ‘acc‘ plt.plot(N[150:], H.history[“acc“][150:], label=“train_acc“) # KeyError: ‘
|
存储
LeetCode 329. Longest Increasing Path in a Matrix
给定一个整数矩阵,找出最长递增路径的长度。 对于每个单元格,你可以往上,下,左,右四个方向移动。 你不能在对角线方向上移动或移动到边界外(即不允许环绕)。
88 0
LeetCode 329. Longest Increasing Path in a Matrix
成功解决ValueError: Dimension 1 in both shapes must be equal, but are 1034 and 1024. Shapes are [100,103
成功解决ValueError: Dimension 1 in both shapes must be equal, but are 1034 and 1024. Shapes are [100,103
HDOJ 2131 Probability
HDOJ 2131 Probability
106 0
Can not squeeze dim[1], expected a dimension of 1, got 21
Can not squeeze dim[1], expected a dimension of 1, got 21
507 0
|
算法 C#
算法题丨3Sum Closest
描述 Given an array S of n integers, find three integers in S such that the sum is closest to a given number, target.
1307 0