Project scenario:
While training CondInst (built on detectron2 / AdelaiDet), the run crashes partway through with the following FloatingPointError:
Traceback (most recent call last):
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 255, in <module>
    args=(args,),
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 235, in main
    return trainer.train()
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 118, in train
    self.train_loop(self.start_iter, self.max_iter)
  File "/home/yuan/桌面/shenchunhua/CondInst-master/train_net.py", line 107, in train_loop
    self.run_step()
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 232, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/yuan/anaconda3/envs/AdelaiNet/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 245, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=88! loss_dict = {'loss_fcos_cls': tensor(nan, device='cuda:0', grad_fn=<DivBackward0>), 'loss_fcos_loc': tensor(0.5552, device='cuda:0', grad_fn=<DivBackward0>), 'loss_fcos_ctr': tensor(0.7676, device='cuda:0', grad_fn=<DivBackward0>), 'loss_mask': tensor(0.8649, device='cuda:0', grad_fn=<DivBackward0>), 'data_time': 0.0022056670004531043}
Cause analysis:
The learning rate is too high for this setup, so the classification loss (loss_fcos_cls) diverged to NaN and detectron2's anomaly check aborted training at iteration 88. Lower the learning rate (and, if needed, lengthen the warmup) and restart training.
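As a concrete illustration, here is a minimal sketch of lowering the solver settings through detectron2's config API. The numeric values below (a base LR of 0.0025 and 2000 warmup iterations) are assumptions chosen for a small-GPU-count run, not values from the original post; CondInst's own train_net.py builds its config internally, so in practice you would edit the YAML config or pass these keys as command-line overrides.

# A minimal sketch, assuming detectron2's standard SOLVER config keys.
# The concrete numbers are illustrative assumptions, not values from the post.
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.SOLVER.BASE_LR = 0.0025           # e.g. 1/4 of the common 8-GPU default of 0.01
cfg.SOLVER.WARMUP_ITERS = 2000        # a longer, gentler warmup also helps avoid early NaN losses
cfg.SOLVER.WARMUP_FACTOR = 1.0 / 2000

Equivalently, since train_net.py uses detectron2's standard argument parser, the same keys can usually be appended as overrides at the end of the training command (for example, SOLVER.BASE_LR 0.0025 after the other arguments) without touching any code.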