Caffe training error: Check failed: error == cudaSuccess (4 vs. 0) unspecified launch failure

I0415 15:03:37.603461 27311 solver.cpp:42] Solver scaffolding done.
I0415 15:03:37.603549 27311 solver.cpp:247] Solving AlexNet
I0415 15:03:37.603559 27311 solver.cpp:248] Learning Rate Policy: step
I0415 15:03:37.749981 27311 solver.cpp:214] Iteration 0, loss = 5.45141
I0415 15:03:37.750030 27311 solver.cpp:229]     Train net output #0: loss = 5.45141 (* 1 = 5.45141 loss)
I0415 15:03:37.750048 27311 solver.cpp:489] Iteration 0, lr = 0.001
I0415 15:03:38.316994 27311 solver.cpp:214] Iteration 12, loss = 4.23865
I0415 15:03:38.317054 27311 solver.cpp:229]     Train net output #0: loss = 4.23865 (* 1 = 4.23865 loss)
I0415 15:03:38.317068 27311 solver.cpp:489] Iteration 12, lr = 0.001
I0415 15:03:38.920938 27311 solver.cpp:214] Iteration 24, loss = 2.49914
I0415 15:03:38.921000 27311 solver.cpp:229]     Train net output #0: loss = 2.49914 (* 1 = 2.49914 loss)
I0415 15:03:38.921016 27311 solver.cpp:489] Iteration 24, lr = 0.001
I0415 15:03:39.509793 27311 solver.cpp:214] Iteration 36, loss = 3.76504
I0415 15:03:39.509850 27311 solver.cpp:229]     Train net output #0: loss = 3.76504 (* 1 = 3.76504 loss)
I0415 15:03:39.509861 27311 solver.cpp:489] Iteration 36, lr = 0.001
I0415 15:03:40.080806 27311 solver.cpp:214] Iteration 48, loss = 3.74901
I0415 15:03:40.080862 27311 solver.cpp:229]     Train net output #0: loss = 3.74901 (* 1 = 3.74901 loss)
I0415 15:03:40.080878 27311 solver.cpp:489] Iteration 48, lr = 0.001
I0415 15:03:40.643797 27311 solver.cpp:214] Iteration 60, loss = 2.27091
I0415 15:03:40.643849 27311 solver.cpp:229]     Train net output #0: loss = 2.27091 (* 1 = 2.27091 loss)
I0415 15:03:40.643860 27311 solver.cpp:489] Iteration 60, lr = 0.001
I0415 15:03:41.217475 27311 solver.cpp:214] Iteration 72, loss = 2.67078
I0415 15:03:41.217541 27311 solver.cpp:229]     Train net output #0: loss = 2.67078 (* 1 = 2.67078 loss)
I0415 15:03:41.217561 27311 solver.cpp:489] Iteration 72, lr = 0.001
I0415 15:03:41.793390 27311 solver.cpp:214] Iteration 84, loss = 1.77313
I0415 15:03:41.793452 27311 solver.cpp:229]     Train net output #0: loss = 1.77313 (* 1 = 1.77313 loss)
I0415 15:03:41.793468 27311 solver.cpp:489] Iteration 84, lr = 0.001
I0415 15:03:42.362951 27311 solver.cpp:214] Iteration 96, loss = 3.49406
I0415 15:03:42.363004 27311 solver.cpp:229]     Train net output #0: loss = 3.49406 (* 1 = 3.49406 loss)
I0415 15:03:42.363025 27311 solver.cpp:489] Iteration 96, lr = 0.001
I0415 15:03:42.946568 27311 solver.cpp:214] Iteration 108, loss = 2.81601
I0415 15:03:42.946633 27311 solver.cpp:229]     Train net output #0: loss = 2.81601 (* 1 = 2.81601 loss)
I0415 15:03:42.946651 27311 solver.cpp:489] Iteration 108, lr = 0.001
I0415 15:03:43.524155 27311 solver.cpp:214] Iteration 120, loss = 2.85056
I0415 15:03:43.524247 27311 solver.cpp:229]     Train net output #0: loss = 2.85056 (* 1 = 2.85056 loss)
I0415 15:03:43.524265 27311 solver.cpp:489] Iteration 120, lr = 0.001
I0415 15:03:44.100580 27311 solver.cpp:214] Iteration 132, loss = 3.58945
I0415 15:03:44.100646 27311 solver.cpp:229]     Train net output #0: loss = 3.58945 (* 1 = 3.58945 loss)
I0415 15:03:44.100661 27311 solver.cpp:489] Iteration 132, lr = 0.001
F0415 15:03:44.536542 27311 math_functions.cpp:91] Check failed: error == cudaSuccess (4 vs. 0)  unspecified launch failure
*** Check failure stack trace: ***
    @     0x7f01dbd9ddaa  (unknown)
    @     0x7f01dbd9dce4  (unknown)
    @     0x7f01dbd9d6e6  (unknown)
    @     0x7f01dbda0687  (unknown)
    @     0x7f01dc1bb3f5  caffe::caffe_copy<>()
    @     0x7f01dc230232  caffe::BasePrefetchingDataLayer<>::Forward_gpu()
    @     0x7f01dc1d9d6f  caffe::Net<>::ForwardFromTo()
    @     0x7f01dc1da197  caffe::Net<>::ForwardPrefilled()
    @     0x7f01dc20cbe5  caffe::Solver<>::Step()
    @     0x7f01dc20d52f  caffe::Solver<>::Solve()
    @           0x406428  train()
    @           0x404961  main
    @     0x7f01db2afec5  (unknown)
    @           0x404f0d  (unknown)
    @              (nil)  (unknown)
Aborted
wangxiao@gtx-980:~/Downloads/lstm_caffe_master$


---------------------------------------------------------------------------------------------------------------------------------------

How do I fix this? Any help would be appreciated.

 

Well, all I can say is: this is where amazing happens.

I restarted my PC, ran the code again, and it worked.
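
When a run dies like this, it can help to pull out the fatal line and the last iteration that finished before it, so a rerun after the reboot can be compared against the crash. Below is a minimal sketch, assuming the output was saved as text and follows the glog layout seen in the log above (severity letter, date, time, thread id, `file:line]`, message). The function name `summarize_caffe_log` and the two regular expressions are my own illustration, not part of Caffe.

```python
import re

# glog prefix as it appears in the log above:
# <severity><mmdd> <hh:mm:ss.uuuuuu> <thread id> <file:line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)
# Matches the solver's "Iteration N, loss = X" lines only.
ITERATION = re.compile(r"Iteration (\d+), loss = ([0-9.eE+-]+)")


def summarize_caffe_log(text):
    """Return (last_iteration, last_loss, fatal_source, fatal_message).

    Hypothetical helper: stops at the first fatal (F) line. The fatal
    fields are None if the log contains no fatal line.
    """
    last_iter, last_loss = None, None
    for line in text.splitlines():
        m = GLOG_LINE.match(line.strip())
        if m is None:
            continue  # stack-trace frames, shell prompt, blank lines
        if m.group("sev") == "F":
            return last_iter, last_loss, m.group("src"), m.group("msg").strip()
        it = ITERATION.search(m.group("msg"))
        if it:
            last_iter, last_loss = int(it.group(1)), float(it.group(2))
    return last_iter, last_loss, None, None


if __name__ == "__main__":
    # Three lines copied from the log above.
    sample = "\n".join([
        "I0415 15:03:44.100580 27311 solver.cpp:214] Iteration 132, loss = 3.58945",
        "I0415 15:03:44.100661 27311 solver.cpp:489] Iteration 132, lr = 0.001",
        "F0415 15:03:44.536542 27311 math_functions.cpp:91] Check failed: "
        "error == cudaSuccess (4 vs. 0)  unspecified launch failure",
    ])
    print(summarize_caffe_log(sample))
```

For the log in this post, the helper would report iteration 132 as the last one completed and `math_functions.cpp:91` as the source of the fatal check. It only summarizes what the log already says; it does not diagnose why the launch failed.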

 
