
Why can't TensorFlow use GPU acceleration in PAI DSW, and how do I enable it?

It also prints:
Num GPUs Available: 0
Available GPUs: []

tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-05-26 15:03:10.885007: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-05-26 15:03:10.914951: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-26 15:03:10.914975: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-26 15:03:10.914996: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-26 15:03:10.920456: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-05-26 15:03:10.920647: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropr

aliyun8337357285-43269 2024-05-26 15:39:15
2 answers
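  • "Num GPUs Available: 0" mainly means the environment is not configured correctly or the required NVIDIA driver and CUDA/cuDNN libraries are missing, so the system cannot detect or use the GPU. There may also be a library conflict or duplicate registration. Check for redundant library paths or environment variable settings, and make sure the environment is clean and correctly configured.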

    2024-05-27 10:08:02
    Upvotes: 2
  • President, Beijing Alibaba Cloud ACE

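    cuDNN registration issue: the error message "Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered" indicates that a cuDNN factory has already been registered, which may point to a version conflict or an initialization problem.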

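    cuFFT and cuBLAS registration issues: the same kind of registration message appears for cuFFT and cuBLAS, which may be related to the GPU operation libraries TensorFlow is trying to load.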

    2024-05-27 08:04:31
    Upvotes: 1
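
Both answers come down to checking what the environment actually exposes, so here is a minimal diagnostic sketch that could be run in a DSW notebook cell or terminal. The function names `driver_report` and `tensorflow_report` are made up for this example. It assumes `nvidia-smi` is on `PATH` whenever an NVIDIA driver is visible to the instance, which is usual but not guaranteed. It only reports facts and does not change anything.

```python
import os
import shutil
import subprocess


def driver_report():
    """Collect driver-level facts that do not depend on TensorFlow."""
    report = {
        # nvidia-smi ships with the NVIDIA driver; None usually means no driver is visible.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        # An empty string or "-1" here hides every GPU from CUDA applications.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        # Extra or stale entries here are one place library conflicts can come from.
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        "gpus_from_nvidia_smi": [],
    }
    if report["nvidia_smi_path"]:
        try:
            result = subprocess.run(
                [report["nvidia_smi_path"], "-L"],  # -L lists one GPU per line
                capture_output=True, text=True, timeout=30, check=False,
            )
        except (OSError, subprocess.TimeoutExpired):
            return report
        if result.returncode == 0:
            report["gpus_from_nvidia_smi"] = [
                line.strip()
                for line in result.stdout.splitlines()
                if line.strip().startswith("GPU ")
            ]
    return report


def tensorflow_report():
    """Ask TensorFlow how it was built and which GPUs it can see."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": tf.__version__,
        # False means a CPU-only TensorFlow build is installed.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for title, report in (("driver", driver_report()), ("tensorflow", tensorflow_report())):
        print(title)
        for key, value in report.items():
            print(f"  {key}: {value}")
```

A rough way to read the result: if `nvidia_smi_path` is `None` or `gpus_from_nvidia_smi` is empty, the instance itself probably has no GPU or driver visible, which would match the "Could not find cuda drivers on your machine" line in the log, and reinstalling TensorFlow alone is unlikely to help. If the driver lists a GPU but `built_with_cuda` is `False` or `gpus` is empty, the TensorFlow build or its CUDA/cuDNN libraries are the more likely place to look.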
