Fixing the "so returning NUMA node zero" error in tensorflow-gpu 2.1.0

>>> print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
2020-06-06 10:14:08.927485: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-06 10:14:08.950893: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2793605000 Hz
2020-06-06 10:14:08.951424: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x562e7913f720 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-06 10:14:08.951449: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-06-06 10:14:08.953797: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-06-06 10:14:09.223937: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-06 10:14:09.224406: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x562e792142e0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-06-06 10:14:09.224427: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 950M, Compute Capability 5.0
2020-06-06 10:14:09.224580: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-06 10:14:09.224939: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 950M computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.95GiB deviceMemoryBandwidth: 26.82GiB/s
2020-06-06 10:14:09.225192: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-06 10:14:09.227247: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-06 10:14:09.228516: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-06 10:14:09.228872: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-06 10:14:09.230221: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-06 10:14:09.231062: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-06 10:14:09.233700: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-06 10:14:09.233878: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-06 10:14:09.234374: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-06 10:14:09.234811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-06-06 10:14:09.263649: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-06 10:14:09.286790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-06 10:14:09.287059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-06-06 10:14:09.287107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-06-06 10:14:09.303423: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-06 10:14:09.303947: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-06 10:14:09.304356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 3708 MB memory) -> physical GPU (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0, compute capability: 5.0)
Default GPU Device: /device:GPU:0


Solution:


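The NUMA lines above are logged at INFO level (the "I" after the timestamp), and the GPU is still detected as /device:GPU:0, so they only clutter the output. To hide them, add the following lines to your code, before importing tensorflow: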


import os
# '2' hides INFO and WARNING messages from TensorFlow's C++ backend.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
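
The variable only takes effect if it is set before tensorflow is first imported. The sketch below wraps the setting in a small helper that also reports whether it was applied in time. The helper name silence_tf_cpp_logs is hypothetical and not part of TensorFlow; only the TF_CPP_MIN_LOG_LEVEL variable comes from the fix above.

```python
import os
import sys


def silence_tf_cpp_logs(level: str = "2") -> bool:
    """Set TF_CPP_MIN_LOG_LEVEL (hypothetical helper, not a TensorFlow API).

    Levels: "0" shows everything, "1" hides INFO, "2" hides INFO and
    WARNING, "3" hides INFO, WARNING and ERROR.
    Returns False if tensorflow was already imported, in which case the
    setting may come too late to affect the startup messages.
    """
    already_imported = "tensorflow" in sys.modules
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level
    return not already_imported


applied_in_time = silence_tf_cpp_logs("2")

# Import tensorflow only after the variable has been set, for example:
# import tensorflow as tf
# print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
```

Level "2" still lets ERROR messages through, so genuine failures remain visible while the NUMA notices are hidden.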

