RuntimeError: Address already in use

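Summary: RuntimeError: Address already in use

Problem: When training on multiple GPUs with PyTorch, the run fails with an "Address already in use" error. The underlying cause is a port conflict: the port requested for distributed initialization is already held by another process.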




So there are two ways to fix it: either kill the process that is holding the port, or switch to a different port number.
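Before choosing between the two fixes, it can help to confirm that the port really is occupied. The following is a minimal sketch using only the standard library; the helper name `port_in_use` is hypothetical and not part of PyTorch, and the port 10000 is simply the value used in this post.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails, i.e. the port is taken.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Bind failed, most commonly because the address is already in use.
            return True
        return False


if __name__ == "__main__":
    if port_in_use(10000):
        print("Port 10000 is occupied: kill the old process or pick another port.")
    else:
        print("Port 10000 looks free.")
```

On Linux, a tool such as `lsof -i :10000` can show which process holds the port, so that a leftover training process can be stopped.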


Reconfigure the port in code


torch.distributed.init_process_group()
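    import torch.distributed

    # Rendezvous address of the master process; change master_port if 10000 is taken.
    dist_init_method = 'tcp://{master_ip}:{master_port}'.format(master_ip='127.0.0.1', master_port='10000')
    dist_world_size = opt.world_size    # total number of distributed processes
    # rank must be a single int unique to each process (0 .. world_size - 1), not a list.
    # opt.rank is an assumed option holding this process's rank.
    torch.distributed.init_process_group(backend="nccl", init_method=dist_init_method, world_size=dist_world_size, rank=opt.rank)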


Whenever the conflict shows up again, just change master_port to another free port.
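Rather than editing the port by hand each time, one option is to ask the operating system for a free port. This is a rough sketch, not the original author's method: `find_free_port` and `build_init_method` are hypothetical helper names, and it assumes the launcher picks the port once and passes the same value to every rank, since all processes must rendezvous on one address.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]


def build_init_method(master_ip="127.0.0.1", master_port=None):
    """Build the tcp:// init_method string in the same format used above."""
    if master_port is None:
        master_port = find_free_port(master_ip)
    return 'tcp://{master_ip}:{master_port}'.format(
        master_ip=master_ip, master_port=master_port)


if __name__ == "__main__":
    # Choose the port once in the launcher, then hand this same string
    # to every process as init_method.
    print(build_init_method())
```

The port is released before the training processes bind to it, so another program could in principle grab it in between; in practice the window is short, but retrying with a new port is a reasonable fallback.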
