RuntimeError: Address already in use

简介: RuntimeError: Address already in use

问题描述:Pytorch用多张GPU训练时,会报地址已被占用的错误。其实是端口号冲突了。


20201026092122255.png


因此解决方法要么kill原来的进程,要么修改端口号。


在代码里重新配置


torch.distributed.init_process_group()
    dist_init_method = 'tcp://{master_ip}:{master_port}'.format(master_ip='127.0.0.1', master_port='10000')
    dist_world_size = opt.world_size    #total number of distributed processes.
    torch.distributed.init_process_group(backend="nccl", init_method=dist_init_method, world_size=dist_world_size, rank=[0,1])


每次只要重新修改master_port

目录
相关文章
|
9月前
|
Java Maven Android开发
成功解决FATAL ERROR in native method: JDWP on getting class status, jvmtiError=JVMTI_ERROR_WRONG_PHASE
成功解决FATAL ERROR in native method: JDWP on getting class status, jvmtiError=JVMTI_ERROR_WRONG_PHASE
|
机器学习/深度学习 Windows
raise RuntimeError(‘Error(s) in loading state_dict for {}:\n\t{}‘.format( RuntimeError: Error(s)..报错
即load_state_dict(fsd,strict=False) 属性strict;当strict=True,要求预训练练权重层数的键值与新构建的模型中的权重层数名称完全吻合;
1215 0
|
并行计算 PyTorch 算法框架/工具
RuntimeError: CUDA error (10): invalid device ordinal
造成这个错误的原因主要是本地只有一个 GPU (GPU:0),而程序中使用 GPUs:1。
300 0
|
C语言
error: implicit declaration of function ‘VerifyFixClassname‘ is invalid in C99 [-Werror,-Wimplicit-f
error: implicit declaration of function ‘VerifyFixClassname‘ is invalid in C99 [-Werror,-Wimplicit-f
119 0
|
资源调度 JavaScript
The futex facility returned an unexpected error code
The futex facility returned an unexpected error code
618 0
|
Python
SyntaxError: Missing parentheses in call to 'print'
SyntaxError: Missing parentheses in call to 'print'
107 0
|
并行计算 PyTorch 算法框架/工具
CUDA unknown error - this may be due to an incorrectly set up environment 问题解决
CUDA unknown error - this may be due to an incorrectly set up environment 问题解决
CUDA unknown error - this may be due to an incorrectly set up environment 问题解决
repeated call of attachBrowserEvent
Created by Jerry Wang, last modified on Jun 19, 2015
repeated call of attachBrowserEvent