modelscope-funasr: what is causing this problem, and how can I troubleshoot and fix it?
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/logging/__init__.py", line 1100, in emit
msg = self.format(record)
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/logging/__init__.py", line 943, in format
return fmt.format(record)
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/torch/_logging/_internal.py", line 722, in format
artifact_name = getattr(logging.getLogger(record.name), "artifact_name", None)
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/logging/__init__.py", line 2079, in getLogger
return Logger.manager.getLogger(name)
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/logging/__init__.py", line 1333, in getLogger
if isinstance(rv, PlaceHolder):
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 76, in _terminate_process_handler
raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 2058071 got signal: 1
Call stack:
File "/home/ubuntu/miniconda3/envs/funASR_train/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
return f(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/torch/distributed/run.py", line 879, in main
run(args)
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/torch/distributed/run.py", line 870, in run
elastic_launch(
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/ubuntu/miniconda3/envs/funASR_train/lib/python3.10/site-packages/t