When running the official code example for ZhipuAI/Multilingual-GLM-Summarization-zh, it fails with the error:
AttributeError: MGLMTextSummarizationPipeline: module 'megatron_util.mpu' has no attribute 'get_model_parallel_rank'
The environment is based on the official ModelScope Docker image; I tried several image versions and got the same result every time. I also tried reinstalling various versions of megatron_util with pip uninstall megatron_util && pip install megatron_util -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html, but it made no difference.
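For reference, test.py is essentially the official sample; the sketch below is reconstructed from the traceback, the preprocessor class name is taken from the model card and may differ across modelscope versions, and the input text is abbreviated here:

# test.py -- minimal reproduction following the official example
# NOTE: MGLMSummarizationPreprocessor is the class named in the model card;
# it may differ in other modelscope versions. The input text is truncated.
from modelscope.pipelines import pipeline
from modelscope.preprocessors import MGLMSummarizationPreprocessor
from modelscope.utils.constant import Tasks

model = 'ZhipuAI/Multilingual-GLM-Summarization-zh'
preprocessor = MGLMSummarizationPreprocessor()
pipe = pipeline(
    task=Tasks.text_summarization,
    model=model,
    preprocessor=preprocessor,
)
result = pipe(
    '据中国载人航天工程办公室消息,北京时间2022年10月25日,梦天实验舱与长征五号B遥四运载...'  # truncated
)
print(result)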
The full error output is as follows:
root@dc19cfd686f9:/app# python test.py
2023-11-19 03:45:04,162 - modelscope - INFO - PyTorch version 1.11.0+cu113 Found.
2023-11-19 03:45:04,163 - modelscope - INFO - Loading ast index from /mnt/workspace/.cache/modelscope/ast_indexer
2023-11-19 03:45:04,181 - modelscope - INFO - Loading done! Current index file version is 1.6.1, with md5 7ef0ea14f2ab92de0a689c4b4f9dd76d and a total number of 849 components indexed
2023-11-19 03:45:05,183 - modelscope - INFO - Model revision not specified, use the latest revision: v1.0.1
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[2023-11-19 03:45:06,171] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
WARNING: No training data specified
using world size: 1 and model-parallel size: 1
> using dynamic loss scaling
megatron initialized twice
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /opt/conda/lib/python3.7/site-packages/modelscope/utils/registry.py:212 in build_from_cfg │
│ │
│ 209 │ │ if hasattr(obj_cls, '_instantiate'): │
│ 210 │ │ │ return obj_cls._instantiate(**args) │
│ 211 │ │ else: │
│ ❱ 212 │ │ │ return obj_cls(**args) │
│ 213 │ except Exception as e: │
│ 214 │ │ # Normal TypeError does not print class name. │
│ 215 │ │ raise type(e)(f'{obj_cls.__name__}: {e}') │
│ │
│ /opt/conda/lib/python3.7/site-packages/modelscope/pipelines/nlp/mglm_text_summarization_pipeline │
│ .py:29 in __init__ │
│ │
│ 26 │ │ │ │ *args, │
│ 27 │ │ │ │ **kwargs): │
│ 28 │ │ model = MGLMForTextSummarization(model) if isinstance(model, │
│ ❱ 29 │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ str) else model │
│ 30 │ │ self.model = model │
│ 31 │ │ self.model.eval() │
│ 32 │ │ if preprocessor is None: │
│ │
│ /opt/conda/lib/python3.7/site-packages/modelscope/models/nlp/mglm/mglm_for_text_summarization.py │
│ :371 in __init__ │
│ │
│ 368 │ │ # setting default batch size to 1 │
│ 369 │ │ self.args.batch_size = 1 │
│ 370 │ │ self.args.tokenizer_path = model_dir │
│ ❱ 371 │ │ self.tokenizer = prepare_tokenizer(self.args) │
│ 372 │ │ self.model = setup_model(self.args) │
│ 373 │ │ self.cfg = Config.from_file( │
│ 374 │ │ │ osp.join(model_dir, ModelFile.CONFIGURATION)) │
│ │
│ /opt/conda/lib/python3.7/site-packages/modelscope/models/nlp/mglm/configure_data.py:147 in │
│ prepare_tokenizer │
│ │
│ 144 │ │ add_task_mask=args.task_mask, │
│ 145 │ │ add_decoder_mask=args.block_mask_prob > 0.0 │
│ 146 │ │ or args.context_mask_ratio > 0.0) │
│ ❱ 147 │ if mpu.get_model_parallel_rank() == 0: │
│ 148 │ │ num_tokens = tokenizer.num_tokens │
│ 149 │ │ eod_token = tokenizer.get_command('eos').Id │
│ 150 │ │ assert eod_token == tokenizer.get_command('pad').Id │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: module 'megatron_util.mpu' has no attribute 'get_model_parallel_rank'
During handling of the above exception, another exception occurred:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /app/test.py:11 in <module> │
│ │
│ 8 pipe = pipeline( │
│ 9 │ task=Tasks.text_summarization, │
│ 10 │ model=model, │
│ ❱ 11 │ preprocessor=preprocessor, │
│ 12 ) │
│ 13 result = pipe( │
│ 14 │ '据中国载人航天工程办公室消息,北京时间2022年10月25日,梦天实验舱与长征五号B遥四运载 │
│ │
│ /opt/conda/lib/python3.7/site-packages/modelscope/pipelines/builder.py:140 in pipeline │
│ │
│ 137 │ if preprocessor is not None: │
│ 138 │ │ cfg.preprocessor = preprocessor │
│ 139 │ │
│ ❱ 140 │ return build_pipeline(cfg, task_name=task) │
│ 141 │
│ 142 │
│ 143 def add_default_pipeline_info(task: str, │
│ │
│ /opt/conda/lib/python3.7/site-packages/modelscope/pipelines/builder.py:57 in build_pipeline │
│ │
│ 54 │ │ default_args (dict, optional): Default initialization arguments. │
│ 55 │ """ │
│ 56 │ return build_from_cfg( │
│ ❱ 57 │ │ cfg, PIPELINES, group_key=task_name, default_args=default_args) │
│ 58 │
│ 59 │
│ 60 def pipeline(task: str = None, │
│ │
│ /opt/conda/lib/python3.7/site-packages/modelscope/utils/registry.py:215 in build_from_cfg │
│ │
│ 212 │ │ │ return obj_cls(**args) │
│ 213 │ except Exception as e: │
│ 214 │ │ # Normal TypeError does not print class name. │
│ ❱ 215 │ │ raise type(e)(f'{obj_cls.__name__}: {e}') │
│ 216 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: MGLMTextSummarizationPipeline: module 'megatron_util.mpu' has no attribute 'get_model_parallel_rank'
Based on the error message you provided, the problem is that the get_model_parallel_rank attribute cannot be found in the megatron_util.mpu module. This is most likely caused by an incompatible version of the megatron_util library.
You can try the following steps to resolve it:
1. Uninstall the megatron_util library:
   pip uninstall megatron_util -y
2. Make sure PYTHONPATH contains the correct path to the megatron_util library. You can inspect PYTHONPATH with:
   echo $PYTHONPATH
   If it does not contain the correct path, add it with:
   export PYTHONPATH=$PYTHONPATH:/path/to/megatron_util
   Replace /path/to/megatron_util with the actual path to the megatron_util library.
3. Reinstall the megatron_util library:
   pip install megatron_util -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
After completing these steps, re-run your code and the problem should be resolved.
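To confirm which megatron_util build is actually being imported and whether it exposes the attribute at all, a quick check along these lines can help (a minimal diagnostic sketch; nothing here assumes a particular megatron_util version):

# check_mpu.py -- which megatron_util is loaded, and does its mpu module
# expose get_model_parallel_rank?
import megatron_util
from megatron_util import mpu

print('megatron_util location:', megatron_util.__file__)
print('megatron_util version :', getattr(megatron_util, '__version__', 'unknown'))
print('has get_model_parallel_rank:', hasattr(mpu, 'get_model_parallel_rank'))
# list similarly named attributes, in case the API was renamed in this version
print([name for name in dir(mpu) if 'rank' in name.lower()])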
This error occurs because the megatron_util library is missing a function named get_model_parallel_rank(), which is likely due to a version incompatibility of the library.
To resolve this, you can try the following: