Developer Community > ModelScope Model-as-a-Service

Can chatglm2-6b really not run in notebook GPU mode?

In notebook GPU mode, the first run of the sample code (prompt: "Introduce Tsinghua University") returned successfully. I then wrote a Python script to generate outputs in batch, and it fails immediately with a GPU out-of-memory error:

Traceback (most recent call last):
  File "glm.py", line 63, in <module>
    main()
  File "glm.py", line 60, in main
    generate_and_save_articles(model, input_file, output_dir)
  File "glm.py", line 23, in generate_and_save_articles
    article = generate_article(model, keyword)
  File "glm.py", line 9, in generate_article
    result = pipe(inputs)
  File "/opt/conda/lib/python3.8/site-packages/modelscope/pipelines/base.py", line 219, in __call__
    output = self._process_single(input, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/modelscope/pipelines/base.py", line 254, in _process_single
    out = self.forward(out, **forward_params)
  File "/opt/conda/lib/python3.8/site-packages/modelscope/pipelines/nlp/text_generation_pipeline.py", line 274, in forward
    return self.model.chat(inputs, self.tokenizer)
  File "/opt/conda/lib/python3.8/site-packages/modelscope/models/nlp/chatglm2/text_generation.py", line 1432, in chat
    response, history = self._chat(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/modelscope/models/nlp/chatglm2/text_generation.py", line 1204, in _chat
    outputs = self.generate(**inputs, **gen_kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 1572, in generate
    return self.sample(
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 2619, in sample
    outputs = self(
  File "/opt/conda/lib/python3.8/site-packages/modelscope/models/base/base_torch_model.py", line 36, in __call__
    return self.postprocess(self.forward(*args, **kwargs))
  File "/opt/conda/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/modelscope/models/nlp/chatglm2/text_generation.py", line 1094, in forward
    lm_logits = self.transformer.output_layer(hidden_states)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/accelerate/hooks.py", line 160, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/accelerate/hooks.py", line 286, in pre_forward
    set_module_tensor_to_device(
  File "/opt/conda/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 298, in set_module_tensor_to_device
    new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 508.00 MiB (GPU 0; 15.90 GiB total capacity; 2.04 GiB already allocated; 494.81 MiB free; 2.05 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
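A common first fix for OOM on repeated pipeline calls is to release memory between generations. Below is a minimal sketch of such a batch loop; `generate_fn` and `free_cache` are placeholders I introduce for illustration — with the ModelScope pipeline you would pass `pipe` and `torch.cuda.empty_cache`:

```python
import gc

def generate_batch(generate_fn, keywords, free_cache=None):
    """Run generate_fn over keywords one at a time, releasing memory
    between calls so GPU allocations do not accumulate across iterations."""
    results = []
    for kw in keywords:
        results.append(generate_fn(kw))
        gc.collect()           # drop dangling Python references first
        if free_cache is not None:
            free_cache()       # e.g. torch.cuda.empty_cache()
    return results
```

This will not help if a single generation already exceeds the card, but it prevents cached allocations from past iterations from crowding out the next one.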

游客5zsydrlr4jitk · 2023-08-10 10:26:01
1 answer
  • chatglm2-6b is a fairly large model and needs a lot of memory. In notebook GPU mode you may not be able to run chatglm2-6b.

    You can try the following:

    Use a GPU with more memory.
    Run chatglm2-6b in a local environment.
    Use fewer epochs (relevant when training/fine-tuning).
    Use a smaller batch size.

    If you still cannot run chatglm2-6b, you can reach out to the ModelScope community for help.
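The point about model size can be quantified with a rough back-of-envelope estimate (a hypothetical helper; it counts fp16 weights only and ignores activations and the KV cache):

```python
def model_weight_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GPU memory needed for model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# chatglm2-6b has roughly 6.2e9 parameters; in fp16 (2 bytes each)
# the weights alone occupy about 11.5 GiB of the 15.90 GiB card,
# leaving little headroom for activations and generation state.
weights_gib = model_weight_gib(6.2e9)
```

Note, though, that the error message above reports only ~2 GiB allocated with ~0.5 GiB free, which suggests other processes or fragmentation are also eating into the 16 GiB.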

    2023-09-25 16:55:20

ModelScope aims to build a next-generation open-source Model-as-a-Service sharing platform, providing AI developers with flexible, easy-to-use, low-cost, one-stop model services that make applying models simpler. Join the community: WeChat official account 魔搭ModelScope社区; DingTalk group 44837352.
