Does ModelScope plan to build an acceleration framework along the lines of huggingface-accelerate / deepspeed? I am hoping to speed up model inference.

I recently benchmarked SeqGPT and found that inference through the ModelScope framework seems much slower than through huggingface.
I tried it myself; with batchsize=1 the two run at essentially the same speed. Benchmark script:

from modelscope import pipeline as pipeline_ms, Model, Tasks, snapshot_download
from transformers import pipeline as pipeline_hf, AutoTokenizer
import time
# Build the SeqGPT prompt. The Chinese fields mean:
# task = 'extraction', text = 'Hangzhou welcomes you.', labels = 'place name'.
inputs = {'task': '抽取', 'text': '杭州欢迎你。', 'labels': '地名'}
# Template reads '输入 (input): {text}\n{task}: {labels}\n输出 (output): '.
PROMPT_TEMPLATE = '输入: {text}\n{task}: {labels}\n输出: '
s = PROMPT_TEMPLATE.format(**inputs)
# ModelScope pipeline on GPU, pinned to a fixed model revision.
p_ms = pipeline_ms(task=Tasks.text_generation, model='damo/nlp_seqgpt-560m',
                   device='cuda:0', model_revision='v1.0.1')
print(p_ms(s))
# Time 100 generations through the ModelScope pipeline.
t1 = time.time()
for i in range(100):
    p_ms(s)
print(f'ms time cost: {time.time()-t1}')
# Load the same checkpoint through the huggingface pipeline for comparison.
model_dir = snapshot_download('damo/nlp_seqgpt-560m')
model = Model.from_pretrained(model_dir)
tokenizer = AutoTokenizer.from_pretrained(model_dir, padding=True,
                                          truncation=True, max_length=1024)
p_hf = pipeline_hf("text-generation", model=model, tokenizer=tokenizer, device=0)
print(p_hf(s, num_beams=4, do_sample=False, max_new_tokens=256))
# Time 100 generations through the huggingface pipeline with the same settings.
t1 = time.time()
for i in range(100):
    p_hf(s, num_beams=4, do_sample=False, max_new_tokens=256)
print(f'hf time cost: {time.time()-t1}')

At the moment this model does not yet support batchsize > 1 on our side; we expect to add that quickly in an upcoming iteration.

(Answer compiled from the DingTalk group: ModelScope Developer Alliance Group ①)
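Until batched inference lands in the ModelScope pipeline, one workaround is to batch through the transformers tokenizer and model directly. The sketch below is illustrative only: it assumes the checkpoint loads as a standard AutoModelForCausalLM, it reuses model_dir and PROMPT_TEMPLATE from the script above, and the left padding plus eos-as-pad fallback are generic decoder-only conventions, not documented SeqGPT requirements.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: model_dir is the snapshot downloaded in the script above.
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir).to('cuda:0').eval()

# Decoder-only models generally need left padding for batched generation;
# using eos as the pad token is a fallback assumption if none is defined.
tokenizer.padding_side = 'left'
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Two prompts batched together (the second sentence is made-up example data).
prompts = [
    PROMPT_TEMPLATE.format(task='抽取', text='杭州欢迎你。', labels='地名'),
    PROMPT_TEMPLATE.format(task='抽取', text='北京欢迎你。', labels='地名'),
]
batch = tokenizer(prompts, return_tensors='pt', padding=True).to('cuda:0')

with torch.no_grad():
    out = model.generate(**batch, num_beams=4, do_sample=False, max_new_tokens=256)

# generate returns prompt + continuation per row; with left padding the prompt
# occupies the first input_ids.shape[1] positions, so slice it off before decoding.
gen = out[:, batch['input_ids'].shape[1]:]
print(tokenizer.batch_decode(gen, skip_special_tokens=True))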