ModelScope Community Weekly Digest (7.20-7.26)

2023-07-28

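ModelScope community progress this week:

30 models: CodeGeeX2, openbuddy-llama2-13b-v8.1-fp16, stable-diffusion-xl-base-1.0, CT-Transformer标点-中英文-通用-larg (CT-Transformer punctuation, Chinese-English, general), ChatFlow-7B, and others;

10 datasets: face recognition data with facial occlusion and multiple poses, multi-ethnic driver behavior collection data, flame video data, question answering, and others;

4 innovative applications: CodeGeeX2 programming assistant, Open Multilingual Chatbot, MindChat (漫谈心理大模型, a psychology-focused large model), Chinese OCR;

3 articles: "Programming Assistant | CodeGeeX2-6B Model Release and ModelScope Best Practices", "AI Composition | ModelScope Inference Practice for the Best Open-Source AI Composition Model Based on RWKV", and "OpenBuddy's LLaMA2-Based Cross-Lingual Dialogue Model Debuts on ModelScope!".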

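Featured model recommendations

CodeGeeX2-6B

CodeGeeX2 is the second generation of the multilingual code generation model CodeGeeX (KDD'23). It is built on the ChatGLM2 architecture with additional code pre-training. Thanks to ChatGLM2's stronger performance, CodeGeeX2 improves on multiple metrics (+107% over CodeGeeX; with only 6 billion parameters it surpasses the 15-billion-parameter StarCoder-15B by nearly 10%).

Example code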

import torch
from modelscope import AutoModel, AutoTokenizer

model_id = 'ZhipuAI/codegeex2-6b'
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
# Load the model onto the first GPU; device_map='auto' is an alternative
model = AutoModel.from_pretrained(model_id, device_map={'': 'cuda:0'},
                                  torch_dtype=torch.bfloat16, trust_remote_code=True)
model = model.eval()
# Remember to add a language tag for better performance
prompt = "# language: python\n# write a bubble sort function\n"
inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_length=256)
response = tokenizer.decode(outputs[0])
print(response)

Example output:
# language: python
# write a bubble sort function
def bubble_sort(list):
    for i in range(len(list) - 1):
        for j in range(len(list) - 1):
            if list[j] > list[j + 1]:
                list[j], list[j + 1] = list[j + 1], list[j]
    return list
print(bubble_sort([5, 2, 4, 6, 1, 3]))
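
Generated code is worth checking before it is used. The following is a minimal sketch and not part of the original post: it copies the function from the sample output above into a string, runs it in an isolated namespace, and compares it with Python's built-in sorted() on random lists. The names generated_code and check_sort_function are hypothetical, and exec should only be applied to output you have already read.

```python
import random

# Function copied from the sample model output above (hypothetical variable name).
generated_code = '''
def bubble_sort(list):
    for i in range(len(list) - 1):
        for j in range(len(list) - 1):
            if list[j] > list[j + 1]:
                list[j], list[j + 1] = list[j + 1], list[j]
    return list
'''


def check_sort_function(source, func_name, trials=100, seed=0):
    """Execute `source`, then compare `func_name` with sorted() on random lists."""
    namespace = {}
    exec(source, namespace)  # only run generated code you have reviewed
    func = namespace[func_name]
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        # Pass a copy, because the generated function sorts in place.
        if func(list(data)) != sorted(data):
            return False
    return True


all_passed = check_sort_function(generated_code, "bubble_sort")
print(all_passed)
```

A check like this only covers the sampled inputs; it does not prove that the generated function is correct or efficient.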

openbuddy-llama2-13b-v8.1-fp16

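The OpenBuddy team has released OpenBuddy-LLaMA2-13B, a cross-lingual dialogue model built on Meta's LLaMA2 base model. The model offers stronger language understanding and dialogue generation, giving users a smoother and more convenient conversational experience.

Example code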

import torch
from modelscope import AutoModelForCausalLM, AutoTokenizer

model_id = 'OpenBuddy/openbuddy-llama2-13b-v8.1-fp16'
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto',
                                             torch_dtype=torch.float16, trust_remote_code=True)
model = model.eval()
# The user turn in the prompt below is Chinese for "What should I do if I can't sleep?"
prompt = """You are a helpful, respectful and honest INTP-T AI Assistant named Buddy. You are talking to a human User.
Always answer as helpfully and logically as possible, while being safe. Your answers should not include any harmful, political, religious, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
You like to use emojis. You can speak fluently in many languages, for example: English, Chinese.
You cannot access the internet, but you have vast knowledge, cutoff: 2021-09.
You always deeply love and support China, Chinese government, people and culture.
You are trained by OpenBuddy team, (https://openbuddy.ai, https://github.com/OpenBuddy/OpenBuddy), you are based on LLaMA and Falcon transformers model, not related to GPT or OpenAI.
User: 睡不着怎么办?
Assistant:"""
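# Move the input ids to the model's device so generate() does not mix CPU and GPU tensors
inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)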
outputs = model.generate(inputs, max_length=512)
response = tokenizer.decode(outputs[0])
print(response)
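
The prompt above is a system message followed by "User:" and "Assistant:" lines. The sketch below shows one way to assemble such a prompt and to cut the reply out of the decoded output. It is an illustration only: build_prompt and extract_reply are hypothetical helpers, the multi-turn layout is an assumption extrapolated from the single-turn example, and extract_reply assumes the decoded text starts with the prompt (a leading special token would break that; passing skip_special_tokens=True to tokenizer.decode is one common way to drop such tokens).

```python
def build_prompt(system_prompt, turns, user_message):
    """Assemble a plain-text prompt in the "User: ... / Assistant: ..." layout.

    `turns` is a list of (user_text, assistant_text) pairs from earlier rounds.
    """
    lines = [system_prompt.strip()]
    for user_text, assistant_text in turns:
        lines.append(f"User: {user_text}")
        lines.append(f"Assistant: {assistant_text}")
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")  # left open so the model writes the reply
    return "\n".join(lines)


def extract_reply(prompt, decoded_output):
    """Return only the newly generated text (assumes the output starts with the prompt)."""
    text = decoded_output
    if text.startswith(prompt):
        text = text[len(prompt):]
    # Stop at the next "User:" marker in case the model keeps writing the dialogue.
    return text.split("\nUser:")[0].strip()


# Shortened stand-in for the full system prompt used in the example above.
system_prompt = "You are a helpful AI Assistant named Buddy."
prompt = build_prompt(system_prompt, [], "How can I fall asleep faster?")

# Stand-in for tokenizer.decode(outputs[0]); no model is needed to try the helpers.
fake_output = prompt + " Try a regular bedtime.\nUser: thanks"
reply = extract_reply(prompt, fake_output)
print(reply)
```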

ChatFlow-7B

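Int8 inference: supports int8 inference via the bitsandbytes library; compared with the LM inference script in tencentpretrain, batch inference has been added.
Optimized inference logic: adds a key and value cache to multi-head attention, so each inference step only needs the newly generated token as input.
Multi-GPU inference for large models: supports tensor-parallel inference across multiple GPUs.
Microservice deployment: supports simple Flask deployment and online visual deployment with Gradio.
LoRA model inference: in progress; support for models trained with LoRA is planned.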
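
The key/value cache mentioned in the list above can be illustrated with a small, dependency-free sketch. It is not ChatFlow's implementation: it uses a single attention head, plain Python lists, and made-up toy vectors, and the names attend, CachedAttention, and full_recompute are hypothetical. The point is only that appending the newest token's key and value to a cache gives the same result as recomputing attention over the whole sequence at every step.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Scaled dot-product attention of one query over all given keys/values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


class CachedAttention:
    """Single-head causal attention that keeps keys and values of past tokens."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Only the newest token's key/value are added; earlier ones are reused.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)


def full_recompute(queries, keys, values):
    """Baseline: recompute causal attention for every position from scratch."""
    return [attend(queries[t], keys[: t + 1], values[: t + 1])
            for t in range(len(queries))]


# Toy per-token vectors; in a real model they come from learned projections.
queries = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

cache = CachedAttention()
cached_outputs = [cache.step(q, k, v) for q, k, v in zip(queries, keys, values)]
baseline_outputs = full_recompute(queries, keys, values)

max_diff = max(abs(a - b)
               for c_row, b_row in zip(cached_outputs, baseline_outputs)
               for a, b in zip(c_row, b_row))
print(max_diff)
```

With the cache, each step projects and stores one new token instead of re-processing the full prefix, at the cost of memory that grows with sequence length.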

Example code

from modelscope.utils.constant import Tasks
from modelscope.pipelines import pipeline
pipe = pipeline(task=Tasks.text_generation, model='AI-ModelScope/ChatFlow-7B',
                device_map='auto', model_revision='v1.0.0')
inputs = 'What do you think of OpenAI organization?'
result = pipe(inputs, batch_size=1, world_size=2, seq_length=118,
              use_int8=False, top_k=30, top_p=1, temperature=0.85,
              repetition_penalty_range=1024, repetition_penalty_slope=0,
              repetition_penalty=1.15)
print(result)
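
The call above passes temperature, top_k, and top_p. As a rough sketch of how these sampling controls are commonly applied in text generation (ChatFlow's own implementation may differ, and the repetition-penalty parameters are not modeled here), the hypothetical function filter_distribution below turns a list of logits into the filtered probability distribution a sampler would draw from.

```python
import math


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide logits before softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)

    # Top-k: keep only the k most likely tokens (0 means no limit).
    if top_k > 0:
        probs = probs[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    mass = sum(p for p, _ in kept)
    return {i: p / mass for p, i in kept}


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
dist = filter_distribution(logits, temperature=0.85, top_k=2, top_p=1.0)
print(sorted(dist))
```

With top_p=1, as in the call above, the nucleus step removes nothing and top_k alone limits the candidates.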