1. 导读

3月6日，零一万物发布并开源了Yi系列中的“理科状元”——Yi-9B。Yi-9B 是目前 Yi 系列模型中代码和数学能力最强的模型，实际参数为 8.8B，默认上下文长度4K tokens，是在 Yi-6B （使用了 3.1T tokens 训练）的基础上，使用了 0.8T tokens 进行继续训练。

官方总结Yi-9B的核心模型优势在于：

1. 消费级显卡可用，使用成本友好：

Yi-9B（BF 16）和其量化版 Yi-9B（Int8）都能在消费级显卡上轻松部署，使用成本较低，开发者友好。

表 1：Yi-9B 推理资源

2. 代码和数学能力出色，综合实力强劲

综合能力（Mean-All）：在尺寸相近的开源模型（对比DeepSeek-Coder、DeepSeek-Math、Mistral-7B、SOLAR-10.7B 和 Gemma-7B）中表现优秀。

图 1：性能对比 - 综合能力

代码能力（Mean-Code）：性能稍弱于 DeepSeek-Coder-7B，超越了 Yi-34B、SOLAR-10.7B、Mistral-7B 和 Gemma-7B。

图 2：性能对比 - 代码能力

数学能力（Mean-Math）：性能稍弱于 DeepSeek-Math-7B，超越了 SOLAR-10.7B、Mistral-7B 和 Gemma-7B。

图 3：性能对比 - 数学能力

常识和推理能力（Mean-Text）：性能与 Mistral-7B、SOLAR-10.7B 和 Gemma-7B 不相上下。

图 4：性能对比 - 常识和推理能力

语言能力：相比于其他相近尺寸的模型，Yi-9B 不仅具备不错的英文能力，还拥有 Yi 系列模型广受好评的强大中文能力。

表 2：性能对比 - 详细数据

接下来下为大家带来Yi-9B魔搭社区推理、微调最佳实践教程，希望对感兴趣的开发者有帮助。

2. 环境配置与安装

本文使用的模型为Yi-9B模型，可在ModelScope的Notebook的环境（这里以PAI-DSW为例）的配置下运行（显存24G）。

2.1 环境配置与安装

本文主要演示的模型推理代码可在魔搭社区免费实例PAI-DSW的配置下运行（显存24G）：

点击模型右侧Notebook快速开发按钮，选择GPU环境

打开Terminal环境：

3. 模型链接和下载

Yi-9B现已在ModelScope社区开源，模型链接

社区支持直接下载模型的repo：

from modelscope import snapshot_download
model_dir = snapshot_download("01ai/Yi-9B")

4. Yi-9B模型推理

模型推理：

from modelscope import AutoModelForCausalLM, AutoTokenizer,snapshot_download
import torch
model_path = snapshot_download("01ai/Yi-9B")
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
# Since transformers 4.35.0, the GPT-Q/AWQ model can be loaded using AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16
).eval()
# Prompt content: "hi"
messages = [
    {"role": "user", "content": "浙江的省会是"}
]
input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt')
output_ids = model.generate(input_ids.to('cuda'))
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
# Model response: "Hello! How can I assist you today?"
print(response)

显存占用：

5. Yi-9模型微调和微调后推理

我们使用swift来对模型进行微调, swift是魔搭社区官方提供的LLM微调推理框架.

微调代码开源地址

我们使用ms-agent数据集进行训练, 并混合了通用数据集和自我认知训练, 将base模型训练成chat模型.

环境准备:

git clone https://github.com/modelscope/swift.git
cd swift
pip install .[llm]

微调脚本: LoRA + DDP

如果你要使用3090进行训练, 你可以使用mp+ddp或者zero3的技术

# https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/yi_9b/lora_mp_ddp
# Experimental environment: 4 * A100
# 4 * 30GB GPU memory
# Train a chat model with agent capabilities and self-cognition from the base.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=4 \
swift sft \
    --model_type yi-9b \
    --sft_type lora \
    --tuner_backend swift \
    --template_type default \
    --dtype AUTO \
    --output_dir output \
    --dataset ms-agent \
    --train_dataset_sample 20000 \
    --train_dataset_mix_ratio 2 \
    --num_train_epochs 3 \
    --max_length 4096 \
    --check_dataset_strategy warning \
    --lora_rank 8 \
    --lora_alpha 32 \
    --lora_dropout_p 0.05 \
    --lora_target_modules ALL \
    --lora_modules_to_save EMBEDDING LN \
    --gradient_checkpointing true \
    --batch_size 1 \
    --weight_decay 0.1 \
    --learning_rate 5e-5 \
    --gradient_accumulation_steps 16 \
    --max_grad_norm 0.5 \
    --warmup_ratio 0.03 \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 2 \
    --logging_steps 10 \
    --use_flash_attn false \
    --self_cognition_sample 2000 \
    --model_name 小黄 'Xiao Huang' \
    --model_author 魔搭 ModelScope \

训练过程也支持本地数据集，需要指定如下参数：

--custom_train_dataset_path xxx.jsonl \
--custom_val_dataset_path yyy.jsonl

自定义数据集的格式可以参考:链接

微调后推理脚本: （这里的ckpt_dir需要修改为训练生成的checkpoint文件夹）

# Experimental environment: 2 * 3090
CUDA_VISIBLE_DEVICES=0,1 \
swift infer \
    --ckpt_dir "output/yi-9b/vx-xxx/checkpoint-xxx" \
    --load_dataset_config true \
    --max_length 2048 \
    --use_flash_attn true \
    --max_new_tokens 2048 \
    --temperature 0.3 \
    --top_p 0.7 \
    --repetition_penalty 1. \
    --do_sample true \
    --merge_lora false

微调的可视化结果:

训练后生成样例:

<<< 你是谁？
我是小黄，一个基于大规模语言模型GPT（生成预训练变换器）的人工智能聊天机器人。
--------------------------------------------------
<<< 你是谁训练的？
我是由魔搭团队训练和开发的。
--------------------------------------------------
<<< 浙江的省会在哪？
浙江的省会是杭州。杭州是一座美丽的城市，拥有西湖、雷峰塔等著名景点，也是阿里巴巴等公司的总部所在地。
--------------------------------------------------
<<< 这有什么好吃的？
当然有！杭州有很多美食，比如西湖醋鱼、东坡肉、龙井虾仁等等。如果你有机会去杭州，一定要品尝一下这些美食。
--------------------------------------------------
<<< 4564+456=？
根据您的要求，4564+456=5020。
--------------------------------------------------
<<< multi-line
[INFO:swift] End multi-line input with `#`.
[INFO:swift] Input `single-line` to switch to single-line input mode.
<<<[M] reset-system
<<<[MS] Answer the following questions as best you can. You have access to the following APIs:
1. todo: Call this tool to interact with the todo API. What is the todo API useful for? 管理待办事项. Parameters: [{"name": "action", "description": "需要执行的动作，包括添加、查询、删除等", "required": "False"}, {"name": "content", "description": "待办事项的内容", "required": "False"}]
2. 中文分词器: Call this tool to interact with the 中文分词器 API. What is the 中文分词器 API useful for? 通过Python解释器执行中文分词. Parameters: [{"name": "text", "description": "需要进行中文分词的文本", "required": "False"}]
3. pinyin: Call this tool to interact with the pinyin API. What is the pinyin API useful for? 将汉字转换成拼音. Parameters: [{"name": "hanzi", "description": "需要转换的汉字", "required": "False"}]
4. invoice_generation: Call this tool to interact with the invoicegeneration API. What is the invoicegeneration API useful for? 根据提供的订单信息和发票模板，生成符合要求的发票. Parameters: [{"name": "order_info", "description": "订单信息，包括订单号、商品名称、数量、单价等", "required": "False"}, {"name": "template_id", "description": "发票模板ID，用于选择合适的发票模板", "required": "False"}, {"name": "company_info", "description": "公司信息，包括公司名称、地址、电话等", "required": "False"}]
5. translator: Call this tool to interact with the translator API. What is the translator API useful for? 将指定文本翻译为多种语言. Parameters: [{"name": "text", "description": "需要翻译的文本", "required": "False"}, {"name": "source", "description": "原始语言代码，可选参数，默认为auto", "required": "False"}, {"name": "target", "description": "目标语言代码，必选参数", "required": "False"}]
Use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of the above tools[todo, 中文分词器, pinyin, invoice_generation, translator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!#
<<<[M] 将 "Hello, World!" 翻译为德语#
Action: translator
Action Input: {'text': 'Hello, World!', 'target': 'de'}
Observation: {'translation': 'Hallo, Welt!'}
Thought: I now know the final answer
Final Answer: 根据API调用结果，将"Hello, World!"翻译为德语的结果为"Hallo, Welt!"。希望这个翻译结果能够帮助您更好地理解德语。

点击直达模型开源卡片：模型详情页 · 魔搭社区 (modelscope.cn)

零一万物开源Yi系列“理科状元”Yi-9B，消费级显卡可跑，魔搭社区最佳实践

1. 导读

2. 环境配置与安装

2.1 环境配置与安装

3. 模型链接和下载

4. Yi-9B模型推理

5. Yi-9模型微调和微调后推理

相关文章

相关解决方案

热门文章

最新文章

相关电子书