Tongyi Qianwen 7B Models Open-Sourced: ModelScope Best Practices


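Introduction

Tongyi Qianwen is now open source. Alibaba Cloud has open-sourced the 7-billion-parameter Tongyi Qianwen models: the general-purpose model Qwen-7B and the chat model Qwen-7B-Chat. Both are available on the ModelScope community and are open source, free, and licensed for commercial use.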


Model demo link: https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary


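Environment Configuration and Installation

The examples in this article run in the ModelScope Notebook environment (PAI-DSW is used as the example here). A single GPU is sufficient; 24 GB of GPU memory is required.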
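
As a rough sanity check on the 24 GB figure, the sketch below estimates how much memory the weights alone occupy at different precisions. It assumes exactly 7e9 parameters (the real checkpoint differs somewhat) and ignores activations, the KV cache, and framework overhead, so treat it as a back-of-the-envelope estimate rather than a measurement.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: exactly 7e9 parameters; the real checkpoint differs somewhat.
NUM_PARAMS = 7e9

for name, nbytes in [("fp32", 4), ("bf16/fp16", 2)]:
    print(f"{name}: {weight_memory_gib(NUM_PARAMS, nbytes):.1f} GiB")
```

Under these assumptions the half-precision weights take roughly 13 GiB, which leaves headroom on a 24 GB card for activations and generation state.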


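Connecting to the Server and Preparing the Environment

1. Go to the ModelScope home page (modelscope.cn) and open "My Notebook".

2. Select the GPU environment to enter the PAI-DSW online development environment.

3. Create a new Notebook.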



Model Links and Download

The Qwen series models are now open source in the ModelScope community, including:

Qwen-7B

Model link: https://modelscope.cn/models/qwen/Qwen-7B/summary

Qwen-7B-Chat

Model link: https://modelscope.cn/models/Qwen/Qwen-7b-chat/summary

The community supports downloading the model repo directly:

from modelscope.hub.snapshot_download import snapshot_download

model_dir = snapshot_download('Qwen/Qwen-7b-chat', 'v1.0.0')


Alternatively, the following code downloads the model and loads the model and tokenizer:

import torch
from torch import dtype as Dtype

# read_config, QwenConfig, QwenTokenizer, Model and logger are not defined
# in this excerpt; they come from the ModelScope / swift example utilities.


def get_model_tokenizer_Qwen(model_dir: str,
                             torch_dtype: Dtype,
                             load_model: bool = True):
    config = read_config(model_dir)
    logger.info(config)
    model_config = QwenConfig.from_pretrained(model_dir)
    model_config.torch_dtype = torch_dtype
    logger.info(model_config)
    tokenizer = QwenTokenizer.from_pretrained(model_dir)
    model = None
    if load_model:
        model = Model.from_pretrained(
            model_dir,
            cfg_dict=config,
            config=model_config,
            device_map='auto',
            torch_dtype=torch_dtype)
    return model, tokenizer


get_model_tokenizer_Qwen(model_dir, torch.bfloat16)


Trying the Model in a Studio

Qwen-7B-Chat Bot studio link: https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary

You are welcome to try out Qwen-7B-Chat in the studio 👏


Model Inference

Qwen-7B-Chat inference code:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model_id = 'Qwen/Qwen-7b-chat'
pipe = pipeline(
    task=Tasks.chat, model=model_id, device_map='auto')
history = None
system = 'You are a helpful assistant.'

# First turn: "What is the capital of Zhejiang?"
text = '浙江的省会在哪里?'
results = pipe(text, history=history, system=system)
response, history = results['response'], results['history']
print(f'Response: {response}')

# Second turn: "What fun places does it have?" (relies on the history above)
text = '它有什么好玩的地方呢?'
results = pipe(text, history=history, system=system)
response, history = results['response'], results['history']
print(f'Response: {response}')

# Recorded output (in Chinese): the first reply says the capital of Zhejiang
# is Hangzhou; the second describes Hangzhou's sights, such as West Lake.
"""
Response: 浙江的省会是杭州。
Response: 杭州是一座历史悠久、文化底蕴深厚的城市,拥有许多著名景点,如西湖、西溪湿地、灵隐寺、千岛湖等,其中西湖是杭州最著名的景点,被誉为“天下第一湖”。此外,杭州还有许多古迹、文化街区、美食和艺术空间等,值得一去。
"""

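The two-turn example threads `history` from one call into the next by hand. A minimal sketch of the same pattern as a loop is shown here. `chat_turns` and `fake_pipe` are hypothetical names introduced for illustration; `fake_pipe` merely imitates the call shape used above (`pipe(text, history=..., system=...)` returning `response` and `history`), and the list-of-tuples history it builds is an assumption, not the real pipeline's format.

```python
from typing import Any, Callable, Dict, List, Optional


def chat_turns(pipe: Callable[..., Dict[str, Any]],
               questions: List[str],
               system: str = 'You are a helpful assistant.') -> List[str]:
    """Ask the questions in order, feeding each turn's history into the next."""
    history: Optional[Any] = None  # None starts a fresh conversation
    responses = []
    for text in questions:
        results = pipe(text, history=history, system=system)
        history = results['history']
        responses.append(results['response'])
    return responses


def fake_pipe(text, history=None, system=None):
    """Stand-in for the real pipeline, for illustration only: echoes the input."""
    history = (history or []) + [(text, f'echo: {text}')]
    return {'response': f'echo: {text}', 'history': history}


print(chat_turns(fake_pipe, ['first question', 'follow-up question']))
```

With the real pipeline object from the inference code, the same helper could be called as `chat_turns(pipe, [...])`, provided the pipeline keeps the call signature shown in the article.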

Qwen-7B inference code:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model_id = 'Qwen/Qwen-7b'
pipeline_ins = pipeline(
    task=Tasks.text_generation, model=model_id, device_map='auto')
# Few-shot prompt: "The capital of Mongolia is Ulaanbaatar / The capital of
# Iceland is Reykjavik / The capital of Ethiopia is"
text = '蒙古国的首都是乌兰巴托(Ulaanbaatar)\n冰岛的首都是雷克雅未克(Reykjavik)\n埃塞俄比亚的首都是'
result = pipeline_ins(text)
print(result['text'])
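
The base model is not instruction-tuned, so the prompt above works by showing a pattern (two solved lines) and leaving the third line open for completion. A small sketch of building such a few-shot prompt programmatically follows; `build_capital_prompt` is a hypothetical helper, not part of ModelScope.

```python
from typing import List, Tuple


def build_capital_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Join solved examples and leave the last line open for the model."""
    lines = [f'{country}的首都是{capital}' for country, capital in examples]
    lines.append(f'{query}的首都是')
    return '\n'.join(lines)


prompt = build_capital_prompt(
    [('蒙古国', '乌兰巴托(Ulaanbaatar)'),
     ('冰岛', '雷克雅未克(Reykjavik)')],
    '埃塞俄比亚')
print(prompt)
```

The resulting string is identical to the `text` used in the inference code, so it can be passed to the pipeline unchanged.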


SFT Dataset Link and Download

Here we use finance_en, an open-source dataset on ModelScope containing 68,912 finance records, as the fine-tuning dataset:

from modelscope import MsDataset

finance_en = MsDataset.load(
    'wyj123456/finance_en', split='train').to_hf_dataset()
print(len(finance_en["instruction"]))
print(finance_en[0])
"""Out
68912
{'instruction': 'For a car, what scams can be plotted with 0% financing vs rebate?', 'input': None, 'output': "The car deal makes money 3 ways. If you pay in one lump payment. If the payment is greater than what they paid for the car, plus their expenses, they make a profit. They loan you the money. You make payments over months or years, if the total amount you pay is greater than what they paid for the car, plus their expenses, plus their finance expenses they make money. Of course the money takes years to come in, or they sell your loan to another business to get the money faster but in a smaller amount. You trade in a car and they sell it at a profit. Of course that new transaction could be a lump sum or a loan on the used car... They or course make money if you bring the car back for maintenance, or you buy lots of expensive dealer options. Some dealers wave two deals in front of you: get a 0% interest loan. These tend to be shorter 12 months vs 36,48,60 or even 72 months. The shorter length makes it harder for many to afford. If you can't swing the 12 large payments they offer you at x% loan for y years that keeps the payments in your budget. pay cash and get a rebate. If you take the rebate you can't get the 0% loan. If you take the 0% loan you can't get the rebate. The price you negotiate minus the rebate is enough to make a profit. The key is not letting them know which offer you are interested in. Don't even mention a trade in until the price of the new car has been finalized. Otherwise they will adjust the price, rebate, interest rate, length of loan,  and trade-in value to maximize their profit. The suggestion of running the numbers through a spreadsheet is a good one. If you get a loan for 2% from your bank/credit union for 3 years and the rebate from the dealer, it will cost less in total than the 0% loan from the dealer. The key is to get the loan approved by the bank/credit union before meeting with the dealer. The money from the bank looks like cash to the dealer.", 'text': None}
"""


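Model Training Best Practices

This section fine-tunes the Qwen-7B model using swift, ModelScope's open-source lightweight fine-tuning tool.

Source code: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/llm_sft.py

After cloning the swift repo, run the SFT script:

# Get the example code
git clone https://github.com/modelscope/swift.git
cd swift/examples/pytorch/llm
# sft
bash run_sft.sh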


The code in detail:

Importing the required libraries

# note: utils can be found at
# `https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/utils`
# it is recommended that you git clone the swift repo
import os
from dataclasses import dataclass, field
from functools import partial
from types import MethodType
from typing import List, Optional

import torch
from torch import Tensor
from utils import (DATASET_MAPPING, DEFAULT_PROMPT, MODEL_MAPPING,
                   get_dataset, get_model_tokenizer, plot_images,
                   process_dataset, select_dtype)

from swift import (HubStrategy, Seq2SeqTrainer,
                   Seq2SeqTrainingArguments,
                   get_logger)
from swift.utils import (add_version_to_work_dir, parse_args,
                         print_model_info, seed_everything,
                         show_freeze_layers)
from swift.utils.llm_utils import (data_collate_fn, print_example,
                                   stat_dataset, tokenize_function)

logger = get_logger()


Command-Line Argument Definitions

@dataclass
class SftArguments:
    model_type: str = field(
        default='Qwen-7b',
        metadata={'choices': list(MODEL_MAPPING.keys())})
    sft_type: str = field(
        default='lora', metadata={'choices': ['lora', 'full']})
    output_dir: Optional[str] = None

    seed: int = 42
    resume_from_ckpt: Optional[str] = None
    dtype: Optional[str] = field(
        default=None, metadata={'choices': {'bf16', 'fp16', 'fp32'}})
    ignore_args_error: bool = False  # True: notebook compatibility

    dataset: str = field(
        default='finance-en',
        metadata={'help': f'dataset choices: {list(DATASET_MAPPING.keys())}'})
    dataset_seed: int = 42
    dataset_sample: Optional[int] = None
    dataset_test_size: float = 0.01
    prompt: str = DEFAULT_PROMPT
    max_length: Optional[int] = 2048

    lora_target_modules: Optional[List[str]] = None
    lora_rank: int = 8
    lora_alpha: int = 32
    lora_dropout_p: float = 0.1

    gradient_checkpoint: bool = True
    batch_size: int = 1
    num_train_epochs: int = 1
    optim: str = 'adamw_torch'
    learning_rate: Optional[float] = None
    weight_decay: float = 0.01
    gradient_accumulation_steps: int = 16
    max_grad_norm: float = 1.
    lr_scheduler_type: str = 'cosine'
    warmup_ratio: float = 0.1

    eval_steps: int = 50
    save_steps: Optional[int] = None
    save_total_limit: int = 2
    logging_steps: int = 5

    push_to_hub: bool = False
    # 'user_name/repo_name' or 'repo_name'
    hub_model_id: Optional[str] = None
    hub_private_repo: bool = True
    hub_strategy: HubStrategy = HubStrategy.EVERY_SAVE
    # None: use env var `MODELSCOPE_API_TOKEN`
    hub_token: Optional[str] = None

    def __post_init__(self):
        # Fill in defaults that depend on the fine-tuning type.
        if self.sft_type == 'lora':
            if self.learning_rate is None:
                self.learning_rate = 1e-4
            if self.save_steps is None:
                self.save_steps = self.eval_steps
        elif self.sft_type == 'full':
            if self.learning_rate is None:
                self.learning_rate = 1e-5
            if self.save_steps is None:
                # Saving the model takes a long time
                self.save_steps = self.eval_steps * 4
        else:
            raise ValueError(f'sft_type: {self.sft_type}')

        if self.output_dir is None:
            self.output_dir = 'runs'
        self.output_dir = os.path.join(self.output_dir, self.model_type)

        if self.lora_target_modules is None:
            self.lora_target_modules = MODEL_MAPPING[
                self.model_type]['lora_TM']
        self.torch_dtype, self.fp16, self.bf16 = select_dtype(
            self.dtype, self.model_type)
        if self.hub_model_id is None:
            self.hub_model_id = f'{self.model_type}-sft'
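
The `__post_init__` logic of `SftArguments` fills in `learning_rate` and `save_steps` only when they are left as `None`, with different defaults for LoRA and full fine-tuning. The stripped-down sketch below isolates just that behavior so it can be run without the swift repo; `MiniSftArguments` is a hypothetical class written for illustration, not the script's own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniSftArguments:
    """Hypothetical, stripped-down mirror of SftArguments (defaults only)."""
    sft_type: str = 'lora'
    learning_rate: Optional[float] = None
    eval_steps: int = 50
    save_steps: Optional[int] = None

    def __post_init__(self):
        if self.sft_type == 'lora':
            default_lr, save_factor = 1e-4, 1
        elif self.sft_type == 'full':
            # Full checkpoints are slow to save, so save 4x less often.
            default_lr, save_factor = 1e-5, 4
        else:
            raise ValueError(f'sft_type: {self.sft_type}')
        # Explicit values win; None means "use the type-specific default".
        if self.learning_rate is None:
            self.learning_rate = default_lr
        if self.save_steps is None:
            self.save_steps = self.eval_steps * save_factor


print(MiniSftArguments())
print(MiniSftArguments(sft_type='full'))
print(MiniSftArguments(sft_type='full', learning_rate=3e-5))
```

Using `None` as the "not set" marker is what lets a user-supplied learning rate survive while the default still adapts to `sft_type`.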


Loading the Model

seed_everything(args.seed)

# ### Load Model and Tokenizer
model, tokenizer = get_model_tokenizer(
    args.model_type, torch_dtype=args.torch_dtype)

if args.gradient_checkpoint:
    model.gradient_checkpointing_enable()
    model.enable_input_require_grads()


Preparing LoRA

# ### Prepare lora
if args.sft_type == 'lora':
    from swift import LoRAConfig, Swift
    if args.resume_from_ckpt is None:
        # Fresh run: wrap the base model with new LoRA adapters.
        lora_config = LoRAConfig(
            r=args.lora_rank,
            target_modules=args.lora_target_modules,
            lora_alpha=args.lora_alpha,
            lora_dropout=args.lora_dropout_p)
        logger.info(f'lora_config: {lora_config}')
        model = Swift.prepare_model(model, lora_config)
    else:
        # Resume: load previously trained adapters and keep them trainable.
        model = Swift.from_pretrained(
            model, args.resume_from_ckpt, is_trainable=True)

show_freeze_layers(model)
print_model_info(model)
# check the device and dtype of the model
_p: Tensor = list(model.parameters())[-1]
logger.info(f'device: {_p.device}, dtype: {_p.dtype}')
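
To see why LoRA keeps memory low, it helps to count what it adds. In the standard LoRA formulation, a frozen weight of shape d_out x d_in is augmented with two small matrices, A (rank x d_in) and B (d_out x rank), and their product is scaled by lora_alpha / rank. The sketch below does that arithmetic with the script's defaults (rank 8, alpha 32). The 4096 -> 12288 layer shape is an illustrative assumption, not read from the Qwen-7B checkpoint, and the helper names are hypothetical.

```python
def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_scaling(lora_alpha: int, rank: int) -> float:
    """Factor applied to the low-rank update before it is added to the output."""
    return lora_alpha / rank


# Assumption: an illustrative 4096 -> 12288 projection, not read from the checkpoint.
d_in, d_out = 4096, 12288
rank, lora_alpha = 8, 32  # defaults from SftArguments

full = d_in * d_out
extra = lora_extra_params(d_in, d_out, rank)
print(f'frozen weights: {full:,}')
print(f'LoRA weights:   {extra:,} ({extra / full:.2%} of the layer)')
print(f'scaling:        {lora_scaling(lora_alpha, rank)}')
```

Only the small A and B matrices receive gradients and optimizer state, which is the main reason the adapter approach fits on a single GPU.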


Loading the Dataset

# ### Load Dataset
dataset = get_dataset(args.dataset.split(','))
train_dataset, val_dataset = process_dataset(dataset,
                                             args.dataset_test_size,
                                             args.dataset_sample,
                                             args.dataset_seed)
tokenize_func = partial(
    tokenize_function,
    tokenizer=tokenizer,
    prompt=args.prompt,
    max_length=args.max_length)
train_dataset = train_dataset.map(tokenize_func)
val_dataset = val_dataset.map(tokenize_func)
del dataset
# Data analysis
stat_dataset(train_dataset)
stat_dataset(val_dataset)

data_collator = partial(data_collate_fn, tokenizer=tokenizer)
print_example(train_dataset[0], tokenizer)
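
`process_dataset` receives a test-size fraction and a seed, which suggests a reproducible train/validation split. Its implementation is not shown in this article, so the following is only a sketch of what such a split typically looks like; `split_indices` is a hypothetical helper, and the rounding rule in particular is an assumption.

```python
import random
from typing import List, Tuple


def split_indices(n: int, test_size: float,
                  seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle range(n) reproducibly and hold out a fraction for validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # private RNG: no global side effects
    n_val = max(1, round(n * test_size))  # rounding rule is an assumption
    return indices[n_val:], indices[:n_val]


# 68912 records with dataset_test_size=0.01 and dataset_seed=42
train_idx, val_idx = split_indices(68912, test_size=0.01, seed=42)
print(len(train_idx), len(val_idx))
```

Fixing the seed means the same records land in the validation set on every run, which keeps evaluation losses comparable across experiments.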


Configuring the Trainer

# ### Setting trainer_args
output_dir = add_version_to_work_dir(args.output_dir)
trainer_args = Seq2SeqTrainingArguments(
    output_dir=output_dir,
    do_train=True,
    do_eval=True,
    evaluation_strategy='steps',
    per_device_train_batch_size=args.batch_size,
    per_device_eval_batch_size=args.batch_size,
    gradient_accumulation_steps=args.gradient_accumulation_steps,
    learning_rate=args.learning_rate,
    weight_decay=args.weight_decay,
    max_grad_norm=args.max_grad_norm,
    num_train_epochs=args.num_train_epochs,
    lr_scheduler_type=args.lr_scheduler_type,
    warmup_ratio=args.warmup_ratio,
    logging_steps=args.logging_steps,
    save_strategy='steps',
    save_steps=args.save_steps,
    save_total_limit=args.save_total_limit,
    bf16=args.bf16,
    fp16=args.fp16,
    eval_steps=args.eval_steps,
    dataloader_num_workers=1,
    load_best_model_at_end=True,
    metric_for_best_model='loss',
    greater_is_better=False,
    sortish_sampler=True,
    optim=args.optim,
    hub_model_id=args.hub_model_id,
    hub_private_repo=args.hub_private_repo,
    hub_strategy=args.hub_strategy,
    hub_token=args.hub_token,
    push_to_hub=args.push_to_hub,
    resume_from_checkpoint=args.resume_from_ckpt)
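
The defaults `lr_scheduler_type='cosine'` and `warmup_ratio=0.1` describe a learning rate that ramps up linearly over the first 10% of steps and then decays along a cosine curve. The sketch below reproduces that commonly used shape in plain Python. Whether the trainer computes exactly these values is an assumption, and the 1000-step horizon is chosen only for illustration.

```python
import math


def cosine_lr_with_warmup(step: int, total_steps: int, base_lr: float,
                          warmup_ratio: float) -> float:
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: 1000 optimizer steps in total, chosen only for illustration.
for step in (0, 50, 100, 550, 1000):
    lr = cosine_lr_with_warmup(step, 1000, base_lr=1e-4, warmup_ratio=0.1)
    print(f'step {step:4d}: lr = {lr:.2e}')
```

The peak equals the LoRA default learning rate of 1e-4; the warmup keeps early updates small while the freshly initialized adapters settle.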


Starting Fine-Tuning:

# ### Finetuning
trainer = Seq2SeqTrainer(
    model=model,
    args=trainer_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
)
trainer.train()
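
Before launching a run it can be useful to estimate how many optimizer steps and evaluations to expect. With `batch_size=1` and `gradient_accumulation_steps=16`, each optimizer step sees 16 examples. The sketch below does this arithmetic for a single GPU; `training_plan` is a hypothetical helper, the training-set size assumes about 99% of the 68912 records are used for training, and real trainers may round the last partial batch differently.

```python
def training_plan(num_examples: int, batch_size: int, grad_accum: int,
                  eval_steps: int, num_epochs: int = 1) -> dict:
    """Rough step counts for single-GPU training; trainers may round differently."""
    effective_batch = batch_size * grad_accum
    optimizer_steps = (num_examples // effective_batch) * num_epochs
    return {
        'effective_batch': effective_batch,
        'optimizer_steps': optimizer_steps,
        'evaluations': optimizer_steps // eval_steps,
    }


# Assumption: about 99% of the 68912 records end up in the training split.
print(training_plan(num_examples=68223, batch_size=1, grad_accum=16,
                    eval_steps=50))
```

Gradient accumulation trades wall-clock time for memory: the model only ever holds one example's activations, yet the update behaves like a batch of 16.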


Visualization:

TensorBoard command (e.g.):

tensorboard --logdir runs/Qwen-7b/v0-20230802-170622/runs --port 6006

# ### Visualization
images_dir = os.path.join(output_dir, 'images')
tb_dir = os.path.join(output_dir, 'runs')
folder_name = os.listdir(tb_dir)[0]
tb_dir = os.path.join(tb_dir, folder_name)
plot_images(images_dir, tb_dir, ['train/loss'], 0.9)
if args.push_to_hub:
    trainer._add_patterns_to_gitignores(['images/'])
    trainer.push_to_hub()
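
The last argument passed to `plot_images` (0.9) appears to be a smoothing weight for the `train/loss` curve, similar to the smoothing slider in TensorBoard. That reading is an assumption, and the helper's exact formula is not shown here. A minimal sketch of exponential-moving-average smoothing, the usual technique for this, is given below with a hypothetical `ema_smooth` function and made-up loss values.

```python
from typing import List


def ema_smooth(values: List[float], weight: float = 0.9) -> List[float]:
    """Blend each point with the previous smoothed value (exponential moving average)."""
    smoothed: List[float] = []
    last = values[0] if values else 0.0  # start from the first observation
    for v in values:
        last = weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed


raw_loss = [2.0, 1.0, 3.0, 1.0, 2.0]  # made-up values for illustration
print([round(x, 3) for x in ema_smooth(raw_loss, 0.9)])
```

A weight close to 1 gives a smoother but more delayed curve; with batch_size=1 the raw per-step loss is noisy, so heavy smoothing makes the trend easier to read.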



Resource Consumption

Fine-tuning Qwen-7B with LoRA uses about 21 GB of GPU memory (batch_size=1, max_length=2048).



Inference with the Fine-Tuned Model

Code link: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/llm_infer.py

Run the infer script:

# infer
bash run_infer.sh


The code in detail:

# note: utils can be found at
# `https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/utils`
# it is recommended that you git clone the swift repo

# ### Setting up experimental environment.
import os
from dataclasses import dataclass, field
from functools import partial
from typing import List, Optional

import torch
from transformers import GenerationConfig, TextStreamer
from utils import (DATASET_MAPPER, DEFAULT_PROMPT, MODEL_MAPPER, get_dataset,
                   get_model_tokenizer, inference, parse_args, process_dataset,
                   tokenize_function)

from modelscope import get_logger
from modelscope.swift import LoRAConfig, Swift

logger = get_logger()


@dataclass
class InferArguments:
    model_type: str = field(
        default='Qwen-7b', metadata={'choices': list(MODEL_MAPPER.keys())})
    sft_type: str = field(
        default='lora', metadata={'choices': ['lora', 'full']})
    ckpt_path: str = '/path/to/your/iter_xxx.pth'
    eval_human: bool = False  # False: eval test_dataset
    ignore_args_error: bool = False  # True: notebook compatibility

    dataset: str = field(
        default='finance_en',
        metadata={'help': f'dataset choices: {list(DATASET_MAPPER.keys())}'})
    dataset_seed: int = 42
    dataset_sample: Optional[int] = None
    dataset_test_size: float = 0.01
    prompt: str = DEFAULT_PROMPT
    max_length: Optional[int] = 2048

    lora_target_modules: Optional[List[str]] = None
    lora_rank: int = 8
    lora_alpha: int = 32
    lora_dropout_p: float = 0.1

    max_new_tokens: int = 512
    temperature: float = 0.9
    top_k: int = 50
    top_p: float = 0.9

    def __post_init__(self):
        if self.lora_target_modules is None:
            self.lora_target_modules = MODEL_MAPPER[self.model_type]['lora_TM']

        if not os.path.isfile(self.ckpt_path):
            raise ValueError(
                f'Please enter a valid ckpt_path: {self.ckpt_path}')


def llm_infer(args: InferArguments) -> None:
    # ### Loading Model and Tokenizer
    support_bf16 = torch.cuda.is_bf16_supported()
    if not support_bf16:
        logger.warning(f'support_bf16: {support_bf16}')

    model, tokenizer, _ = get_model_tokenizer(
        args.model_type, torch_dtype=torch.bfloat16)

    # ### Preparing lora
    if args.sft_type == 'lora':
        # Rebuild the adapter layout and load the trained LoRA weights.
        lora_config = LoRAConfig(
            replace_modules=args.lora_target_modules,
            rank=args.lora_rank,
            lora_alpha=args.lora_alpha,
            lora_dropout=args.lora_dropout_p,
            pretrained_weights=args.ckpt_path)
        logger.info(f'lora_config: {lora_config}')
        model = Swift.prepare_model(model, lora_config)
    elif args.sft_type == 'full':
        state_dict = torch.load(args.ckpt_path, map_location='cpu')
        model.load_state_dict(state_dict)
    else:
        raise ValueError(f'args.sft_type: {args.sft_type}')

    # ### Inference
    tokenize_func = partial(
        tokenize_function,
        tokenizer=tokenizer,
        prompt=args.prompt,
        max_length=args.max_length)
    streamer = TextStreamer(
        tokenizer, skip_prompt=True, skip_special_tokens=True)
    generation_config = GenerationConfig(
        max_new_tokens=args.max_new_tokens,
        temperature=args.temperature,
        top_k=args.top_k,
        top_p=args.top_p,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id)
    logger.info(f'generation_config: {generation_config}')

    if args.eval_human:
        # Interactive mode: read instructions from stdin until interrupted.
        while True:
            instruction = input('<<< ')
            data = {'instruction': instruction}
            input_ids = tokenize_func(data)['input_ids']
            inference(input_ids, model, tokenizer, streamer, generation_config)
            print('-' * 80)
    else:
        # Batch mode: generate for the first 10 held-out examples and
        # print the reference answer after each generation.
        dataset = get_dataset(args.dataset.split(','))
        _, test_dataset = process_dataset(dataset, args.dataset_test_size,
                                          args.dataset_sample,
                                          args.dataset_seed)
        mini_test_dataset = test_dataset.select(range(10))
        del dataset
        for data in mini_test_dataset:
            output = data['output']
            data['output'] = None
            input_ids = tokenize_func(data)['input_ids']
            inference(input_ids, model, tokenizer, streamer, generation_config)
            print()
            print(f'[LABELS]{output}')
            print('-' * 80)
            # input('next[ENTER]')


if __name__ == '__main__':
    args, remaining_argv = parse_args(InferArguments)
    if len(remaining_argv) > 0:
        if args.ignore_args_error:
            logger.warning(f'remaining_argv: {remaining_argv}')
        else:
            raise ValueError(f'remaining_argv: {remaining_argv}')
    llm_infer(args)
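
The generation settings above (`temperature=0.9`, `top_k=50`, `top_p=0.9`, `do_sample=True`) control how the next token is sampled. As a rough illustration of what those three knobs do, the sketch below applies temperature scaling, then top-k, then top-p (nucleus) filtering to a toy distribution in pure Python. The function name, the token names, and the logit values are all made up for illustration; the library's actual implementation works on tensors and may differ in detail.

```python
import math
from typing import Dict


def filter_top_k_top_p(logits: Dict[str, float], temperature: float,
                       top_k: int, top_p: float) -> Dict[str, float]:
    """Return renormalized probabilities over tokens that survive both filters."""
    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    mass_k = sum(p for _, p in ranked)

    # Top-p: keep the smallest prefix whose renormalized mass reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p / mass_k
        if cumulative >= top_p:
            break
    mass = sum(p for _, p in kept)
    return {tok: p / mass for tok, p in kept}


# Made-up logits for four candidate tokens.
toy_logits = {'Hangzhou': 5.0, 'Ningbo': 3.0, 'Wenzhou': 2.0, 'Shanghai': 0.5}
print(filter_top_k_top_p(toy_logits, temperature=0.9, top_k=50, top_p=0.9))
```

Sampling then draws from the surviving tokens in proportion to these probabilities, so unlikely tokens in the tail are never chosen while some variety among the plausible ones remains.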


Model demo link: https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary
