Tongyi Qianwen's 7B Model Goes Open Source: ModelScope Best Practices


Overview

Tongyi Qianwen is open source! Alibaba Cloud has open-sourced its 7-billion-parameter Tongyi Qianwen models: the base model Qwen-7B and the chat model Qwen-7B-Chat. Both are now live on the ModelScope community: open source, free, and licensed for commercial use. Everyone is welcome to try them out.


Model demo link: https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary


Environment Setup and Installation

This walkthrough runs in the ModelScope Notebook environment (using PAI-DSW as an example). A single GPU is sufficient, with roughly 24 GB of GPU memory required; a quick environment check is shown after the setup steps below.


Connecting to the Server and Preparing the Environment

1. Go to the ModelScope homepage (modelscope.cn) and open "My Notebook".


2. Select a GPU environment and enter the PAI-DSW online development environment.


3. Create a new notebook.
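
Before downloading any models, it is worth confirming that the notebook instance actually meets the single-GPU, 24 GB requirement. A minimal check, assuming torch and modelscope are preinstalled in the PAI-DSW image, looks like this:

import torch
import modelscope

print(modelscope.__version__)
print(torch.cuda.is_available())
# total memory of GPU 0 in GiB; should be >= 24 for Qwen-7B
print(torch.cuda.get_device_properties(0).total_memory / 1024**3)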



Model Links and Downloads

The Qwen model series is now open source on the ModelScope community, including:


Qwen-7B

Model link: https://modelscope.cn/models/qwen/Qwen-7B/summary


Qwen-7B-Chat

Model link: https://modelscope.cn/models/Qwen/Qwen-7b-chat/summary


The community supports downloading the model repo directly:

from modelscope.hub.snapshot_download import snapshot_download

model_dir = snapshot_download('Qwen/Qwen-7b-chat', 'v1.0.0')
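
snapshot_download also accepts a cache_dir argument if you want the weights stored somewhere other than the default cache (the path below is illustrative):

# pin the download location explicitly; the path is illustrative
model_dir = snapshot_download('Qwen/Qwen-7b-chat', 'v1.0.0',
                              cache_dir='/mnt/workspace/models')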


Alternatively, use the following code to download the model and load the model and tokenizer:

def get_model_tokenizer_Qwen(model_dir: str,
                             torch_dtype: Dtype,
                             load_model: bool = True):
    config = read_config(model_dir)
    logger.info(config)
    model_config = QwenConfig.from_pretrained(model_dir)
    model_config.torch_dtype = torch_dtype
    logger.info(model_config)
    tokenizer = QwenTokenizer.from_pretrained(model_dir)
    model = None
    if load_model:
        model = Model.from_pretrained(
            model_dir,
            cfg_dict=config,
            config=model_config,
            device_map='auto',
            torch_dtype=torch_dtype)
    return model, tokenizer

get_model_tokenizer_Qwen(model_dir, torch.bfloat16)
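
The helper above assumes several names are already in scope. A minimal set of imports might look like the following sketch; read_config and Model come from modelscope, while QwenConfig and QwenTokenizer are assumed to be provided by the model repo's bundled modeling code:

import torch
from modelscope.models import Model
from modelscope.utils.hub import read_config
from modelscope.utils.logger import get_logger

logger = get_logger()
# QwenConfig / QwenTokenizer are assumed to ship with the
# modeling code downloaded alongside the weights from ModelScope.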


Trying It in the Studio

Qwen-7B-Chat Bot Studio link: https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary


Come try out Qwen-7B-Chat in the Studio 👏~


Model Inference

Inference code for Qwen-7B-Chat (the example asks, in Chinese, for the capital of Zhejiang and then a follow-up question about sights there):

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model_id = 'Qwen/Qwen-7b-chat'
pipe = pipeline(
    task=Tasks.chat, model=model_id, device_map='auto')

history = None
system = 'You are a helpful assistant.'
text = '浙江的省会在哪里?'
results = pipe(text, history=history, system=system)
response, history = results['response'], results['history']
print(f'Response: {response}')

text = '它有什么好玩的地方呢?'
results = pipe(text, history=history, system=system)
response, history = results['response'], results['history']
print(f'Response: {response}')
"""
Response: 浙江的省会是杭州。
Response: 杭州是一座历史悠久、文化底蕴深厚的城市,拥有许多著名景点,如西湖、西溪湿地、灵隐寺、千岛湖等,其中西湖是杭州最著名的景点,被誉为“天下第一湖”。此外,杭州还有许多古迹、文化街区、美食和艺术空间等,值得一去。
"""


Inference code for Qwen-7B (a Chinese few-shot prompt that completes "The capital of Ethiopia is"):

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model_id = 'Qwen/Qwen-7b'
pipeline_ins = pipeline(
    task=Tasks.text_generation, model=model_id, device_map='auto')
text = '蒙古国的首都是乌兰巴托(Ulaanbaatar)\n冰岛的首都是雷克雅未克(Reykjavik)\n埃塞俄比亚的首都是'
result = pipeline_ins(text)
print(result['text'])


SFT Dataset Link and Download

Here we use finance_en, an open-source dataset on ModelScope containing 68,912 finance instruction-following examples, as the fine-tuning dataset:

from modelscope import MsDataset

finance_en = MsDataset.load(
    'wyj123456/finance_en', split='train').to_hf_dataset()
print(len(finance_en["instruction"]))
print(finance_en[0])
"""Out
68912
{'instruction': 'For a car, what scams can be plotted with 0% financing vs rebate?', 'input': None, 'output': "The car deal makes money 3 ways. If you pay in one lump payment. If the payment is greater than what they paid for the car, plus their expenses, they make a profit. They loan you the money. You make payments over months or years, if the total amount you pay is greater than what they paid for the car, plus their expenses, plus their finance expenses they make money. Of course the money takes years to come in, or they sell your loan to another business to get the money faster but in a smaller amount. You trade in a car and they sell it at a profit. Of course that new transaction could be a lump sum or a loan on the used car... They or course make money if you bring the car back for maintenance, or you buy lots of expensive dealer options. Some dealers wave two deals in front of you: get a 0% interest loan. These tend to be shorter 12 months vs 36,48,60 or even 72 months. The shorter length makes it harder for many to afford. If you can't swing the 12 large payments they offer you at x% loan for y years that keeps the payments in your budget. pay cash and get a rebate. If you take the rebate you can't get the 0% loan. If you take the 0% loan you can't get the rebate. The price you negotiate minus the rebate is enough to make a profit. The key is not letting them know which offer you are interested in. Don't even mention a trade in until the price of the new car has been finalized. Otherwise they will adjust the price, rebate, interest rate, length of loan,  and trade-in value to maximize their profit. The suggestion of running the numbers through a spreadsheet is a good one. If you get a loan for 2% from your bank/credit union for 3 years and the rebate from the dealer, it will cost less in total than the 0% loan from the dealer. The key is to get the loan approved by the bank/credit union before meeting with the dealer. The money from the bank looks like cash to the dealer.", 'text': None}
"""


Best Practices for Model Training

Fine-tune the Qwen-7B model. This is done with SWIFT, ModelScope's open-source lightweight fine-tuning toolkit.


Source code: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/llm_sft.py


After cloning the swift repository, run the SFT script:

# get the example code
git clone https://github.com/modelscope/swift.git
cd swift/examples/pytorch/llm
# sft
bash run_sft.sh
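
run_sft.sh wraps llm_sft.py. If you prefer to call the script directly, a hypothetical invocation (assuming flag names mirror the SftArguments fields shown below) might be:

# hypothetical direct invocation; flags mirror the SftArguments fields
CUDA_VISIBLE_DEVICES=0 \
python llm_sft.py \
    --model_type Qwen-7b \
    --sft_type lora \
    --dataset finance-en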


The code in detail:

Import the required libraries

# note: utils can be found at
# `https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/utils`
# it is recommended that you git clone the swift repo
import os
from dataclasses import dataclass, field
from functools import partial
from types import MethodType
from typing import List, Optional

import torch
from torch import Tensor

from utils import (DATASET_MAPPING, DEFAULT_PROMPT, MODEL_MAPPING,
                   get_dataset, get_model_tokenizer, plot_images,
                   process_dataset, select_dtype)
from swift import (HubStrategy, Seq2SeqTrainer,
                   Seq2SeqTrainingArguments,
                   get_logger)
from swift.utils import (add_version_to_work_dir, parse_args,
                         print_model_info, seed_everything,
                         show_freeze_layers)
from swift.utils.llm_utils import (data_collate_fn, print_example,
                                   stat_dataset, tokenize_function)

logger = get_logger()


Define the command-line arguments

@dataclass
class SftArguments:
    model_type: str = field(
        default='Qwen-7b',
        metadata={'choices': list(MODEL_MAPPING.keys())})
    sft_type: str = field(
        default='lora', metadata={'choices': ['lora', 'full']})
    output_dir: Optional[str] = None
    seed: int = 42
    resume_from_ckpt: Optional[str] = None
    dtype: Optional[str] = field(
        default=None, metadata={'choices': {'bf16', 'fp16', 'fp32'}})
    ignore_args_error: bool = False  # True: notebook compatibility

    dataset: str = field(
        default='finance-en',
        metadata={'help': f'dataset choices: {list(DATASET_MAPPING.keys())}'})
    dataset_seed: int = 42
    dataset_sample: Optional[int] = None
    dataset_test_size: float = 0.01
    prompt: str = DEFAULT_PROMPT
    max_length: Optional[int] = 2048

    lora_target_modules: Optional[List[str]] = None
    lora_rank: int = 8
    lora_alpha: int = 32
    lora_dropout_p: float = 0.1

    gradient_checkpoint: bool = True
    batch_size: int = 1
    num_train_epochs: int = 1
    optim: str = 'adamw_torch'
    learning_rate: Optional[float] = None
    weight_decay: float = 0.01
    gradient_accumulation_steps: int = 16
    max_grad_norm: float = 1.
    lr_scheduler_type: str = 'cosine'
    warmup_ratio: float = 0.1

    eval_steps: int = 50
    save_steps: Optional[int] = None
    save_total_limit: int = 2
    logging_steps: int = 5

    push_to_hub: bool = False
    # 'user_name/repo_name' or 'repo_name'
    hub_model_id: Optional[str] = None
    hub_private_repo: bool = True
    hub_strategy: HubStrategy = HubStrategy.EVERY_SAVE
    # None: use env var `MODELSCOPE_API_TOKEN`
    hub_token: Optional[str] = None

    def __post_init__(self):
        if self.sft_type == 'lora':
            if self.learning_rate is None:
                self.learning_rate = 1e-4
            if self.save_steps is None:
                self.save_steps = self.eval_steps
        elif self.sft_type == 'full':
            if self.learning_rate is None:
                self.learning_rate = 1e-5
            if self.save_steps is None:
                # Saving the model takes a long time
                self.save_steps = self.eval_steps * 4
        else:
            raise ValueError(f'sft_type: {self.sft_type}')

        if self.output_dir is None:
            self.output_dir = 'runs'
        self.output_dir = os.path.join(self.output_dir, self.model_type)

        if self.lora_target_modules is None:
            self.lora_target_modules = MODEL_MAPPING[
                self.model_type]['lora_TM']
        self.torch_dtype, self.fp16, self.bf16 = select_dtype(
            self.dtype, self.model_type)
        if self.hub_model_id is None:
            self.hub_model_id = f'{self.model_type}-sft'
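
Since the arguments are a plain dataclass, they can also be constructed directly in a notebook; note how __post_init__ fills in the LoRA defaults:

# construct the arguments programmatically instead of via the CLI
args = SftArguments(sft_type='lora', dataset='finance-en')
print(args.learning_rate)  # 1e-4, chosen in __post_init__ for LoRA
print(args.save_steps)     # equals eval_steps (50) for LoRA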


Load the model

# `args` is the SftArguments instance produced by parse_args(SftArguments)
seed_everything(args.seed)

# ### Load Model and Tokenizer
model, tokenizer = get_model_tokenizer(args.model_type,
                                       torch_dtype=args.torch_dtype)
if args.gradient_checkpoint:
    # trade compute for memory, and make inputs carry gradients so that
    # checkpointing works together with LoRA
    model.gradient_checkpointing_enable()
    model.enable_input_require_grads()


Prepare LoRA

# ### Prepare lora
if args.sft_type == 'lora':
    from swift import LoRAConfig, Swift
    if args.resume_from_ckpt is None:
        lora_config = LoRAConfig(
            r=args.lora_rank,
            target_modules=args.lora_target_modules,
            lora_alpha=args.lora_alpha,
            lora_dropout=args.lora_dropout_p)
        logger.info(f'lora_config: {lora_config}')
        model = Swift.prepare_model(model, lora_config)
    else:
        model = Swift.from_pretrained(
            model, args.resume_from_ckpt, is_trainable=True)

show_freeze_layers(model)
print_model_info(model)
# check the device and dtype of the model
_p: Tensor = list(model.parameters())[-1]
logger.info(f'device: {_p.device}, dtype: {_p.dtype}')
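
A useful sanity check at this point is how small the trainable fraction actually is under LoRA; a short sketch using plain PyTorch:

# count trainable vs. total parameters after LoRA wrapping
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f'trainable: {trainable:,} / {total:,} '
      f'({100 * trainable / total:.4f}%)')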


Load the dataset

# ### Load Dataset
dataset = get_dataset(args.dataset.split(','))
train_dataset, val_dataset = process_dataset(dataset,
                                             args.dataset_test_size,
                                             args.dataset_sample,
                                             args.dataset_seed)
tokenize_func = partial(
    tokenize_function,
    tokenizer=tokenizer,
    prompt=args.prompt,
    max_length=args.max_length)
train_dataset = train_dataset.map(tokenize_func)
val_dataset = val_dataset.map(tokenize_func)
del dataset

# Data analysis
stat_dataset(train_dataset)
stat_dataset(val_dataset)
data_collator = partial(data_collate_fn, tokenizer=tokenizer)
print_example(train_dataset[0], tokenizer)


Configure the training arguments

# ### Setting trainer_args
output_dir = add_version_to_work_dir(args.output_dir)
trainer_args = Seq2SeqTrainingArguments(
    output_dir=output_dir,
    do_train=True,
    do_eval=True,
    evaluation_strategy='steps',
    per_device_train_batch_size=args.batch_size,
    per_device_eval_batch_size=args.batch_size,
    gradient_accumulation_steps=args.gradient_accumulation_steps,
    learning_rate=args.learning_rate,
    weight_decay=args.weight_decay,
    max_grad_norm=args.max_grad_norm,
    num_train_epochs=args.num_train_epochs,
    lr_scheduler_type=args.lr_scheduler_type,
    warmup_ratio=args.warmup_ratio,
    logging_steps=args.logging_steps,
    save_strategy='steps',
    save_steps=args.save_steps,
    save_total_limit=args.save_total_limit,
    bf16=args.bf16,
    fp16=args.fp16,
    eval_steps=args.eval_steps,
    dataloader_num_workers=1,
    load_best_model_at_end=True,
    metric_for_best_model='loss',
    greater_is_better=False,
    sortish_sampler=True,
    optim=args.optim,
    hub_model_id=args.hub_model_id,
    hub_private_repo=args.hub_private_repo,
    hub_strategy=args.hub_strategy,
    hub_token=args.hub_token,
    push_to_hub=args.push_to_hub,
    resume_from_checkpoint=args.resume_from_ckpt)
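
Note that with the defaults above (batch_size=1, gradient_accumulation_steps=16), the effective batch size on a single GPU is 16. And since load_best_model_at_end is set with metric_for_best_model='loss', the checkpoint with the lowest eval loss is restored when training finishes.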


Start fine-tuning:

# ### Finetuning
trainer = Seq2SeqTrainer(
    model=model,
    args=trainer_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
)
trainer.train()


Visualization:

TensorBoard command (example):

tensorboard --logdir runs/Qwen-7b/v0-20230802-170622/runs --port 6006

# ### Visualization
images_dir = os.path.join(output_dir, 'images')
tb_dir = os.path.join(output_dir, 'runs')
folder_name = os.listdir(tb_dir)[0]
tb_dir = os.path.join(tb_dir, folder_name)
plot_images(images_dir, tb_dir, ['train/loss'], 0.9)
if args.push_to_hub:
    trainer._add_patterns_to_gitignores(['images/'])
    trainer.push_to_hub()



Resource Consumption

GPU memory usage when fine-tuning Qwen-7B with LoRA is about 21 GB (batch_size=1, max_length=2048).
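
To verify this figure on your own run, the peak allocated memory can be read back from PyTorch after a few training steps (a minimal sketch):

import torch
# peak GPU memory allocated by this process, in GiB
print(f'{torch.cuda.max_memory_allocated() / 1024**3:.1f} GiB')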



Inference with the Fine-Tuned Model

Code link: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/llm_infer.py

Run the inference script:

# infer
bash run_infer.sh
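
As with training, run_infer.sh wraps llm_infer.py; a hypothetical direct invocation (assuming flags mirror the InferArguments fields below; the checkpoint path is a placeholder) might be:

# hypothetical direct invocation; the ckpt path is a placeholder
python llm_infer.py \
    --model_type Qwen-7b \
    --sft_type lora \
    --ckpt_path /path/to/your/iter_xxx.pth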


The code in detail:

# note: utils can be found at
# `https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/utils`
# it is recommended that you git clone the swift repo
# ### Setting up experimental environment.
import os
from dataclasses import dataclass, field
from functools import partial
from typing import List, Optional

import torch
from transformers import GenerationConfig, TextStreamer

from utils import (DATASET_MAPPER, DEFAULT_PROMPT, MODEL_MAPPER, get_dataset,
                   get_model_tokenizer, inference, parse_args, process_dataset,
                   tokenize_function)
from modelscope import get_logger
from modelscope.swift import LoRAConfig, Swift

logger = get_logger()


@dataclass
class InferArguments:
    model_type: str = field(
        default='Qwen-7b', metadata={'choices': list(MODEL_MAPPER.keys())})
    sft_type: str = field(
        default='lora', metadata={'choices': ['lora', 'full']})
    ckpt_path: str = '/path/to/your/iter_xxx.pth'
    eval_human: bool = False  # False: eval test_dataset
    ignore_args_error: bool = False  # True: notebook compatibility

    dataset: str = field(
        default='finance_en',
        metadata={'help': f'dataset choices: {list(DATASET_MAPPER.keys())}'})
    dataset_seed: int = 42
    dataset_sample: Optional[int] = None
    dataset_test_size: float = 0.01
    prompt: str = DEFAULT_PROMPT
    max_length: Optional[int] = 2048

    lora_target_modules: Optional[List[str]] = None
    lora_rank: int = 8
    lora_alpha: int = 32
    lora_dropout_p: float = 0.1

    max_new_tokens: int = 512
    temperature: float = 0.9
    top_k: int = 50
    top_p: float = 0.9

    def __post_init__(self):
        if self.lora_target_modules is None:
            self.lora_target_modules = MODEL_MAPPER[self.model_type]['lora_TM']
        if not os.path.isfile(self.ckpt_path):
            raise ValueError(
                f'Please enter a valid ckpt_path: {self.ckpt_path}')


def llm_infer(args: InferArguments) -> None:
    # ### Loading Model and Tokenizer
    support_bf16 = torch.cuda.is_bf16_supported()
    if not support_bf16:
        logger.warning(f'support_bf16: {support_bf16}')
    model, tokenizer, _ = get_model_tokenizer(
        args.model_type, torch_dtype=torch.bfloat16)

    # ### Preparing lora
    if args.sft_type == 'lora':
        lora_config = LoRAConfig(
            replace_modules=args.lora_target_modules,
            rank=args.lora_rank,
            lora_alpha=args.lora_alpha,
            lora_dropout=args.lora_dropout_p,
            pretrained_weights=args.ckpt_path)
        logger.info(f'lora_config: {lora_config}')
        model = Swift.prepare_model(model, lora_config)
    elif args.sft_type == 'full':
        state_dict = torch.load(args.ckpt_path, map_location='cpu')
        model.load_state_dict(state_dict)
    else:
        raise ValueError(f'args.sft_type: {args.sft_type}')

    # ### Inference
    tokenize_func = partial(
        tokenize_function,
        tokenizer=tokenizer,
        prompt=args.prompt,
        max_length=args.max_length)
    streamer = TextStreamer(
        tokenizer, skip_prompt=True, skip_special_tokens=True)
    generation_config = GenerationConfig(
        max_new_tokens=args.max_new_tokens,
        temperature=args.temperature,
        top_k=args.top_k,
        top_p=args.top_p,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id)
    logger.info(f'generation_config: {generation_config}')

    if args.eval_human:
        while True:
            instruction = input('<<< ')
            data = {'instruction': instruction}
            input_ids = tokenize_func(data)['input_ids']
            inference(input_ids, model, tokenizer, streamer, generation_config)
            print('-' * 80)
    else:
        dataset = get_dataset(args.dataset.split(','))
        _, test_dataset = process_dataset(dataset, args.dataset_test_size,
                                          args.dataset_sample,
                                          args.dataset_seed)
        mini_test_dataset = test_dataset.select(range(10))
        del dataset
        for data in mini_test_dataset:
            output = data['output']
            data['output'] = None
            input_ids = tokenize_func(data)['input_ids']
            inference(input_ids, model, tokenizer, streamer, generation_config)
            print()
            print(f'[LABELS]{output}')
            print('-' * 80)
            # input('next[ENTER]')


if __name__ == '__main__':
    args, remaining_argv = parse_args(InferArguments)
    if len(remaining_argv) > 0:
        if args.ignore_args_error:
            logger.warning(f'remaining_argv: {remaining_argv}')
        else:
            raise ValueError(f'remaining_argv: {remaining_argv}')
    llm_infer(args)


Model demo link: https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary
