Recently, Meta (Facebook's parent company) released Llama 2, the latest commercial version of its Llama large language model and Meta's first large language model that is free for commercial use.
The ModelScope community adapted inference and fine-tuning for the Llama 2 model series as soon as it was released, so developers can start experimenting with Llama 2 right away.
Environment Setup and Installation
1. This tutorial can be run on a single 3090 GPU (16 GB of GPU memory required)
2. Python >= 3.8
Server Connection and Environment Preparation
```bash
# Connect to the server (CentOS); you can also connect via VS Code
ssh root@xxx.xxx.xxx.xxx
passwd          # change the root password
lsb_release -a  # check the OS version

# Install and configure git
yum install git
git config --global user.name "llama2"
git config --global user.email "llama2@abc.com"
git config --global init.defaultBranch main
git config --list

# Create a user and set a password (you can also keep working as root)
useradd -d /home/llama2 -m llama2
passwd llama2
su llama2

# Install miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Keep pressing [ENTER]; answer yes at the final prompt
sh Miniconda3-latest-Linux-x86_64.sh

# Create a conda virtual environment
conda create --name modelscope python=3.10
conda activate modelscope

# Set a global pip mirror and install the required python packages
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
pip install numpy pandas matplotlib scikit-learn
pip install transformers datasets -U
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip install tqdm tensorboard torchmetrics sentencepiece charset_normalizer
pip install accelerate transformers_stream_generator

# Install the latest modelscope
pip install "modelscope==1.7.2rc0" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
# Resolve torchmetrics dependencies and update numpy
pip install numpy -U
```
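Once the packages are installed, it may help to sanity-check the setup from Python before going further. A minimal sketch, assuming GPU 0 is the 3090 mentioned above:

```python
import sys

import torch

print(sys.version_info)           # expect Python >= 3.8
print(torch.__version__)          # installed via conda above
print(torch.cuda.is_available())  # expect True
# Total memory of GPU 0 in GiB; a 3090 reports ~24 GiB (16 GiB is required here)
print(torch.cuda.get_device_properties(0).total_memory / 1024**3)
```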
Clone ModelScope and run the example code
```bash
git clone https://github.com/modelscope/modelscope.git
```
Enter the Python environment and get basic environment information (the snippet imports helpers from _common.py, so run it from the modelscope/examples/pytorch/llm directory):
```python
# https://github.com/modelscope/modelscope/blob/master/examples/pytorch/llm/_common.py
from _common import *

device_ids = [0, 1]  # use [0] if you only have a single GPU
logger.info(device_ids)
select_device(device_ids)
seed_everything(42)
```
Model Links and Download
The Llama 2 series models are now open-sourced on the ModelScope community, including:
LLaMA-2-7B
Model link: https://modelscope.cn/models/modelscope/Llama-2-7b-ms/summary
LLaMA-2-7B-chat
Model link: https://modelscope.cn/models/modelscope/Llama-2-7b-chat-ms/summary
More LLaMa-2 series models are being added to the community.
The community supports downloading a model's repo directly. The code below downloads the model and loads the model and tokenizer:
```python
# ### Loading Model and Tokenizer
model_dir = snapshot_download('modelscope/Llama-2-7b-ms', 'v1.0.1')
model, tokenizer = get_llama2_model_tokenizer(model_dir)
```
Model Inference
LLaMA-2-7B inference code
```python
from modelscope import snapshot_download, Model
from modelscope.models.nlp.llama2 import Llama2TokenizerFast

model_dir = snapshot_download("modelscope/Llama-2-7b-ms", 'v1.0.1')
model = Model.from_pretrained(model_dir)
tokenizer = Llama2TokenizerFast.from_pretrained(model_dir)

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate
generate_ids = model.generate(inputs.input_ids, max_length=30)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True,
                             clean_up_tokenization_spaces=False)[0])
```
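The example above uses greedy decoding with a small max_length, so the reply is cut off quickly. For longer or more varied completions, the usual transformers generation arguments can be passed through model.generate; a sketch with illustrative (not tuned) values:

```python
# Sampling-based generation; parameter values are illustrative, not tuned.
generate_ids = model.generate(
    inputs.input_ids,
    max_new_tokens=256,  # budget for newly generated tokens, independent of prompt length
    do_sample=True,      # sample from the distribution instead of greedy decoding
    temperature=0.7,     # <1 sharpens the distribution
    top_p=0.9)           # nucleus sampling cutoff
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])
```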
Dataset Links and Download
Here we use alpaca-gpt4-data-zh and alpaca-gpt4-data-en as the instruction fine-tuning datasets (included to keep the example self-contained).
```python
from modelscope import MsDataset

dataset_zh = MsDataset.load("AI-ModelScope/alpaca-gpt4-data-zh", split="train")
dataset_en = MsDataset.load("AI-ModelScope/alpaca-gpt4-data-en", split="train")
print(len(dataset_zh["instruction"]))
print(len(dataset_en["instruction"]))
print(dataset_zh[0])
"""Out
48818
52002
{'instruction': '保持健康的三个提示。', 'input': None, 'output': '以下是保持健康的三个提示:\n\n1. 保持身体活动。每天做适当的身体运动,如散步、跑步或游泳,能促进心血管健康,增强肌肉力量,并有助于减少体重。\n\n2. 均衡饮食。每天食用新鲜的蔬菜、水果、全谷物和脂肪含量低的蛋白质食物,避免高糖、高脂肪和加工食品,以保持健康的饮食习惯。\n\n3. 睡眠充足。睡眠对人体健康至关重要,成年人每天应保证 7-8 小时的睡眠。良好的睡眠有助于减轻压力,促进身体恢复,并提高注意力和记忆力。'}
"""
```
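Both datasets use the Alpaca schema (instruction / input / output). The prompt template actually applied during fine-tuning is defined in the example's _common.py; as a rough illustration of how such a sample becomes a training prompt, here is a standard Alpaca-style assembly (a hypothetical helper, not the verbatim template from the repo):

```python
# Illustrative Alpaca-style prompt assembly; the real template lives in
# _common.py and may differ in wording.
def build_prompt(example: dict) -> str:
    if example.get('input'):
        return ('Below is an instruction that describes a task, paired with an input.\n\n'
                f"### Instruction:\n{example['instruction']}\n\n"
                f"### Input:\n{example['input']}\n\n"
                '### Response:\n')
    return ('Below is an instruction that describes a task.\n\n'
            f"### Instruction:\n{example['instruction']}\n\n"
            '### Response:\n')

print(build_prompt(dataset_zh[0]))
```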
Best Practices for Model Training
Open-source code:
https://github.com/modelscope/modelscope/blob/master/examples/pytorch/llm/llm_sft.py
The fine-tuning process consists of the following steps:
- Prepare the environment
- Download and load the model with ModelScope
- Build the final model using the fine-tuning method provided by ModelScope
- Fine-tune the model with the Trainer provided by ModelScope
Prepare the environment and parse the command-line arguments:
```python
from _common import *


@dataclass
class Arguments:
    device: str = '0,1'  # e.g. '-1'; '0'; '0,1'
    seed: int = 42
    model_type: str = field(
        default='llama2-7b',
        metadata={
            'choices': ['baichuan-7b', 'baichuan-13b', 'chatglm2', 'llama2-7b']
        })
    data_sample: Optional[int] = None
    #
    lora_target_modules: Optional[List[str]] = None
    lora_rank: int = 8
    lora_alpha: int = 32
    lora_dropout_p: float = 0.1
    #
    gradient_checkpoint: bool = True
    batch_size: int = 1
    max_epochs: int = 1
    eval_interval: int = 500
    learning_rate: float = 1e-4
    weight_decay: float = 0.01
    n_accumulate_grad: int = 16
    grad_clip_norm: float = 1.
    warmup_iters: int = 200
    last_max_checkpoint_num: int = 1
    best_max_checkpoint_num: int = 1
    #
    logging_interval: int = 5
    tb_interval: int = 5

    def __post_init__(self):
        if self.lora_target_modules is None:
            if self.model_type in {'baichuan-7b', 'baichuan-13b'}:
                self.lora_target_modules = ['W_pack']
            elif self.model_type == 'chatglm2':
                self.lora_target_modules = ['query_key_value']
            elif self.model_type == 'llama2-7b':
                self.lora_target_modules = ['q_proj', 'k_proj', 'v_proj']
            else:
                raise ValueError(f'model_type: {self.model_type}')


def parse_args() -> Arguments:
    args, = HfArgumentParser([Arguments]).parse_args_into_dataclasses()
    return args


args = parse_args()
logger.info(args)
select_device(args.device)
seed_everything(args.seed)
```
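Since HfArgumentParser exposes each dataclass field as a command-line flag, the script can be launched with overrides such as `python llm_sft.py --model_type llama2-7b --device 0` (flag names mirror the field names above; the exact invocation depends on how you run the example).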
Load the model:
```python
# ### Loading Model and Tokenizer
if args.model_type == 'baichuan-7b':
    model_dir = snapshot_download('baichuan-inc/baichuan-7B', 'v1.0.5')
    model, tokenizer = get_baichuan_model_tokenizer(model_dir)
elif args.model_type == 'baichuan-13b':
    model_dir = snapshot_download('baichuan-inc/Baichuan-13B-Base', 'v1.0.2')
    model, tokenizer = get_baichuan_model_tokenizer(model_dir)
elif args.model_type == 'chatglm2':
    model_dir = snapshot_download('ZhipuAI/chatglm2-6b', 'v1.0.6')
    model, tokenizer = get_chatglm2_model_tokenizer(model_dir)
elif args.model_type == 'llama2-7b':
    model_dir = snapshot_download('modelscope/Llama-2-7b-ms', 'v1.0.0')
    model, tokenizer = get_llama2_model_tokenizer(model_dir)
else:
    raise ValueError(f'model_type: {args.model_type}')

#
if args.gradient_checkpoint:
    # baichuan-13b does not implement the `get_input_embeddings` function
    if args.model_type == 'baichuan-13b':

        def get_input_embeddings(self):
            return self.model.embed_tokens

        model.__class__.get_input_embeddings = get_input_embeddings.__get__(model)
    model.gradient_checkpointing_enable()
    model.enable_input_require_grads()
```
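Gradient checkpointing recomputes activations in the backward pass instead of caching them, trading extra compute for a large reduction in activation memory; it is a key part of fitting a 7B-parameter fine-tune on a single 3090. The enable_input_require_grads call keeps gradients flowing into the checkpointed blocks, which is needed when the embeddings themselves stay frozen, as they do with LoRA.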
Prepare LoRA:
```python
# ### Preparing lora
lora_config = LoRAConfig(
    replace_modules=args.lora_target_modules,
    rank=args.lora_rank,
    lora_alpha=args.lora_alpha,
    lora_dropout=args.lora_dropout_p)
logger.info(f'lora_config: {lora_config}')
Swift.prepare_model(model, lora_config)
#
show_freeze_layers(model)
print_model_info(model)
_p: Parameter = list(model.parameters())[100]
logger.info(f'device: {_p.device}, dtype: {_p.dtype}')
model.bfloat16()
```
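For context on the hyperparameters above: LoRA freezes the original weight $W_0$ of each target linear layer and trains only a low-rank update, so the forward pass becomes

$$
h = W_0 x + \frac{\alpha}{r} B A x, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},
$$

where $r$ is lora_rank (8 here), $\alpha$ is lora_alpha (32), and dropout with probability lora_dropout_p is applied inside the adapter branch. Only $A$ and $B$ receive gradients, which is why the trainable-parameter count reported by print_model_info is a small fraction of the full model.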
Load the datasets:
```python
# ### Loading Dataset
tokenize_function = partial(tokenize_function, tokenizer=tokenizer)
train_dataset, val_dataset = get_alpaca_en_zh_dataset(
    tokenize_function, split_seed=42, data_sample=args.data_sample)
# Data analysis
stat_dataset(train_dataset)
stat_dataset(val_dataset)
data_collate_fn = partial(data_collate_fn, tokenizer=tokenizer)
print_example(train_dataset[0], tokenizer)
```