我的模型是在这里下载的
下面是我的微调代码,模型路径我指定下载的文件夹
import tempfile
import time
from modelscope.msdatasets import MsDataset
from modelscope.metainfo import Trainers
from modelscope.trainers import build_trainer
dataset = MsDataset.load('文案1.json')
data = dataset
print(data)
print(next(iter(data)))
train_dataset = data.remap_columns({'商品名称': 'src_txt', '文案al': 'tgt_txt'})
eval_dataset = data.remap_columns({'商品名称': 'src_txt', '文案al': 'tgt_txt'})
print(train_dataset)
for i, item in enumerate(train_dataset):
if i < 5:
print(item)
else:
break
num_warmup_steps = 500
def noam_lambda(current_step: int):
current_step += 1
return min(current_step(-0.5),
current_step * num_warmup_steps(-1.5))
def cfg_modify_fn(cfg):
cfg.train.lr_scheduler = {
'type': 'LambdaLR',
'lr_lambda': noam_lambda,
'options': {
'by_epoch': False
}
}
cfg.train.optimizer = {
"type": "AdamW",
"lr": 1e-3,
"options": {}
}
cfg.train.max_epochs = 15
cfg.train.dataloader = {
"batch_size_per_gpu": 8,
"workers_per_gpu": 1
}
return cfg
kwargs = dict(
model='nlp_palm2.0_text-generation_commodity_chinese-base',
train_dataset=train_dataset,
eval_dataset=eval_dataset,
work_dir=tempfile.TemporaryDirectory().name,
cfg_modify_fn=cfg_modify_fn)
trainer = build_trainer(
name=Trainers.text_generation_trainer, default_args=kwargs)
trainer.train()
但是运行后报错
2024-07-27 20:11:22,544 - modelscope - INFO - Model file README.md is different from the latest version v1.0.0
,This is because you are using an older version or the file is updated manually.
2024-07-27 20:11:22,546 - modelscope - INFO - initialize model from nlp_palm2.0_text-generation_commodity_chinese-base
2024-07-27 20:11:27,927 - modelscope - WARNING - ('CUSTOM_DATASETS', 'text-generation', 'palm-v2') not found in ast index file
2024-07-27 20:11:27,928 - modelscope - WARNING - ('CUSTOM_DATASETS', 'text-generation', 'palm-v2') not found in ast index file
2024-07-27 20:11:27,931 - modelscope - INFO - cuda is not available, using cpu instead.
2024-07-27 20:11:27,932 - modelscope - INFO - ==========================Training Config Start==========================
2024-07-27 20:11:27,933 - modelscope - INFO - {
"framework": "pytorch",
"task": "text-generation",
"model": {
"type": "palm-v2"
},
"pipeline": {
"type": "text-generation"
},
"preprocessor": {
"type": "text-gen-tokenizer",
"mode": "eval",
"use_fast": true
},
"train": {
"hooks": [
{
"type": "IterTimerHook"
}
],
"lr_scheduler": {
"type": "LambdaLR",
"lr_lambda": null,
"options": {
"by_epoch": false
}
},
"optimizer": {
"type": "AdamW",
"lr": 0.001,
"options": {}
},
"max_epochs": 15,
"dataloader": {
"batch_size_per_gpu": 8,
"workers_per_gpu": 1
},
"checkpoint": {
"period": {
"interval": 1,
"save_dir": "/tmp/tmpczm0yrho"
}
},
"logging": {
"interval": 10,
"out_dir": "/tmp/tmpczm0yrho"
},
"work_dir": "/tmp/tmpczm0yrho"
}
}
2024-07-27 20:11:27,933 - modelscope - INFO - ===========================Training Config End===========================
2024-07-27 20:11:27,934 - modelscope - WARNING - ('OPTIMIZER', 'default', 'AdamW') not found in ast index file
build_dataset error log: 'palm-v2 is not in the custom_datasets registry group text-generation. Please make sure the correct version of ModelScope library is used.'
build_dataset error log: 'palm-v2 is not in the custom_datasets registry group text-generation. Please make sure the correct version of ModelScope library is used.'
ModelScope旨在打造下一代开源的模型即服务共享平台,为泛AI开发者提供灵活、易用、低成本的一站式模型服务产品,让模型应用更简单!欢迎加入技术交流群:微信公众号:魔搭ModelScope社区,钉钉群号:44837352