Problem description

When training the question-answering model damo/nlp_structbert_faq-question-answering_chinese-base with the code from the official documentation, I get the following error:

** build_dataset error log: 'structbert is not in the custom_datasets registry group faq-question-answering. Please make sure the correct version of ModelScope library is used.'
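
Since the message points at the installed ModelScope version, a quick first check is to print which version is actually present in the environment. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration, and it assumes the package is installed under the distribution name `modelscope`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "modelscope" is assumed to be the distribution name used at install time.
    print("modelscope:", installed_version("modelscope"))
```

Including this version in the question makes it easier for others to compare it against the version the documentation was written for.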

Official page

Model page

The dataset I am using has the following format:

text	label	answer
数据采集	1	数据采集
采集数据	1	数据采集
数据收集	1	数据采集
开始采集	1	数据采集
采集开始	1	数据采集
收集数据	1	数据采集
数据收集	1	数据采集
问题反馈	2	问题反馈
反馈问题	2	问题反馈
上报问题	2	问题反馈
问题上报	2	问题反馈
开始反馈	2	问题反馈
反馈开始	2	问题反馈
汇报工作	3	工作汇报
工作汇报	3	工作汇报
工作上报	3	工作汇报
上报工作	3	工作汇报
开始汇报	3	工作汇报
汇报开始	3	工作汇报
工作填报	4	工作填报
填报工作	4	工作填报
开始填报	4	工作填报
工作填报	4	工作填报
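
The table above can be sanity-checked before training. The sketch below is not part of my original script: it parses tab-separated rows with Python's `csv` module, counts samples per label, and lists texts that occur more than once. The helper name `summarize` and the inlined sample are illustrative, and it assumes the file is tab-separated as displayed. Whether duplicates or small label groups have anything to do with the error is an open question, not something the logs confirm.

```python
import csv
import io
from collections import Counter

# A few rows copied from the table above (tab-separated, header first).
SAMPLE = (
    "text\tlabel\tanswer\n"
    "数据采集\t1\t数据采集\n"
    "数据收集\t1\t数据采集\n"
    "数据收集\t1\t数据采集\n"
    "问题反馈\t2\t问题反馈\n"
)


def summarize(tsv_text):
    """Return (samples per label, sorted texts that appear more than once)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    per_label = Counter(row["label"] for row in rows)
    text_counts = Counter(row["text"] for row in rows)
    duplicates = sorted(t for t, n in text_counts.items() if n > 1)
    return per_label, duplicates


per_label, duplicates = summarize(SAMPLE)
print(dict(per_label))
print(duplicates)
```

To run it on the real file, the contents of qa.csv could be read with `open(..., encoding="utf-8").read()` and passed to `summarize`.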

Full debugging code


import os
from modelscope.metainfo import Trainers
from modelscope.msdatasets import MsDataset
from modelscope.pipelines import pipeline
from modelscope.trainers import build_trainer
from modelscope.utils.config import Config
from modelscope.utils.hub import read_config
  
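# Load the local file; the remap maps 'text' to itself, so it is effectively a no-op.
train_dataset = MsDataset.load("./qa.csv", split='train').remap_columns({'text': 'text'})
print(train_dataset)
# The training set is reused for evaluation in this debugging run.
eval_dataset = train_dataset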
cfg: Config = read_config("damo/nlp_structbert_faq-question-answering_chinese-base")
cfg.train.train_iters_per_epoch = 30
cfg.evaluation.val_iters_per_epoch = 2
cfg.train.seed = 1234
cfg.train.optimizer.lr = 2e-5
cfg.train.hooks = [{
    'type': 'CheckpointHook',
    'by_epoch': False,
    'interval': 50
}, {
    'type': 'EvaluationHook',
    'by_epoch': False,
    'interval': 50
}, {
    'type': 'TextLoggerHook',
    'by_epoch': False,
    'rounding_digits': 5,
    'interval': 10
}]
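# Make sure the work directory exists before writing the modified config into it.
os.makedirs("./model/temp", exist_ok=True)
cfg_file = os.path.join("./model/temp", 'config.json')
cfg.dump(cfg_file)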

trainer = build_trainer(
    Trainers.faq_question_answering_trainer,
    default_args=dict(
        model="damo/nlp_structbert_faq-question-answering_chinese-base",
        work_dir="./model/temp",
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        cfg_file=cfg_file))

trainer.train()

evaluate_result = trainer.evaluate()
print(evaluate_result)
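
One detail of this configuration can be checked by hand. With `train_iters_per_epoch = 30` and the `max_epochs` value of 1 that appears in the logged training config, the run has 30 iterations, while the checkpoint and evaluation hooks use an interval of 50. The sketch below works through that arithmetic under an assumption that is not confirmed from the ModelScope source: an iteration-based hook fires whenever the 1-based iteration count is a multiple of its interval. `firing_iterations` is a hypothetical helper, and any end-of-epoch triggering is not modeled.

```python
def firing_iterations(total_iters, interval):
    """Iterations (1-based) at which a hook with the given interval would fire,
    assuming it fires when the iteration count is a multiple of the interval."""
    return [i for i in range(1, total_iters + 1) if i % interval == 0]


train_iters_per_epoch = 30  # from the script above
max_epochs = 1              # from the logged training config
total_iters = train_iters_per_epoch * max_epochs

print("logging (interval 10):", firing_iterations(total_iters, 10))
print("checkpoint/eval (interval 50):", firing_iterations(total_iters, 50))
```

Under that assumption the interval-50 hooks would never fire on an iteration boundary in a 30-iteration run, so a smaller interval may be worth trying when debugging.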

Full error output

Dataset({
    features: ['text', 'label', 'answer'],
    num_rows: 26
})
2023-06-26 20:32:21,659 - modelscope - INFO - initialize model from ./model/damo/nlp_structbert_faq-question-answering_chinese-base
2023-06-26 20:32:22,949 - modelscope - INFO - faq task build protonet network
2023-06-26 20:32:28,135 - modelscope - INFO - All model checkpoint weights were used when initializing SbertForFaqQuestionAnswering.

2023-06-26 20:32:28,136 - modelscope - INFO - All the weights of SbertForFaqQuestionAnswering were initialized from the model checkpoint If your task is similar to the task the model of the checkpoint was trained on, you can already use SbertForFaqQuestionAnswering for predictions without further training.
2023-06-26 20:32:28,137 - modelscope - WARNING - No train key and type key found in preprocessor domain of configuration.json file.
2023-06-26 20:32:28,138 - modelscope - WARNING - Cannot find available config to build preprocessor at mode train, current config: {'max_seq_length': 50, 'model_dir': './model/damo/nlp_structbert_faq-question-answering_chinese-base'}. trying to build by task and model information.
2023-06-26 20:32:28,172 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-06-26 20:32:28,172 - modelscope - WARNING - Cannot find available config to build preprocessor at mode eval, current config: {'max_seq_length': 50, 'model_dir': './model/damo/nlp_structbert_faq-question-answering_chinese-base'}. trying to build by task and model information.
2023-06-26 20:32:28,185 - modelscope - WARNING - ('CUSTOM_DATASETS', 'faq-question-answering', 'structbert') not found in ast index file
2023-06-26 20:32:28,186 - modelscope - WARNING - ('CUSTOM_DATASETS', 'faq-question-answering', 'structbert') not found in ast index file
2023-06-26 20:32:28,187 - modelscope - INFO - cuda is not available, using cpu instead.
2023-06-26 20:32:28,187 - modelscope - INFO - ==========================Training Config Start==========================
2023-06-26 20:32:28,187 - modelscope - INFO - {
    "framework": "pytorch",
    "task": "faq-question-answering",
    "pipeline": {
        "type": "faq-question-answering"
    },
    "model": {
        "type": "structbert",
        "pooling": "avg",
        "metric": "relation"
    },
    "preprocessor": {
        "max_seq_length": 50,
        "model_dir": "./model/damo/nlp_structbert_faq-question-answering_chinese-base"
    },
    "train": {
        "seed": 1234,
        "hooks": [
            {
                "type": "IterTimerHook"
            }
        ],
        "train_iters_per_epoch": 30,
        "max_epochs": 1,
        "sampler": {
            "n_way": 5,
            "k_shot": 5,
            "r_query": 5,
            "min_labels": 2
        },
        "optimizer": {
            "type": "Adam",
            "lr": 2e-05,
            "options": {
                "grad_clip": {
                    "max_norm": 5.0
                }
            }
        },
        "lr_scheduler": {
            "type": "LinearLR",
            "options": {
                "by_epoch": false
            }
        },
        "dataloader": {
            "workers_per_gpu": 1
        },
        "checkpoint": {
            "period": {
                "by_epoch": false,
                "interval": 50
            }
        },
        "logging": {
            "by_epoch": false,
            "rounding_digits": 5,
            "interval": 10
        },
        "work_dir": "./model/temp"
    },
    "evaluation": {
        "metrics": "seq-cls-metric",
        "val_iters_per_epoch": 2,
        "dataloader": {
            "workers_per_gpu": 1
        },
        "period": {
            "by_epoch": false,
            "interval": 50
        }
    }
}
2023-06-26 20:32:28,188 - modelscope - INFO - ===========================Training Config End===========================
2023-06-26 20:32:28,190 - modelscope - INFO - num. of bad sample ids:5/26
2023-06-26 20:32:28,192 - modelscope - INFO - train: label size:3.0, data size:18,                 domain_size:1
2023-06-26 20:32:28,193 - modelscope - WARNING - ('OPTIMIZER', 'default', 'Adam') not found in ast index file
2023-06-26 20:32:28,194 - modelscope - WARNING - ('LR_SCHEDULER', 'default', 'LinearLR') not found in ast index file
2023-06-26 20:32:28,194 - modelscope - INFO - Stage: before_run:
    (ABOVE_NORMAL) OptimizerHook                      
    (LOW         ) LrSchedulerHook                    
    (LOW         ) CheckpointHook                     
    (VERY_LOW    ) TextLoggerHook                     
 -------------------- 
Stage: before_train_epoch:
    (LOW         ) LrSchedulerHook                    
 -------------------- 
Stage: before_train_iter:
    (ABOVE_NORMAL) OptimizerHook                      
 -------------------- 
Stage: after_train_iter:
    (ABOVE_NORMAL) OptimizerHook                      
    (NORMAL      ) EvaluationHook                     
    (LOW         ) LrSchedulerHook                    
    (LOW         ) CheckpointHook                     
    (VERY_LOW    ) TextLoggerHook                     
 -------------------- 
Stage: after_train_epoch:
    (NORMAL      ) EvaluationHook                     
    (LOW         ) LrSchedulerHook                    
    (LOW         ) CheckpointHook                     
    (VERY_LOW    ) TextLoggerHook                     
 -------------------- 
Stage: after_val_epoch:
    (VERY_LOW    ) TextLoggerHook                     
 -------------------- 
Stage: after_run:
    (LOW         ) CheckpointHook                     
 -------------------- 
2023-06-26 20:32:28,197 - modelscope - INFO - Checkpoints will be saved to ./model/temp
2023-06-26 20:32:28,197 - modelscope - INFO - Text logs will be saved to ./model/temp
** build_dataset error log: 'structbert is not in the custom_datasets registry group faq-question-answering. Please make sure the correct version of ModelScope library is used.'
** build_dataset error log: 'structbert is not in the custom_datasets registry group faq-question-answering. Please make sure the correct version of ModelScope library is used.'
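
The warnings in the log name exactly which registry entries could not be found. As a reading aid, the sketch below pulls those `(registry, group, name)` triples out of log text with a regular expression. The pattern and the helper name `missing_registry_entries` are my own, written against the line format shown above; this is not a ModelScope utility.

```python
import re

# Matches lines such as:
# ('CUSTOM_DATASETS', 'faq-question-answering', 'structbert') not found in ast index file
PATTERN = re.compile(r"\('([^']+)', '([^']+)', '([^']+)'\) not found in ast index file")

# Warning lines copied from the log above.
LOG = """\
2023-06-26 20:32:28,185 - modelscope - WARNING - ('CUSTOM_DATASETS', 'faq-question-answering', 'structbert') not found in ast index file
2023-06-26 20:32:28,186 - modelscope - WARNING - ('CUSTOM_DATASETS', 'faq-question-answering', 'structbert') not found in ast index file
2023-06-26 20:32:28,193 - modelscope - WARNING - ('OPTIMIZER', 'default', 'Adam') not found in ast index file
2023-06-26 20:32:28,194 - modelscope - WARNING - ('LR_SCHEDULER', 'default', 'LinearLR') not found in ast index file
"""


def missing_registry_entries(log_text):
    """Return the distinct (registry, group, name) triples, in first-seen order."""
    seen = []
    for match in PATTERN.finditer(log_text):
        if match.groups() not in seen:
            seen.append(match.groups())
    return seen


entries = missing_registry_entries(LOG)
for entry in entries:
    print(entry)
```

In this run the OPTIMIZER and LR_SCHEDULER entries are reported as missing from the index too, yet only the CUSTOM_DATASETS entry is followed by a build_dataset error.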

I get this error when training the StructBERT FAQ question-answering model. Could anyone take a look and help me resolve it? Many thanks.
