Environment, warning message, and solution:
Linux, Python 3, PyTorch 1.8.1, transformers 4.12.5.
Code:
    from transformers import AutoTokenizer, AutoModel

    # Local directory holding the downloaded bert-base-chinese files
    pretrained_path = "mypath/bert-base-chinese"
    tokenizer = AutoTokenizer.from_pretrained(pretrained_path)
    encoder = AutoModel.from_pretrained(pretrained_path)
(Model files downloaded from: bert-base-chinese · Hugging Face)
Warning message:
    Some weights of the model checkpoint at mypath/bert-base-chinese were not used when initializing BertModel: ['cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias']
    - This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
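This is a warning, not an error. It only means that the pretrained checkpoint being loaded does not correspond exactly to the model class used for your task. For example, the architecture of this checkpoint is BertForMaskedLM, so when a different BERT model class loads it, one of two things happens: either some checkpoint parameters go unused (as in this example), or some parameters are absent from the checkpoint and have to be randomly initialized.
In this example I only want the transformer's last hidden state, so I have no use for the prediction-head parameters (the cls.* weights) listed in the warning.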
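The two cases described above can be pictured as a set difference between the parameter names stored in the checkpoint and the names the model class expects. The following is only a minimal sketch of that idea, not the library's actual loading code: `compare_keys` is a hypothetical helper, the key lists are shortened stand-ins, and details the real loader handles (such as the base-model prefix) are ignored.

```python
def compare_keys(checkpoint_keys, model_keys):
    """Return (unused, newly_initialized) as sorted lists of parameter names.

    unused: present in the checkpoint but not expected by the model class.
    newly_initialized: expected by the model class but absent from the checkpoint.
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    unused = sorted(checkpoint_keys - model_keys)
    newly_initialized = sorted(model_keys - checkpoint_keys)
    return unused, newly_initialized


# Shortened, illustrative key lists (assumption: one encoder weight stands in
# for the whole encoder; the head names are taken from the warning text).
checkpoint = [
    "bert.encoder.layer.0.output.dense.weight",
    "cls.predictions.bias",
    "cls.seq_relationship.weight",
]
encoder_only = ["bert.encoder.layer.0.output.dense.weight"]
with_classifier = encoder_only + ["classifier.weight", "classifier.bias"]

# Case 1: a bare encoder leaves the head weights unused.
unused_encoder, new_encoder = compare_keys(checkpoint, encoder_only)

# Case 2: a classification model also needs weights the checkpoint lacks.
unused_clf, new_clf = compare_keys(checkpoint, with_classifier)

print(unused_encoder, new_encoder)
print(unused_clf, new_clf)
```

In the first case only the "unused" list is non-empty, which matches the warning shown above; in the second case both lists are non-empty.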
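If you simply want to hide this message, raise the logging threshold to error level:
    from transformers import logging
    logging.set_verbosity_error()
Note that logging.set_verbosity_warning() does not hide it: the message is logged at warning level, which is also the library's default verbosity.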
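Whether a warning-level message is shown depends on the logger's threshold. The transformers logging helpers are built on Python's standard logging module, so the behavior can be illustrated with the standard library alone. This is a sketch under that assumption; `is_shown` and the logger name "demo.loader" are hypothetical and not part of transformers.

```python
import logging


def is_shown(message_level, threshold):
    """Return True if a record at message_level passes a logger set to threshold."""
    logger = logging.getLogger("demo.loader")  # hypothetical logger name
    logger.setLevel(threshold)
    # A record is emitted only when its level is at or above the threshold.
    return logger.isEnabledFor(message_level)


# The "Some weights ... were not used" message is a warning-level record.
shown_at_warning = is_shown(logging.WARNING, logging.WARNING)
shown_at_error = is_shown(logging.WARNING, logging.ERROR)

print(shown_at_warning, shown_at_error)
```

A threshold of WARNING still lets the message through, while a threshold of ERROR filters it out.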
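Some related background follows:
Case where the class should in theory match exactly (the architectures field in config.json is BertForMaskedLM), yet the warning still reports unused parameters: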
    from transformers import AutoTokenizer, BertForMaskedLM

    pretrained_path = "mypath/bert-base-chinese"
    tokenizer = AutoTokenizer.from_pretrained(pretrained_path)
    encoder = BertForMaskedLM.from_pretrained(pretrained_path)
Warning message:
    Some weights of the model checkpoint at mypath/bert-base-chinese were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
    - This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
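According to https://github.com/huggingface/transformers/issues/5421#issuecomment-717245807 , this is probably because the official BERT checkpoint contains two prediction heads (MLM and NSP), so a model class for the MLM task does not load the NSP head.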
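To see the two heads in the warnings themselves, the unused parameter names can be grouped by prefix: names under cls.predictions belong to the MLM head, and names under cls.seq_relationship belong to the NSP head. The sketch below does this for the two lists printed above; `group_by_head` and `HEAD_PREFIXES` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Prefix-to-head mapping for the BERT pretraining heads.
HEAD_PREFIXES = {
    "cls.predictions.": "MLM head",
    "cls.seq_relationship.": "NSP head",
}


def group_by_head(keys):
    """Group parameter names by the prediction head their prefix points to."""
    groups = defaultdict(list)
    for key in keys:
        for prefix, head in HEAD_PREFIXES.items():
            if key.startswith(prefix):
                groups[head].append(key)
                break
        else:
            # No known head prefix matched.
            groups["other"].append(key)
    return dict(groups)


# Copied from the BertModel warning above.
unused_by_bert_model = [
    "cls.seq_relationship.weight",
    "cls.predictions.transform.dense.bias",
    "cls.seq_relationship.bias",
    "cls.predictions.bias",
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.transform.dense.weight",
    "cls.predictions.decoder.weight",
    "cls.predictions.transform.LayerNorm.bias",
]
# Copied from the BertForMaskedLM warning above.
unused_by_masked_lm = ["cls.seq_relationship.bias", "cls.seq_relationship.weight"]

bert_model_groups = group_by_head(unused_by_bert_model)
masked_lm_groups = group_by_head(unused_by_masked_lm)

print({head: len(names) for head, names in bert_model_groups.items()})
print(masked_lm_groups)
```

The bare encoder discards both heads, while BertForMaskedLM keeps the MLM head and discards only the NSP head, which is consistent with the explanation in the linked issue.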
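Case where new parameters need to be randomly initialized (AutoModelForSequenceClassification is one of the model classes meant to be fine-tuned on top of the original model; for more detail, see another post of mine: "Fine-tuning a pretrained model on a text classification task with huggingface.transformers.AutoModelForSequenceClassification"):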
    from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification

    model_path = "mypath/bert-base-chinese"
    # num_labels sets the output size of the new classification head
    config = AutoConfig.from_pretrained(model_path, num_labels=5)
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    encoder = AutoModelForSequenceClassification.from_pretrained(model_path, config=config)
Warning message:
    Some weights of the model checkpoint at mypath/bert-base-chinese were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight']
    - This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Some weights of BertForSequenceClassification were not initialized from the model checkpoint at mypath/bert-base-chinese and are newly initialized: ['classifier.weight', 'classifier.bias']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
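One way to see why classifier.weight and classifier.bias cannot come from the checkpoint is that their shapes depend on num_labels, which is chosen at fine-tuning time. The sketch below computes the shapes a linear classification head would have; `classifier_head_shapes` is a hypothetical helper, and the hidden size of 768 is an assumption based on the standard BERT-base configuration rather than something stated in this post.

```python
def classifier_head_shapes(hidden_size, num_labels):
    """Shapes of a linear head mapping hidden_size features to num_labels logits."""
    return {
        "classifier.weight": (num_labels, hidden_size),
        "classifier.bias": (num_labels,),
    }


# Assumption: hidden_size=768 (BERT-base); num_labels=5 as in the code above.
shapes = classifier_head_shapes(hidden_size=768, num_labels=5)

# Total number of parameters that start out randomly initialized.
n_new_params = 5 * 768 + 5

print(shapes)
print(n_new_params)
```

Changing num_labels changes these shapes, so a pretraining checkpoint has nothing it could supply for them, and the head has to be trained on the downstream task.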