Transformers 4.37 Chinese Documentation (Part 13) (6) - Alibaba Cloud Developer Community

Previous part: Transformers 4.37 Chinese Documentation (Part 13) (5): https://developer.aliyun.com/article/1564950

TFAutoModelForPreTraining

`class transformers.TFAutoModelForPreTraining`

( *args **kwargs )

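This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).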

`from_config`

< source >

( **kwargs )

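Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:

AlbertConfig configuration class: TFAlbertForPreTraining (ALBERT model)
BartConfig configuration class: TFBartForConditionalGeneration (BART model)
BertConfig configuration class: TFBertForPreTraining (BERT model)
CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
ElectraConfig configuration class: TFElectraForPreTraining (ELECTRA model)
FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
FunnelConfig configuration class: TFFunnelForPreTraining (Funnel Transformer model)
GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
LxmertConfig configuration class: TFLxmertForPreTraining (LXMERT model)
MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
MobileBertConfig configuration class: TFMobileBertForPreTraining (MobileBERT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
ViTMAEConfig configuration class: TFViTMAEForPreTraining (ViTMAE model)
XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example: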

>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("bert-base-cased")
>>> model = TFAutoModelForPreTraining.from_config(config)
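
The lookup from configuration class to model class described above can be pictured as a dictionary keyed by configuration type. The following is a minimal sketch of that idea, not the library's actual implementation: every class whose name starts with Toy, the TOY_PRETRAINING_MAPPING table, and the toy_from_config helper are hypothetical stand-ins. No weights are involved, which mirrors the note that from_config() does not load any.

```python
# Toy stand-ins for configuration and model classes (hypothetical names).
class ToyBertConfig:
    model_type = "bert"


class ToyT5Config:
    model_type = "t5"


class ToyUnsupportedConfig:
    model_type = "unsupported"


class ToyBertForPreTraining:
    def __init__(self, config):
        self.config = config


class ToyT5ForConditionalGeneration:
    def __init__(self, config):
        self.config = config


# Assumed mapping: configuration class -> model class with a pretraining head.
TOY_PRETRAINING_MAPPING = {
    ToyBertConfig: ToyBertForPreTraining,
    ToyT5Config: ToyT5ForConditionalGeneration,
}


def toy_from_config(config):
    """Pick the model class from the type of `config` and build it (no weights loaded)."""
    model_class = TOY_PRETRAINING_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = toy_from_config(ToyBertConfig())
print(type(model).__name__)  # ToyBertForPreTraining
```

A configuration class that is missing from the table is rejected with an error in this sketch instead of being guessed at.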

`from_pretrained`

< source >

( *model_args **kwargs )

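Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:

The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. If such a file exists, an attempt will be made to resume the download.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own code files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.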
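
The second kwargs rule above (no config supplied) amounts to splitting the keyword arguments into two groups. Below is a minimal sketch under stated assumptions: a plain dict stands in for the configuration object, split_kwargs is a hypothetical helper, and my_flag is an invented key. The library's own logic lives in its configuration class and is more involved.

```python
def split_kwargs(config_attributes, kwargs):
    """Return (updated_config, remaining_kwargs) for the case where no config is passed."""
    updated = dict(config_attributes)
    remaining = {}
    for key, value in kwargs.items():
        if key in updated:
            updated[key] = value    # key matches a config attribute: override it
        else:
            remaining[key] = value  # otherwise forward it to the model's __init__
    return updated, remaining


toy_config = {"output_attentions": False, "hidden_size": 768}
config, model_kwargs = split_kwargs(toy_config, {"output_attentions": True, "my_flag": 1})
print(config)        # {'output_attentions': True, 'hidden_size': 768}
print(model_kwargs)  # {'my_flag': 1}
```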

Instantiates one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — TFAlbertForPreTraining (ALBERT model)
bart — TFBartForConditionalGeneration (BART model)
bert — TFBertForPreTraining (BERT model)
camembert — TFCamembertForMaskedLM (CamemBERT model)
ctrl — TFCTRLLMHeadModel (CTRL model)
distilbert — TFDistilBertForMaskedLM (DistilBERT model)
electra — TFElectraForPreTraining (ELECTRA model)
flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
funnel — TFFunnelForPreTraining (Funnel Transformer model)
gpt-sw3 — TFGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
lxmert — TFLxmertForPreTraining (LXMERT model)
mobilebert — TFMobileBertForPreTraining (MobileBERT model)
mpnet — TFMPNetForMaskedLM (MPNet model)
openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
roberta — TFRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
t5 — TFT5ForConditionalGeneration (T5 model)
tapas — TFTapasForMaskedLM (TAPAS model)
transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
vit_mae — TFViTMAEForPreTraining (ViTMAE model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
xlnet — TFXLNetLMHeadModel (XLNet model)
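
The selection rule above (use model_type when available, otherwise fall back to pattern matching on the name) can be sketched as follows. This is an illustration only: TOY_MODEL_MAPPING holds three entries copied from the list above as plain strings, resolve_model_class is a hypothetical helper, and trying longer keys first is an assumption of this sketch, chosen because "roberta" contains "bert" and "xlm-roberta" contains "roberta".

```python
# Three entries from the list above, kept as strings so the sketch needs no libraries.
TOY_MODEL_MAPPING = {
    "bert": "TFBertForPreTraining",
    "roberta": "TFRobertaForMaskedLM",
    "xlm-roberta": "TFXLMRobertaForMaskedLM",
}


def resolve_model_class(name_or_path, model_type=None):
    """Prefer an explicit model_type; otherwise pattern-match on the name."""
    if model_type is not None:
        if model_type not in TOY_MODEL_MAPPING:
            raise ValueError(f"Unsupported model_type: {model_type}")
        return TOY_MODEL_MAPPING[model_type]
    # Fallback: check longer keys first so the most specific name wins.
    for key in sorted(TOY_MODEL_MAPPING, key=len, reverse=True):
        if key in str(name_or_path):
            return TOY_MODEL_MAPPING[key]
    raise ValueError(f"Could not infer a model type from {name_or_path!r}")


print(resolve_model_class("xlm-roberta-base"))                 # TFXLMRobertaForMaskedLM
print(resolve_model_class("roberta-base"))                     # TFRobertaForMaskedLM
print(resolve_model_class("./my_model", model_type="bert"))    # TFBertForPreTraining
```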

Example:

>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForPreTraining

`class transformers.FlaxAutoModelForPreTraining`

< source >

( *args **kwargs )

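This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).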

`from_config`

< source >

( **kwargs )

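Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:

AlbertConfig configuration class: FlaxAlbertForPreTraining (ALBERT model)
BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
BertConfig configuration class: FlaxBertForPreTraining (BERT model)
BigBirdConfig configuration class: FlaxBigBirdForPreTraining (BigBird model)
ElectraConfig configuration class: FlaxElectraForPreTraining (ELECTRA model)
LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
Wav2Vec2Config configuration class: FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
WhisperConfig configuration class: FlaxWhisperForConditionalGeneration (Whisper model)
XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example: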

>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("bert-base-cased")
>>> model = FlaxAutoModelForPreTraining.from_config(config)

`from_pretrained`

< source >

( *model_args **kwargs )

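Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:

The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. If such a file exists, an attempt will be made to resume the download.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.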
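
One of the automatic-loading cases for the config parameter above relies on a file named config.json inside a local directory. The sketch below uses only the standard library to show what "finding" that file can look like. The read_model_type helper and the two-key JSON payload are illustrative assumptions, not the library's loader.

```python
import json
import os
import tempfile


def read_model_type(directory):
    """Return the model_type stored in <directory>/config.json, or None if the file is absent."""
    config_path = os.path.join(directory, "config.json")
    if not os.path.isfile(config_path):
        return None
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle).get("model_type")


with tempfile.TemporaryDirectory() as save_dir:
    print(read_model_type(save_dir))  # None, no config.json yet
    # Write a tiny stand-in for the file a save directory would contain.
    with open(os.path.join(save_dir, "config.json"), "w", encoding="utf-8") as handle:
        json.dump({"model_type": "bert", "output_attentions": False}, handle)
    print(read_model_type(save_dir))  # bert
```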

Instantiates one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — FlaxAlbertForPreTraining (ALBERT model)
bart — FlaxBartForConditionalGeneration (BART model)
bert — FlaxBertForPreTraining (BERT model)
big_bird — FlaxBigBirdForPreTraining (BigBird model)
electra — FlaxElectraForPreTraining (ELECTRA model)
longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
roberta — FlaxRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMaskedLM (RoFormer model)
t5 — FlaxT5ForConditionalGeneration (T5 model)
wav2vec2 — FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
whisper — FlaxWhisperForConditionalGeneration (Whisper model)
xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)

Example:

>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForPreTraining.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForPreTraining.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Natural Language Processing

The following auto classes are available for the following natural language processing tasks.

AutoModelForCausalLM

`class transformers.AutoModelForCausalLM`

< source >

( *args **kwargs )

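This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).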
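
The "factory only" behaviour described above, where __init__() refuses to run and class methods do the construction, can be sketched with a toy class. ToyAutoModel and its return value are hypothetical, and the error type used here is an assumption for illustration. The real auto classes return a concrete model instance instead.

```python
class ToyAutoModel:
    """Hypothetical factory class: only its class methods build objects."""

    def __init__(self, *args, **kwargs):
        # Direct instantiation is rejected, mirroring the behaviour described above.
        raise EnvironmentError(
            f"{self.__class__.__name__} is designed to be instantiated using "
            "`from_pretrained(...)` or `from_config(...)`."
        )

    @classmethod
    def from_config(cls, config):
        # A real auto class would pick a concrete model class here; the toy
        # version returns a plain dict describing what would be built.
        return {"built_from": "config", "config": config}


try:
    ToyAutoModel()
except EnvironmentError as error:
    print(f"Refused: {error}")

print(ToyAutoModel.from_config({"model_type": "gpt2"}))
```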

`from_config`

< source >

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:

BartConfig configuration class: BartForCausalLM (BART model)
BertConfig configuration class: BertLMHeadModel (BERT model)
BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
BigBirdConfig configuration class: BigBirdForCausalLM (BigBird model)
BigBirdPegasusConfig configuration class: BigBirdPegasusForCausalLM (BigBird-Pegasus model)
BioGptConfig configuration class: BioGptForCausalLM (BioGpt model)
BlenderbotConfig configuration class: BlenderbotForCausalLM (Blenderbot model)
BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmall model)
BloomConfig configuration class: BloomForCausalLM (BLOOM model)
CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
CodeGenConfig configuration class: CodeGenForCausalLM (CodeGen model)
CpmAntConfig configuration class: CpmAntForCausalLM (CPM-Ant model)
Data2VecTextConfig configuration class: Data2VecTextForCausalLM (Data2VecText model)
ElectraConfig configuration class: ElectraForCausalLM (ELECTRA model)
ErnieConfig configuration class: ErnieForCausalLM (ERNIE model)
FalconConfig configuration class: FalconForCausalLM (Falcon model)
FuyuConfig configuration class: FuyuForCausalLM (Fuyu model)
GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCode model)
GPTJConfig configuration class: GPTJForCausalLM (GPT-J model)
GPTNeoConfig configuration class: GPTNeoForCausalLM (GPT Neo model)
GPTNeoXConfig configuration class: GPTNeoXForCausalLM (GPT NeoX model)
GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseForCausalLM (GPT NeoX Japanese model)
GitConfig configuration class: GitForCausalLM (GIT model)
LlamaConfig configuration class: LlamaForCausalLM (LLaMA model)
MBartConfig configuration class: MBartForCausalLM (mBART model)
MarianConfig configuration class: MarianForCausalLM (Marian model)
MegaConfig configuration class: MegaForCausalLM (MEGA model)
MegatronBertConfig configuration class: MegatronBertForCausalLM (Megatron-BERT model)
MistralConfig configuration class: MistralForCausalLM (Mistral model)
MixtralConfig configuration class: MixtralForCausalLM (Mixtral model)
MptConfig configuration class: MptForCausalLM (MPT model)
MusicgenConfig configuration class: MusicgenForCausalLM (MusicGen model)
MvpConfig configuration class: MvpForCausalLM (MVP model)
OPTConfig configuration class: OPTForCausalLM (OPT model)
OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
OpenLlamaConfig configuration class: OpenLlamaForCausalLM (OpenLlama model)
PLBartConfig configuration class: PLBartForCausalLM (PLBart model)
PegasusConfig configuration class: PegasusForCausalLM (Pegasus model)
PersimmonConfig configuration class: PersimmonForCausalLM (Persimmon model)
PhiConfig configuration class: PhiForCausalLM (Phi model)
ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
QDQBertConfig configuration class: QDQBertLMHeadModel (QDQBert model)
Qwen2Config configuration class: Qwen2ForCausalLM (Qwen2 model)
ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
RemBertConfig configuration class: RemBertForCausalLM (RemBERT model)
RoCBertConfig configuration class: RoCBertForCausalLM (RoCBert model)
RoFormerConfig configuration class: RoFormerForCausalLM (RoFormer model)
RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
RwkvConfig configuration class: RwkvForCausalLM (RWKV model)
Speech2Text2Config configuration class: Speech2Text2ForCausalLM (Speech2Text2 model)
TrOCRConfig configuration class: TrOCRForCausalLM (TrOCR model)
TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
WhisperConfig configuration class: WhisperForCausalLM (Whisper model)
XGLMConfig configuration class: XGLMForCausalLM (XGLM model)
XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLM-ProphetNet model)
XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
XLMRobertaXLConfig configuration class: XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
XmodConfig configuration class: XmodForCausalLM (X-MOD model)

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example:

>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)

`from_pretrained`

< source >

( *model_args **kwargs )

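Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:

The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. If such a file exists, an attempt will be made to resume the download.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.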
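
The three accepted forms of pretrained_model_name_or_path listed above can be told apart with a few simple checks. The following is a rough sketch, assuming a hypothetical describe_source helper and a deliberately simplified rule (directory, then .index suffix, then model id). The library's real resolution logic handles more cases than this.

```python
import os


def describe_source(pretrained_model_name_or_path):
    """Classify the argument into one of the three forms described above (simplified)."""
    path = str(pretrained_model_name_or_path)
    if os.path.isdir(path):
        return "local directory saved with save_pretrained()"
    if path.endswith(".index"):
        return "TensorFlow index checkpoint (needs from_tf=True and a config)"
    return "model id on huggingface.co"


print(describe_source("./tf_model/model.ckpt.index"))
print(describe_source("dbmdz/bert-base-german-cased"))
```

Checking for a directory first means a local folder takes precedence over a model id of the same name in this sketch. That ordering is a design choice of the example.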

Instantiates one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

bart — BartForCausalLM (BART model)
bert — BertLMHeadModel (BERT model)
bert-generation — BertGenerationDecoder (Bert Generation model)
big_bird — BigBirdForCausalLM (BigBird model)
bigbird_pegasus — BigBirdPegasusForCausalLM (BigBird-Pegasus model)
biogpt — BioGptForCausalLM (BioGpt model)
blenderbot — BlenderbotForCausalLM (Blenderbot model)
blenderbot-small — BlenderbotSmallForCausalLM (BlenderbotSmall model)
bloom — BloomForCausalLM (BLOOM model)
camembert — CamembertForCausalLM (CamemBERT model)
code_llama — LlamaForCausalLM (CodeLlama model)
codegen — CodeGenForCausalLM (CodeGen model)
cpmant — CpmAntForCausalLM (CPM-Ant model)
ctrl — CTRLLMHeadModel (CTRL model)
data2vec-text — Data2VecTextForCausalLM (Data2VecText model)
electra — ElectraForCausalLM (ELECTRA model)
ernie — ErnieForCausalLM (ERNIE model)
falcon — FalconForCausalLM (Falcon model)
fuyu — FuyuForCausalLM (Fuyu model)
git — GitForCausalLM (GIT model)
gpt-sw3 — GPT2LMHeadModel (GPT-Sw3 model)
gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCode model)
gpt_neo — GPTNeoForCausalLM (GPT Neo model)
gpt_neox — GPTNeoXForCausalLM (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseForCausalLM (GPT NeoX Japanese model)
gptj — GPTJForCausalLM (GPT-J model)
llama — LlamaForCausalLM (LLaMA model)
marian — MarianForCausalLM (Marian model)
mbart — MBartForCausalLM (mBART model)
mega — MegaForCausalLM (MEGA model)
megatron-bert — MegatronBertForCausalLM (Megatron-BERT model)
mistral — MistralForCausalLM (Mistral model)
mixtral — MixtralForCausalLM (Mixtral model)
mpt — MptForCausalLM (MPT model)
musicgen — MusicgenForCausalLM (MusicGen model)
mvp — MvpForCausalLM (MVP model)
open-llama — OpenLlamaForCausalLM (OpenLlama model)
openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
opt — OPTForCausalLM (OPT model)
pegasus — PegasusForCausalLM (Pegasus model)
persimmon — PersimmonForCausalLM (Persimmon model)
phi — PhiForCausalLM (Phi model)
plbart — PLBartForCausalLM (PLBart model)
prophetnet — ProphetNetForCausalLM (ProphetNet model)
qdqbert — QDQBertLMHeadModel (QDQBert model)
qwen2 — Qwen2ForCausalLM (Qwen2 model)
reformer — ReformerModelWithLMHead (Reformer model)
rembert — RemBertForCausalLM (RemBERT model)
roberta — RobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForCausalLM (RoCBert model)
roformer — RoFormerForCausalLM (RoFormer model)
rwkv — RwkvForCausalLM (RWKV model)
speech_to_text_2 — Speech2Text2ForCausalLM (Speech2Text2 model)
transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
trocr — TrOCRForCausalLM (TrOCR model)
whisper — WhisperForCausalLM (Whisper model)
xglm — XGLMForCausalLM (XGLM model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-prophetnet — XLMProphetNetForCausalLM (XLM-ProphetNet model)
xlm-roberta — XLMRobertaForCausalLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
xlnet — XLNetLMHeadModel (XLNet model)
xmod — XmodForCausalLM (X-MOD model)

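By default, the model is set in evaluation mode using model.eval() (for example, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().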
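
To make the difference between the two modes concrete, here is a toy dropout layer written in plain Python. It is only an illustration of why evaluation mode gives deterministic outputs: ToyDropout is a hypothetical class, it does not use PyTorch, and real dropout modules operate on tensors.

```python
import random


class ToyDropout:
    """Minimal dropout stand-in with train()/eval() switches (hypothetical class)."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = False  # start in evaluation mode, like a freshly loaded model

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: values pass through unchanged
        keep_scale = 1.0 / (1.0 - self.p)
        # Training mode: zero each value with probability p, rescale the rest.
        return [0.0 if random.random() < self.p else v * keep_scale for v in values]


layer = ToyDropout(p=0.5)
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]
layer.train()
print(layer([1.0, 2.0, 3.0]))  # random mix of 0.0 and doubled values
```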

Example:

>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForCausalLM

`class transformers.TFAutoModelForCausalLM`

< source >

( *args **kwargs )

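This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).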

`from_config`

< source >

( **kwargs )

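Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:

BertConfig configuration class: TFBertLMHeadModel (BERT model)
CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
CamembertConfig configuration class: TFCamembertForCausalLM (CamemBERT model)
GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
GPTJConfig configuration class: TFGPTJForCausalLM (GPT-J model)
OPTConfig configuration class: TFOPTForCausalLM (OPT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
RemBertConfig configuration class: TFRemBertForCausalLM (RemBERT model)
RoFormerConfig configuration class: TFRoFormerForCausalLM (RoFormer model)
RobertaConfig configuration class: TFRobertaForCausalLM (RoBERTa model)
RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
XGLMConfig configuration class: TFXGLMForCausalLM (XGLM model)
XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
XLMRobertaConfig configuration class: TFXLMRobertaForCausalLM (XLM-RoBERTa model)
XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example: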

>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("bert-base-cased")
>>> model = TFAutoModelForCausalLM.from_config(config)

`from_pretrained`

< source >

( *model_args **kwargs )

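Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:

A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.

model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:

The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. If such a file exists, an attempt will be made to resume the download.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.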
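
The output_loading_info parameter above mentions missing keys and unexpected keys. One way to picture what those terms mean is a set comparison between the weight names a model expects and the names found in a checkpoint. The sketch below assumes a hypothetical toy_loading_info helper, invented weight names, and dictionary keys chosen for illustration. It does not reproduce the library's loader.

```python
def toy_loading_info(expected_keys, checkpoint_keys):
    """Compare expected weight names with checkpoint weight names."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Weights the model needs but the checkpoint does not provide.
        "missing_keys": sorted(expected - found),
        # Weights present in the checkpoint that the model has no slot for.
        "unexpected_keys": sorted(found - expected),
    }


info = toy_loading_info(
    expected_keys=["encoder.weight", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info)  # {'missing_keys': ['lm_head.weight'], 'unexpected_keys': ['pooler.weight']}
```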

Instantiates one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

bert — TFBertLMHeadModel (BERT model)
camembert — TFCamembertForCausalLM (CamemBERT model)
ctrl — TFCTRLLMHeadModel (CTRL model)
gpt-sw3 — TFGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
gptj — TFGPTJForCausalLM (GPT-J model)
openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
opt — TFOPTForCausalLM (OPT model)
rembert — TFRemBertForCausalLM (RemBERT model)
roberta — TFRobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForCausalLM (RoFormer model)
transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
xglm — TFXGLMForCausalLM (XGLM model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForCausalLM (XLM-RoBERTa model)
xlnet — TFXLNetLMHeadModel (XLNet model)

Example:

>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Next part: Transformers 4.37 Chinese Documentation (Part 13) (7): https://developer.aliyun.com/article/1564952
