Transformers 4.37 Chinese Documentation (99)(6)

Introduction: Transformers 4.37 Chinese Documentation (99)

Transformers 4.37 Chinese Documentation (99)(5): https://developer.aliyun.com/article/1564045


class transformers.NoBadWordsLogitsProcessor

< source >

( bad_words_ids: List eos_token_id: Union )

Parameters

  • bad_words_ids (List[List[int]]) — List of lists of token IDs that are not allowed to be generated.
  • eos_token_id (Union[int, List[int]]) — The ID of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.

LogitsProcessor that enforces that the specified sequences will never be selected.

In order to get the token IDs of the words that should not appear in the generated text, make sure to set add_prefix_space=True when initializing the tokenizer, and use tokenizer(bad_words, add_special_tokens=False).input_ids. The add_prefix_space argument is only supported for some slow tokenizers, as fast tokenizers' prefixing behaviour comes from pre tokenizers. Read more here (https://huggingface.co/docs/tokenizers/api/pre-tokenizers).

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> inputs = tokenizer(["In a word, the cake is a"], return_tensors="pt")
>>> output_ids = model.generate(inputs["input_ids"], max_new_tokens=5, pad_token_id=tokenizer.eos_token_id)
>>> print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
In a word, the cake is a bit of a mess.
>>> # Now let's take the bad words out. Please note that the tokenizer is initialized differently
>>> tokenizer_with_prefix_space = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)
>>> def get_tokens_as_list(word_list):
...     "Converts a sequence of words into a list of tokens"
...     tokens_list = []
...     for word in word_list:
...         tokenized_word = tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0]
...         tokens_list.append(tokenized_word)
...     return tokens_list
>>> bad_words_ids = get_tokens_as_list(word_list=["mess"])
>>> output_ids = model.generate(
...     inputs["input_ids"], max_new_tokens=5, bad_words_ids=bad_words_ids, pad_token_id=tokenizer.eos_token_id
... )
>>> print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
In a word, the cake is a bit of a surprise.
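
The same constraint can also be applied by constructing the processor documented above by hand and passing it to generate through the logits_processor argument. The sketch below reuses the model, tokenizer, inputs and bad_words_ids objects from the example above.

>>> from transformers import LogitsProcessorList, NoBadWordsLogitsProcessor
>>> no_bad_words = NoBadWordsLogitsProcessor(bad_words_ids=bad_words_ids, eos_token_id=tokenizer.eos_token_id)
>>> output_ids = model.generate(
...     inputs["input_ids"],
...     max_new_tokens=5,
...     logits_processor=LogitsProcessorList([no_bad_words]),
...     pad_token_id=tokenizer.eos_token_id,
... )
>>> # Equivalent to passing `bad_words_ids=bad_words_ids` directly, as in the example above.
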
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.NoRepeatNGramLogitsProcessor

< source >

( ngram_size: int )

Parameters

  • ngram_size (int) — All n-grams of size ngram_size can only occur once.

N-grams are groups of "n" consecutive words, characters, or tokens taken from a sequence of text. Given the sentence "She runs fast", the bi-grams (n=2) would be ("she", "runs") and ("runs", "fast"). In text generation, avoiding repetitions of word sequences provides more diverse output. This LogitsProcessor enforces no repetition of n-grams by setting the scores of banned tokens to negative infinity, which eliminates those tokens from consideration when further processing the scores. Note that, for decoder-only models like most LLMs, the prompt is also considered when obtaining the n-grams (see the original Fairseq implementation).

Use n-gram penalties with care. For instance, penalizing 2-grams (bigrams) in an article about the city of New York might lead to undesirable outcomes where the city's name appears only once in the entire text (reference).

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer(["Today I"], return_tensors="pt")
>>> output = model.generate(**inputs)
>>> print(tokenizer.decode(output[0], skip_special_tokens=True))
Today I’m not sure if I’m going to be able to do it.
>>> # Now let's add ngram size using `no_repeat_ngram_size`. This stops the repetitions ("I’m") in the output.
>>> output = model.generate(**inputs, no_repeat_ngram_size=2)
>>> print(tokenizer.decode(output[0], skip_special_tokens=True))
Today I’m not sure if I can get a better understanding of the nature of this issue
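
For intuition, the processor can also be applied directly to a batch of scores; the toy values below are assumptions for illustration only (a 5-token vocabulary and a context that already contains the bigram (1, 2)).

>>> import torch
>>> from transformers import NoRepeatNGramLogitsProcessor
>>> processor = NoRepeatNGramLogitsProcessor(ngram_size=2)
>>> input_ids = torch.tensor([[0, 1, 2, 3, 1]])  # ends with token 1; the bigram (1, 2) already occurred
>>> scores = torch.zeros(1, 5)  # dummy uniform logits over the 5-token vocabulary
>>> filtered = processor(input_ids, scores)
>>> # Token 2 is now -inf: generating it after the final token 1 would repeat the bigram (1, 2).
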
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.PrefixConstrainedLogitsProcessor

< source >

( prefix_allowed_tokens_fn: Callable num_beams: int )

Parameters

  • prefix_allowed_tokens_fn (Callable[[int, torch.Tensor], List[int]]) — This function constrains beam search to allowed tokens only at each step. It takes two arguments: inputs_ids and the batch ID batch_id. It must return a list of the allowed tokens for the next generation step, conditioned on the previously generated tokens inputs_ids and the batch ID batch_id.

LogitsProcessor that enforces constrained generation; useful for prefix-conditioned constrained generation. See Autoregressive Entity Retrieval for more information.

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
>>> inputs = tokenizer("Alice and Bob", return_tensors="pt")
>>> # By default, it continues generating according to the model's logits
>>> outputs = model.generate(**inputs, max_new_tokens=5)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
Alice and Bob are friends
>>> # We can constrain it with `prefix_allowed_tokens_fn` to force a certain behavior based on a prefix.
>>> # For instance, we can force an entire entity to be generated when its beginning is detected.
>>> entity = tokenizer(" Bob Marley", return_tensors="pt").input_ids[0]  # 3 tokens
>>> def prefix_allowed_tokens_fn(batch_id, input_ids):
...     '''
...     Attempts to generate 'Bob Marley' when 'Bob' is detected.
...     In this case, `batch_id` is not used, but you can set rules for each batch member.
...     '''
...     if input_ids[-1] == entity[0]:
...         return [entity[1].item()]
...     elif input_ids[-2] == entity[0] and input_ids[-1] == entity[1]:
...         return [entity[2].item()]
...     return list(range(tokenizer.vocab_size))  # If no match, allow all tokens
>>> outputs = model.generate(**inputs, max_new_tokens=5, prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
Alice and Bob Marley
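
The contract of prefix_allowed_tokens_fn can also be checked without a model by applying the processor directly; the values below are assumptions for illustration only (a 5-token vocabulary where only tokens 2 and 3 are allowed at every step).

>>> import torch
>>> from transformers import PrefixConstrainedLogitsProcessor
>>> processor = PrefixConstrainedLogitsProcessor(lambda batch_id, sent: [2, 3], num_beams=1)
>>> scores = torch.zeros(1, 5)  # dummy logits over the 5-token vocabulary
>>> constrained = processor(torch.tensor([[0, 1]]), scores)
>>> # Every position except tokens 2 and 3 is now -inf, so only the allowed tokens can be selected.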

__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.RepetitionPenaltyLogitsProcessor

< source >

( penalty: float )

Parameters

  • penalty (float) — The parameter for repetition penalty. 1.0 means no penalty. Values above 1.0 penalize previously generated tokens; values between 0.0 and 1.0 reward previously generated tokens.

LogitsProcessor that prevents the repetition of previous tokens through a penalty. The penalty is applied at most once per token. Note that, for decoder-only models like most LLMs, the considered tokens include the prompt.

In the original paper, the authors suggest using a penalty of around 1.2 to achieve a good balance between truthful generation and lack of repetition. To penalize and reduce repetition, use penalty values above 1.0, where higher values penalize more strongly. To reward and encourage repetition, use penalty values between 0.0 and 1.0, where lower values reward more strongly.

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> # Initializing the model and tokenizer for it
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer(["I'm not going to"], return_tensors="pt")
>>> # This shows a normal generate without any specific parameters
>>> summary_ids = model.generate(**inputs)
>>> print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
I'm not going to be able to do that. I'm going to be able to do that
>>> # This generates a penalty for repeated tokens
>>> penalized_ids = model.generate(**inputs, repetition_penalty=1.1)
>>> print(tokenizer.batch_decode(penalized_ids, skip_special_tokens=True)[0])
I'm not going to be able to do that. I'll just have to go out and play
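
A direct application on toy values (assumed purely for illustration) makes the rule concrete: the scores of tokens that already appear in input_ids are weakened, while unseen tokens are left untouched.

>>> import torch
>>> from transformers import RepetitionPenaltyLogitsProcessor
>>> processor = RepetitionPenaltyLogitsProcessor(penalty=2.0)
>>> input_ids = torch.tensor([[0]])  # token 0 has already been generated
>>> scores = torch.tensor([[4.0, 4.0]])  # identical raw logits for tokens 0 and 1
>>> penalized = processor(input_ids, scores)
>>> # The repeated token 0 is penalized (its positive logit shrinks), while token 1 keeps its original score.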

__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.SequenceBiasLogitsProcessor

< source >

( sequence_bias: Dict )

Parameters

  • sequence_bias (Dict[Tuple[int], float]) — Dictionary that maps a sequence of tokens to its bias term. Positive biases increase the odds of the sequence being selected, while negative biases do the opposite. If a sequence has a length of 1, its bias will always be applied. Otherwise, the bias will only be applied if the sequence in question is about to be completed (in the token selection step after this processor is applied).

LogitsProcessor that applies an additive bias on sequences. The bias is applied to the last token of a sequence when the next generated token can complete it. Consequently, to make the most of biasing sequences with more than one token, consider using beam methods (to gracefully work around partially completed sequences that have a negative bias) and applying the bias to their prefixes (to ensure the bias is applied earlier).

In order to get the token IDs of the sequences that you want to bias, make sure to set add_prefix_space=True when initializing the tokenizer, and use tokenizer(bad_words, add_special_tokens=False).input_ids. The add_prefix_space argument is only supported for some slow tokenizers, as fast tokenizers' prefixing behaviour comes from pre tokenizers. Read more here (https://huggingface.co/docs/tokenizers/api/pre-tokenizers).

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> inputs = tokenizer(["The full name of Donald is Donald"], return_tensors="pt")
>>> summary_ids = model.generate(inputs["input_ids"], max_new_tokens=4)
>>> print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald J. Trump Jr
>>> # Now let's control generation through a bias. Please note that the tokenizer is initialized differently!
>>> tokenizer_with_prefix_space = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)
>>> def get_tokens_as_tuple(word):
...     return tuple(tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0])
>>> # If we add a negative bias without beam search, it may become "stuck" in a prefix without good continuations
>>> sequence_bias = {get_tokens_as_tuple("Trump"): -10.0}
>>> biased_ids = model.generate(inputs["input_ids"], max_new_tokens=4, sequence_bias=sequence_bias)
>>> print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald J. Donald,
>>> biased_ids = model.generate(inputs["input_ids"], max_new_tokens=4, num_beams=4, sequence_bias=sequence_bias)
>>> print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald Rumsfeld,
>>> # We can also add a positive bias to nudge the model towards specific tokens or continuations
>>> sequence_bias = {get_tokens_as_tuple("Donald Duck"): 10.0}
>>> biased_ids = model.generate(inputs["input_ids"], max_new_tokens=4, num_beams=4, sequence_bias=sequence_bias)
>>> print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald Duck.
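
The two bias rules (length-1 sequences are always biased, longer sequences only when about to be completed) can be seen by applying the processor directly to toy values, which are assumptions for illustration only.

>>> import torch
>>> from transformers import SequenceBiasLogitsProcessor
>>> processor = SequenceBiasLogitsProcessor(sequence_bias={(2,): 5.0, (1, 3): 5.0})
>>> scores = torch.zeros(1, 5)  # dummy logits over a 5-token vocabulary
>>> biased = processor(torch.tensor([[0, 1]]), scores)  # the context ends with token 1
>>> # Token 2 receives +5.0 unconditionally (length-1 sequence); token 3 also receives +5.0 because
>>> # generating it now would complete the biased sequence (1, 3).
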
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.SuppressTokensAtBeginLogitsProcessor

< source >

( begin_suppress_tokens begin_index )

SuppressTokensAtBeginLogitsProcessor suppresses a list of tokens as soon as the generate function starts generating, at the position given by begin_index. This ensures that the tokens defined by begin_suppress_tokens are not generated at the beginning. Originally created for Whisper.

Example:

>>> from transformers import AutoProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> processor = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> inputs = processor(ds[0]["audio"]["array"], return_tensors="pt")
>>> # Whisper has `begin_suppress_tokens` set by default (= `[220, 50256]`). 50256 is the EOS token, so this means
>>> # it can't generate an EOS token in the first iteration, but it can in the others.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
>>> print(outputs.scores[1][0, 50256])  # 1 (and not 0) is the first freely generated token
tensor(-inf)
>>> print(outputs.scores[-1][0, 50256])  # in other places we can see some probability mass for EOS
tensor(29.9010)
>>> # If we disable `begin_suppress_tokens`, we can generate EOS in the first iteration.
>>> outputs = model.generate(
...     **inputs, return_dict_in_generate=True, output_scores=True, begin_suppress_tokens=None
... )
>>> print(outputs.scores[1][0, 50256])
tensor(11.2027)
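
The begin_index condition can also be inspected directly, using toy values that are assumptions for illustration: the suppression only fires at the generation step whose input length equals begin_index.

>>> import torch
>>> from transformers import SuppressTokensAtBeginLogitsProcessor
>>> processor = SuppressTokensAtBeginLogitsProcessor(begin_suppress_tokens=[0, 1], begin_index=2)
>>> scores = torch.zeros(1, 4)  # dummy logits over a 4-token vocabulary
>>> at_begin = processor(torch.tensor([[2, 3]]), scores.clone())  # length == begin_index: tokens 0 and 1 become -inf
>>> later = processor(torch.tensor([[2, 3, 3]]), scores.clone())  # length != begin_index: scores are left unchanged
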
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.SuppressTokensLogitsProcessor

< source >

( suppress_tokens )

This processor can be used to suppress a list of tokens. The processor sets their log probs to -inf so that they are not generated. Originally created for Whisper.

Example:

>>> from transformers import AutoProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset
>>> processor = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> inputs = processor(ds[0]["audio"]["array"], return_tensors="pt")
>>> # Whisper has a long list of suppressed tokens. For instance, in this case, the token 1 is suppressed by default.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
>>> print(outputs.scores[1][0, 1])  # 1 (and not 0) is the first freely generated token
tensor(-inf)
>>> # If we disable `suppress_tokens`, we can generate it.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True, suppress_tokens=None)
>>> print(outputs.scores[1][0, 1])
tensor(5.7738)
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.TemperatureLogitsWarper

< source >

( temperature: float )

Parameters

  • temperature (float) — Strictly positive float value used to modulate the logits distribution. A value smaller than 1 decreases randomness (and vice versa), with 0 being equivalent to shifting all probability mass to the most likely token.

LogitsWarper for temperature (exponential scaling of the output probability distribution), which effectively means that it can control the randomness of the predicted tokens. Often used together with TopPLogitsWarper and TopKLogitsWarper.

Make sure that do_sample=True is included in the generate arguments, otherwise the temperature value will have no effect.

Example:

>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
>>> set_seed(0)  # for reproducibility
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> model.config.pad_token_id = model.config.eos_token_id
>>> inputs = tokenizer(["Hugging Face Company is"], return_tensors="pt")
>>> # With temperature=1.0, the default, we consistently get random outputs due to random sampling.
>>> generate_kwargs = {"max_new_tokens": 10, "do_sample": True, "temperature": 1.0, "num_return_sequences": 2}
>>> outputs = model.generate(**inputs, **generate_kwargs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Hugging Face Company is a joint venture between GEO Group, one of',
'Hugging Face Company is not an exact science – but what we believe does']
>>> # However, with temperature close to 0, it approximates greedy decoding strategies (invariant)
>>> generate_kwargs["temperature"] = 0.0001
>>> outputs = model.generate(**inputs, **generate_kwargs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Hugging Face Company is a company that has been around for over 20 years',
'Hugging Face Company is a company that has been around for over 20 years']
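
The warper rescales the logits by the temperature before sampling, so its effect is easy to inspect on toy values (assumed for illustration only):

>>> import torch
>>> from transformers import TemperatureLogitsWarper
>>> scores = torch.tensor([[2.0, 1.0, 0.0, -1.0]])  # dummy logits for a 4-token vocabulary
>>> input_ids = torch.tensor([[0]])  # not used by this warper, but required by the interface
>>> sharp = torch.softmax(TemperatureLogitsWarper(temperature=0.5)(input_ids, scores), dim=-1)
>>> flat = torch.softmax(TemperatureLogitsWarper(temperature=1.5)(input_ids, scores), dim=-1)
>>> # `sharp` concentrates probability mass on the highest-scoring token, while `flat` spreads it out.
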
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.TopKLogitsWarper

< source >

( top_k: int filter_value: float = -inf min_tokens_to_keep: int = 1 )

Parameters

  • top_k (int) — The number of highest-probability vocabulary tokens to keep.
  • filter_value (float, optional, defaults to -inf) — All filtered values will be set to this float value.
  • min_tokens_to_keep (int, optional, defaults to 1) — Minimum number of tokens that cannot be filtered.

LogitsWarper that performs top-k, i.e. restricting to the k highest-probability elements. Often used together with TemperatureLogitsWarper and TopPLogitsWarper.

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
>>> set_seed(0)
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer("A sequence: A, B, C, D", return_tensors="pt")
>>> # With sampling, the output is unexpected -- sometimes too unexpected.
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: A, B, C, D, G, H, I. A, M
>>> # With `top_k` sampling, the output gets restricted to the k most likely tokens.
>>> # Pro tip: In practice, LLMs use `top_k` in the 5-50 range.
>>> outputs = model.generate(**inputs, do_sample=True, top_k=2)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: A, B, C, D, E, F, G, H, I
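
Applied directly to toy logits (assumed values), the warper keeps the k highest-scoring tokens and sets every other score to filter_value:

>>> import torch
>>> from transformers import TopKLogitsWarper
>>> scores = torch.tensor([[3.0, 1.0, 2.0, 0.5]])  # dummy logits for a 4-token vocabulary
>>> warped = TopKLogitsWarper(top_k=2)(torch.tensor([[0]]), scores)
>>> # Only tokens 0 and 2 (the two highest scores) keep their values; tokens 1 and 3 are set to -inf.
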
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.TopPLogitsWarper

< source >

( top_p: float filter_value: float = -inf min_tokens_to_keep: int = 1 )

Parameters

  • top_p (float) — If set to < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation.
  • filter_value (float, optional, defaults to -inf) — All filtered values will be set to this float value.
  • min_tokens_to_keep (int, optional, defaults to 1) — Minimum number of tokens that cannot be filtered.

LogitsWarper that performs top-p (nucleus) filtering, i.e. restricting to the smallest set of top tokens whose probabilities sum to at least top_p. Often used together with TemperatureLogitsWarper and TopKLogitsWarper.

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
>>> set_seed(0)
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer("A sequence: 1, 2", return_tensors="pt")
>>> # With sampling, the output is unexpected -- sometimes too unexpected.
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: 1, 2, 0, 2, 2. 2, 2, 2, 2
>>> # With `top_p` sampling, the output gets restricted to high-probability tokens.
>>> # Pro tip: In practice, LLMs use `top_p` in the 0.9-0.95 range.
>>> outputs = model.generate(**inputs, do_sample=True, top_p=0.1)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: 1, 2, 3, 4, 5, 6, 7, 8, 9
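
The nucleus can likewise be inspected on a toy distribution (assumed values): with top_p=0.8, only the smallest set of most probable tokens whose probabilities add up to at least 0.8 is kept.

>>> import torch
>>> from transformers import TopPLogitsWarper
>>> scores = torch.log(torch.tensor([[0.4, 0.3, 0.2, 0.1]]))  # logits chosen so that softmax recovers these probabilities
>>> warped = TopPLogitsWarper(top_p=0.8)(torch.tensor([[0]]), scores)
>>> # Tokens 0, 1 and 2 (together 0.9 >= 0.8) are kept; token 3 is set to -inf.
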
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.

class transformers.TypicalLogitsWarper

< source >

( mass: float = 0.9 filter_value: float = -inf min_tokens_to_keep: int = 1 )

Parameters

  • mass (float, optional, defaults to 0.9) — Value of typical_p between 0 and 1 inclusive, defaults to 0.9.
  • filter_value (float, optional, defaults to -inf) — All filtered values will be set to this float value.
  • min_tokens_to_keep (int, optional, defaults to 1) — Minimum number of tokens that cannot be filtered.

LogitsWarper that performs typical decoding. Inspired by how humans use language, it prioritizes tokens whose log probability is close to the entropy of the token probability distribution. This means that the most likely tokens may be discarded in the process.

See Typical Decoding for Natural Language Generation for more information.

Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed
>>> model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
>>> inputs = tokenizer("1, 2, 3", return_tensors="pt")
>>> # We can see that greedy decoding produces a sequence of numbers
>>> outputs = model.generate(**inputs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
>>> # For this particular seed, we can see that sampling produces nearly the same low-information (= low entropy)
>>> # sequence
>>> set_seed(18)
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
1, 2, 3, 4, 5, 6, 7, 8, 9 and 10
>>> # With `typical_p` set, the most obvious sequence is no longer produced, which may be good for your problem
>>> set_seed(18)
>>> outputs = model.generate(
...     **inputs, do_sample=True, typical_p=0.1, return_dict_in_generate=True, output_scores=True
... )
>>> print(tokenizer.batch_decode(outputs.sequences, skip_special_tokens=True)[0])
1, 2, 3 and 5
>>> # We can see that the token corresponding to "4" (token 934) in the second position, the most likely token
>>> # as seen with greedy decoding, was entirely blocked out
>>> print(outputs.scores[1][0, 934])
tensor(-inf)
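
The warper can also be applied directly to a toy distribution (values assumed for illustration): tokens whose information content is far from the distribution's entropy are filtered to -inf, and as noted above this can include the single most likely token.

>>> import torch
>>> from transformers import TypicalLogitsWarper
>>> scores = torch.tensor([[4.0, 2.0, 1.0, 0.5, 0.1]])  # dummy logits for a 5-token vocabulary
>>> warped = TypicalLogitsWarper(mass=0.5)(torch.tensor([[0]]), scores)
>>> # Positions set to -inf fall outside the "typical" set for this distribution; the rest keep their scores.
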
__call__

< source >

( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)

Parameters

  • input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
  • scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.


Transformers 4.37 Chinese Documentation (99)(7): https://developer.aliyun.com/article/1564047
