class transformers.EtaLogitsWarper
( epsilon: float filter_value: float = -inf min_tokens_to_keep: int = 1 )
Parameters

- epsilon (float) — A float value in the range (0, 1). Hyperparameter used to compute the dynamic cutoff value eta. The suggested values from the paper range from 3e-4 to 4e-3, depending on the size of the model.
- filter_value (float, optional, defaults to -inf) — All values below the dynamic cutoff value eta are set to this float value. This parameter is useful when logits need to be modified to exclude tokens with extremely low probabilities that should be excluded from generation entirely.
- min_tokens_to_keep (int, optional, defaults to 1) — Specifies the minimum number of tokens that must be kept for generation, regardless of their probabilities. For example, if min_tokens_to_keep is set to 1, at least one token will always be kept for generation, even if all tokens have probabilities below the cutoff eta.
LogitsWarper that performs eta sampling, a technique that filters out tokens with probabilities below a dynamic cutoff value eta, which is computed from a combination of the hyperparameter epsilon and the entropy of the token probabilities, i.e. eta := min(epsilon, sqrt(epsilon) * e^(-entropy(probabilities))). If no token satisfies this constraint, the largest min_tokens_to_keep tokens are kept. It addresses the issue of poor quality in long samples of text generated by neural language models, leading to more coherent and fluent text. See Truncation Sampling as Language Model Desmoothing for more information. Note: do_sample must be set to True for this LogitsWarper to work.
Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

>>> set_seed(0)
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer("A sequence: 1, 2", return_tensors="pt")

>>> # With sampling, the output is unexpected -- sometimes too unexpected.
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: 1, 2, 0, 2, 2. 2, 2, 2, 2

>>> # With eta sampling, the output gets restricted to high-probability tokens. You can see it as a dynamic form of
>>> # epsilon sampling that adapts its cutoff probability based on the entropy (high entropy = lower cutoff).
>>> # Pro tip: The paper recommends using `eta_cutoff` values between 3e-4 to 4e-3
>>> outputs = model.generate(**inputs, do_sample=True, eta_cutoff=0.1)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: 1, 2, 3, 4, 5, 6, 7, 8, 9
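To make the cutoff formula concrete, here is a minimal standalone sketch of the eta computation on a toy distribution. It only illustrates the formula above; the probabilities and the epsilon value are invented for the example and this is not the library's internal implementation.

>>> import math
>>> import torch

>>> # Toy next-token probability distribution and an illustrative `epsilon`
>>> probs = torch.tensor([0.70, 0.20, 0.06, 0.03, 0.01])
>>> epsilon = 0.1

>>> # eta := min(epsilon, sqrt(epsilon) * e^(-entropy(probs)))
>>> entropy = -(probs * probs.log()).sum().item()
>>> eta = min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))

>>> # Tokens whose probability falls below `eta` would have their logits set to `filter_value`
>>> print((probs >= eta).tolist())
[True, True, False, False, False]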
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.ExponentialDecayLengthPenalty
( exponential_decay_length_penalty: Tuple eos_token_id: Union input_ids_seq_length: int )
Parameters

- exponential_decay_length_penalty (tuple(int, float)) — This tuple should consist of (start_index, decay_factor), where start_index indicates where the penalty starts and decay_factor represents the factor of exponential decay.
- eos_token_id (Union[int, List[int]]) — The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.
- input_ids_seq_length (int) — The length of the input sequence.
LogitsProcessor that exponentially increases the score of eos_token_id once start_index has been reached. This allows generating shorter sequences without a hard cutoff, letting the eos_token be predicted in a meaningful position.
Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")

>>> text = "Just wanted to let you know, I"
>>> inputs = tokenizer(text, return_tensors="pt")

>>> # Let's consider that we want short sentences, so we limit `max_length=30`. However, we observe that the answer
>>> # tends to end abruptly.
>>> set_seed(1)
>>> outputs = model.generate(**inputs, do_sample=True, temperature=0.9, max_length=30, pad_token_id=50256)
>>> print(tokenizer.batch_decode(outputs)[0])
Just wanted to let you know, I received a link to an ebook, the book How To Start A Social Network which was published in 2010. Although

>>> # To promote the appearance of the EOS token at the right time, we add the `exponential_decay_length_penalty =
>>> # (start_index, decay_factor)`. Instead of cutting at max_tokens, the output comes to an end before and usually
>>> # with more meaning. What happens is that starting from `start_index` the EOS token score will be increased
>>> # by `decay_factor` exponentially. However, if you set a high decay factor, you may also end up with abruptly
>>> # ending sequences.
>>> set_seed(1)
>>> outputs = model.generate(
...     **inputs,
...     do_sample=True,
...     temperature=0.9,
...     max_length=30,
...     pad_token_id=50256,
...     exponential_decay_length_penalty=(15, 1.6),
... )
>>> print(tokenizer.batch_decode(outputs)[0])
Just wanted to let you know, I received a link to an ebook, the book How To Start A Social Network which<|endoftext|>

>>> # With a small decay factor, you will have a higher chance of getting a meaningful sequence.
>>> set_seed(1)
>>> outputs = model.generate(
...     **inputs,
...     do_sample=True,
...     temperature=0.9,
...     max_length=30,
...     pad_token_id=50256,
...     exponential_decay_length_penalty=(15, 1.01),
... )
>>> print(tokenizer.batch_decode(outputs)[0])
Just wanted to let you know, I received a link to an ebook, the book How To Start A Social Network which was published in 2010.<|endoftext|>
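As a rough illustration of the mechanism (a sketch only, reusing the decay values from the example above; the exact way this factor is combined with the EOS score is an implementation detail not reproduced here), the exponential factor applied after start_index grows as follows:

>>> # Illustrative only: how the exponential factor grows once `start_index` is passed
>>> start_index, decay_factor = 15, 1.6
>>> for cur_len in [16, 18, 20]:
...     print(cur_len, round(decay_factor ** (cur_len - start_index), 2))
16 1.6
18 4.1
20 10.49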
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.ForcedBOSTokenLogitsProcessor
( bos_token_id: int )
Parameters

- bos_token_id (int) — The id of the token to force as the first generated token.
LogitsProcessor that enforces the specified token as the first generated token. Used with encoder-decoder models.
Example:

>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

>>> inputs = tokenizer("Translate from English to German: I love cats.", return_tensors="pt")

>>> # By default, it continues generating according to the model's logits
>>> outputs = model.generate(**inputs, max_new_tokens=10)
>>> print(tokenizer.batch_decode(outputs)[0])
<pad> Ich liebe Kitty.</s>

>>> # We can use `forced_bos_token_id` to force the start of generation with an encoder-decoder model
>>> # (including forcing it to end straight away with an EOS token)
>>> outputs = model.generate(**inputs, max_new_tokens=10, forced_bos_token_id=tokenizer.eos_token_id)
>>> print(tokenizer.batch_decode(outputs)[0])
<pad></s>
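The underlying idea can be sketched as follows (a simplified illustration with toy logits, not the library's exact code): at the first generation step, every logit except the forced token's is pushed to -inf, so that token is guaranteed to be selected.

>>> import torch

>>> bos_token_id = 0  # hypothetical forced token id
>>> scores = torch.tensor([[1.2, -0.3, 0.7, 2.1]])  # toy logits for a 4-token vocabulary

>>> # At the first generation step, mask everything except `bos_token_id`
>>> forced = torch.full_like(scores, float("-inf"))
>>> forced[:, bos_token_id] = 0.0
>>> print(forced.tolist())
[[0.0, -inf, -inf, -inf]]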
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.ForcedEOSTokenLogitsProcessor
( max_length: int eos_token_id: Union )
Parameters

- max_length (int) — The maximum length of the sequence to be generated.
- eos_token_id (Union[int, List[int]]) — The id of the token to force as the last generated token when max_length is reached. Optionally, use a list to set multiple end-of-sequence tokens.
LogitsProcessor that enforces the specified token as the last generated token when max_length is reached.
Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

>>> inputs = tokenizer("A sequence: 1, 2, 3", return_tensors="pt")

>>> # By default, it continues generating according to the model's logits
>>> outputs = model.generate(**inputs, max_new_tokens=10)
>>> print(tokenizer.batch_decode(outputs)[0])
A sequence: 1, 2, 3, 4, 5, 6, 7, 8

>>> # `forced_eos_token_id` ensures the generation ends with a EOS token
>>> outputs = model.generate(**inputs, max_new_tokens=10, forced_eos_token_id=tokenizer.eos_token_id)
>>> print(tokenizer.batch_decode(outputs)[0])
A sequence: 1, 2, 3, 4, 5, 6, 7,<|endoftext|>
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.ForceTokensLogitsProcessor
( force_token_map: List )
This processor takes a list of pairs of integers that indicates a mapping from generation indices to token indices that will be forced before generation. At each mapped index, the processor forces the corresponding token to be sampled by setting its log probability to 0 and masking all other tokens to -inf, as shown in the example below. Originally created for Whisper.
Example:

>>> from transformers import AutoProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset

>>> processor = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> inputs = processor(ds[0]["audio"]["array"], return_tensors="pt")

>>> # This Whisper model forces the generation to start with `50362` at the first position by default, i.e.
>>> # `"forced_decoder_ids": [[1, 50362]]`. This means all other tokens are masked out.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
>>> print(
...     all(outputs.scores[0][0, i] == float("-inf") for i in range(processor.tokenizer.vocab_size) if i != 50362)
... )
True

>>> print(outputs.scores[0][0, 50362])
tensor(0.)

>>> # If we disable `forced_decoder_ids`, we stop seeing that effect
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True, forced_decoder_ids=None)
>>> print(
...     all(outputs.scores[0][0, i] == float("-inf") for i in range(processor.tokenizer.vocab_size) if i != 50362)
... )
False

>>> print(outputs.scores[0][0, 50362])
tensor(19.3140)
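The map itself is just a list of [generation_index, token_id] pairs. A minimal sketch of how such a map can be interpreted at a single step is shown below (illustrative only, not the library's exact code; the vocabulary size is a toy value and the forced id is taken from the Whisper example above):

>>> import torch

>>> force_token_map = dict([[1, 50362]])  # force token 50362 at generation index 1
>>> vocab_size = 51864  # toy value for illustration
>>> scores = torch.randn(1, vocab_size)

>>> generation_idx = 1
>>> if generation_idx in force_token_map:
...     forced_id = force_token_map[generation_idx]
...     scores = torch.full_like(scores, float("-inf"))
...     scores[:, forced_id] = 0.0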
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.HammingDiversityLogitsProcessor
( diversity_penalty: float num_beams: int num_beam_groups: int )
Parameters

- diversity_penalty (float) — This value is subtracted from a beam's score if it generates a token that was already selected by any beam from another group at the same time step. A higher diversity_penalty enforces greater diversity among the beams. Adjusting this value can help strike a balance between diversity and natural likelihood.
- num_beams (int) — Number of beams used for beam search. 1 means no beam search.
- num_beam_groups (int) — Number of groups to divide num_beams into in order to ensure diversity among different groups of beams. See this paper for more details.
LogitsProcessor that enforces diverse beam search.

Note that this logits processor is only effective for PreTrainedModel.group_beam_search(). See Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models for more details.

Traditional beam search often generates very similar sequences across different beams. HammingDiversityLogitsProcessor addresses this by penalizing beams that generate tokens already chosen by other beams in the same time step.
Example:

>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> import torch

>>> # Initialize the model and tokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("t5-base")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

>>> # A long text about the solar system
>>> text = (
...     "The Solar System is a gravitationally bound system comprising the Sun and the objects that orbit it, "
...     "either directly or indirectly. Of the objects that orbit the Sun directly, the largest are the eight "
...     "planets, with the remainder being smaller objects, such as the five dwarf planets and small Solar System "
...     "bodies. The Solar System formed 4.6 billion years ago from the gravitational collapse of a giant "
...     "interstellar molecular cloud."
... )
>>> inputs = tokenizer("summarize: " + text, return_tensors="pt")

>>> # Generate diverse summary
>>> outputs_diverse = model.generate(
...     **inputs,
...     num_beam_groups=2,
...     diversity_penalty=10.0,
...     max_length=100,
...     num_beams=4,
...     num_return_sequences=2,
... )
>>> summaries_diverse = tokenizer.batch_decode(outputs_diverse, skip_special_tokens=True)

>>> # Generate non-diverse summary
>>> outputs_non_diverse = model.generate(
...     **inputs,
...     max_length=100,
...     num_beams=4,
...     num_return_sequences=2,
... )
>>> summary_non_diverse = tokenizer.batch_decode(outputs_non_diverse, skip_special_tokens=True)

>>> # With `diversity_penalty`, the resulting beams are much more diverse
>>> print(summary_non_diverse)
['the solar system formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets.',
'the Solar System formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets.']

>>> print(summaries_diverse)
['the solar system formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets.',
'the solar system formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets. the rest of the objects are smaller objects, such as the five dwarf planets and small solar system bodies.']
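The penalty itself is easy to sketch (an illustration with toy values, mirroring the description above): for each vocabulary token, count how often beams from the other groups already picked it at this step, and subtract diversity_penalty times that count from the current group's scores.

>>> import torch

>>> diversity_penalty = 10.0
>>> vocab_size = 6  # toy vocabulary
>>> scores = torch.zeros(1, vocab_size)  # toy scores for one beam in the current group

>>> # Tokens already chosen at this step by beams in previously processed groups
>>> current_tokens = torch.tensor([3, 3, 5])
>>> frequency = torch.bincount(current_tokens, minlength=vocab_size)
>>> print((scores - diversity_penalty * frequency).tolist())
[[0.0, 0.0, 0.0, -20.0, 0.0, -10.0]]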
__call__
( input_ids: LongTensor scores: FloatTensor current_tokens: LongTensor beam_group_idx: int ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
- current_tokens (torch.LongTensor of shape (batch_size)) — Indices of input sequence tokens in the vocabulary, corresponding to the tokens selected by the other beam groups in the current generation step.
- beam_group_idx (int) — The index of the beam group currently being processed.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.InfNanRemoveLogitsProcessor
( )
LogitsProcessor that removes all nan and inf values to avoid the generation method failing. Note that this logits processor should only be used when necessary, since it can slow down the generation method.

This logits processor has no generate example, as there shouldn't be a correct combination of flags that warrants its use.
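Since there is no generate flag tied to it, here is a hedged sketch of applying the processor directly to a tensor of scores (the values are toy ones, and calling a processor by hand like this is only for illustration):

>>> import torch
>>> from transformers import InfNanRemoveLogitsProcessor

>>> processor = InfNanRemoveLogitsProcessor()
>>> scores = torch.tensor([[1.0, float("inf"), float("nan"), -2.0]])
>>> input_ids = torch.tensor([[0]])  # dummy input ids; not used by this processor

>>> processed = processor(input_ids, scores)
>>> print(torch.isfinite(processed).all().item())
True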
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.LogitNormalization
( )
LogitsWarper and LogitsProcessor for normalizing the scores using log-softmax. It is important to normalize the scores during beam search, after applying the logits processors or warpers, since the search algorithm used in this library doesn't do it (it only does it before, but the scores may need re-normalization) yet it still assumes the scores are normalized when comparing hypotheses.
Example:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> import torch

>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

>>> inputs = tokenizer("A sequence: 1, 2, 3", return_tensors="pt")

>>> # By default, the scores are not normalized -- the sum of their exponentials is NOT a normalized probability
>>> # distribution, summing to 1
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
>>> print(torch.sum(torch.exp(outputs.scores[-1])))
tensor(816.3250)

>>> # Normalizing them may have a positive impact on beam methods, or when using the scores on your application
>>> outputs = model.generate(**inputs, renormalize_logits=True, return_dict_in_generate=True, output_scores=True)
>>> print(torch.sum(torch.exp(outputs.scores[-1])))
tensor(1.0000)
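Equivalently, applying the warper by hand to a row of toy logits shows the renormalization (illustrative values only):

>>> import torch
>>> from transformers import LogitNormalization

>>> warper = LogitNormalization()
>>> scores = torch.tensor([[1.0, 2.0, 3.0]])
>>> input_ids = torch.tensor([[0]])  # not used by this warper

>>> normalized = warper(input_ids, scores)
>>> print(round(torch.sum(torch.exp(normalized)).item(), 4))
1.0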
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.LogitsProcessor
( )
Abstract base class for all logit processors that can be applied during generation.
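Custom behavior is added by subclassing it and implementing __call__ with the signature below. A minimal, hypothetical example that blocks a single token id (the class name and logic are made up for illustration):

>>> import torch
>>> from transformers import LogitsProcessor

>>> class BlockTokenLogitsProcessor(LogitsProcessor):  # hypothetical custom processor
...     def __init__(self, token_id: int):
...         self.token_id = token_id
...     def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
...         # Set the blocked token's score to -inf so it can never be selected
...         scores[:, self.token_id] = float("-inf")
...         return scores

A processor like this can then be passed to generate through the logits_processor argument, wrapped in a LogitsProcessorList.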
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.LogitsProcessorList
( iterable = () )
This class can be used to create a list of LogitsProcessor or LogitsWarper instances to subsequently process a scores input tensor. This class inherits from list and adds a specific __call__ method to apply each LogitsProcessor or LogitsWarper to the inputs.
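A short sketch of building such a list and applying it to toy scores (the token ids and tensor sizes are illustrative):

>>> import torch
>>> from transformers import LogitsProcessorList, MinLengthLogitsProcessor, InfNanRemoveLogitsProcessor

>>> processors = LogitsProcessorList(
...     [
...         MinLengthLogitsProcessor(min_length=5, eos_token_id=2),  # toy eos token id
...         InfNanRemoveLogitsProcessor(),
...     ]
... )

>>> input_ids = torch.tensor([[0, 1]])  # toy prompt of length 2, shorter than `min_length`
>>> scores = torch.zeros(1, 10)  # toy vocabulary of 10 tokens
>>> processed = processors(input_ids, scores)
>>> print(processed[0, 2])
tensor(-inf)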
__call__
( input_ids: LongTensor scores: FloatTensor **kwargs ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
- kwargs (Dict[str, Any], optional) — Additional kwargs that are specific to a logits processor.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.LogitsWarper
( )
Abstract base class for all logit warpers that can be applied during generation with multinomial sampling.
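As with LogitsProcessor, custom warpers are written by subclassing and implementing __call__. A minimal, hypothetical warper that rescales the logits (the class name and scaling logic are made up for illustration):

>>> import torch
>>> from transformers import LogitsWarper

>>> class ScaleLogitsWarper(LogitsWarper):  # hypothetical custom warper
...     def __init__(self, scale: float):
...         self.scale = scale
...     def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
...         # Divide the logits by a constant, similar in spirit to temperature scaling
...         return scores / self.scale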
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.MinLengthLogitsProcessor
( min_length: int eos_token_id: Union )
Parameters

- min_length (int) — The minimum length below which the score of eos_token_id is set to -float("Inf").
- eos_token_id (Union[int, List[int]]) — The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.
LogitsProcessor that enforces a minimum length by setting the EOS probability to 0. Note that, for decoder-only models like most LLMs, the length includes the prompt.
Example:

>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
>>> model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

>>> inputs = tokenizer("A number:", return_tensors="pt")
>>> gen_out = model.generate(**inputs)
>>> print(tokenizer.batch_decode(gen_out, skip_special_tokens=True)[0])
A number: one

>>> # setting `min_length` to a value smaller than the uncontrolled output length has no impact
>>> gen_out = model.generate(**inputs, min_length=3)
>>> print(tokenizer.batch_decode(gen_out, skip_special_tokens=True)[0])
A number: one

>>> # setting a larger `min_length` will force the model to generate beyond its natural ending point, which is not
>>> # necessarily incorrect
>>> gen_out = model.generate(**inputs, min_length=10)
>>> print(tokenizer.batch_decode(gen_out, skip_special_tokens=True)[0])
A number: one thousand, nine hundred and ninety-four
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.
class transformers.MinNewTokensLengthLogitsProcessor
( prompt_length_to_skip: int min_new_tokens: int eos_token_id: Union )
Parameters

- prompt_length_to_skip (int) — The length of the input tokens. Not a valid argument when used with generate, as it will automatically assign the input length.
- min_new_tokens (int) — The minimum new-token length below which the score of eos_token_id is set to -float("Inf").
- eos_token_id (Union[int, List[int]]) — The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.
LogitsProcessor that enforces a minimum length of new tokens by setting the EOS (end-of-sequence) token probability to 0. Contrarily to MinLengthLogitsProcessor, this processor ignores the prompt.
Example:

>>> from transformers import AutoModelForCausalLM, AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
>>> model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

>>> inputs = tokenizer(["A number:"], return_tensors="pt")
>>> gen_out = model.generate(**inputs)
>>> print(tokenizer.batch_decode(gen_out, skip_special_tokens=True)[0])
A number: one

>>> # setting `min_new_tokens` will force the model to generate beyond its natural ending point, which is not
>>> # necessarily incorrect
>>> gen_out = model.generate(**inputs, min_new_tokens=2)
>>> print(tokenizer.batch_decode(gen_out, skip_special_tokens=True)[0])
A number: one thousand
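For completeness, a sketch of calling the processor directly on toy scores (the ids and sizes are illustrative; in practice generate builds and applies it for you):

>>> import torch
>>> from transformers import MinNewTokensLengthLogitsProcessor

>>> # Prompt of length 4, and at least 3 *new* tokens are required before EOS is allowed
>>> processor = MinNewTokensLengthLogitsProcessor(prompt_length_to_skip=4, min_new_tokens=3, eos_token_id=2)
>>> input_ids = torch.tensor([[0, 1, 3, 1, 0]])  # 4 prompt tokens + 1 new token so far
>>> scores = torch.zeros(1, 4)  # toy vocabulary of 4 tokens
>>> print(processor(input_ids, scores)[0].tolist())
[0.0, 0.0, -inf, 0.0]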
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters

- input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
- scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.

Returns

torch.FloatTensor of shape (batch_size, config.vocab_size)

The processed prediction scores.