class transformers.NoBadWordsLogitsProcessor
( bad_words_ids: List eos_token_id: Union )
Parameters
bad_words_ids (List[List[int]]) — List of lists of token ids that are not allowed to be generated.
eos_token_id (Union[int, List[int]]) — The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens.
LogitsProcessor that enforces that specified sequences will never be selected.
In order to get the token ids of the words that should not appear in the generated text, make sure to set add_prefix_space=True when initializing the tokenizer, and use tokenizer(bad_words, add_special_tokens=False).input_ids. The add_prefix_space argument is only supported for some slow tokenizers, as fast tokenizers' prefixing behaviour comes from pre tokenizers. Read more here (https://huggingface.co/docs/tokenizers/api/pre-tokenizers).
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> inputs = tokenizer(["In a word, the cake is a"], return_tensors="pt")

>>> output_ids = model.generate(inputs["input_ids"], max_new_tokens=5, pad_token_id=tokenizer.eos_token_id)
>>> print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
In a word, the cake is a bit of a mess.

>>> # Now let's take the bad words out. Please note that the tokenizer is initialized differently
>>> tokenizer_with_prefix_space = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)


>>> def get_tokens_as_list(word_list):
...     "Converts a sequence of words into a list of tokens"
...     tokens_list = []
...     for word in word_list:
...         tokenized_word = tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0]
...         tokens_list.append(tokenized_word)
...     return tokens_list


>>> bad_words_ids = get_tokens_as_list(word_list=["mess"])
>>> output_ids = model.generate(
...     inputs["input_ids"], max_new_tokens=5, bad_words_ids=bad_words_ids, pad_token_id=tokenizer.eos_token_id
... )
>>> print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
In a word, the cake is a bit of a surprise.
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.NoRepeatNGramLogitsProcessor
( ngram_size: int )
Parameters
ngram_size (int) — All ngrams of size ngram_size can only occur once.
N-grams are groups of "n" consecutive words, characters, or tokens taken from a sequence of text. Given the sentence "She runs fast", the bi-grams (n=2) would be ("she", "runs") and ("runs", "fast"). In text generation, avoiding repetitions of word sequences provides a more diverse output. This LogitsProcessor enforces no repetition of n-grams by setting the scores of banned tokens to negative infinity, which eliminates those tokens from consideration when further processing the scores. Note that, for decoder-only models like most LLMs, the prompt is also considered to obtain the n-grams. Fairseq.
Use n-gram penalties with care. For instance, penalizing 2-grams (bigrams) in an article about the city of New York might lead to undesirable outcomes where the city's name appears only once in the entire text. Reference
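To make the banning rule concrete, here is a minimal sketch of the underlying idea (an illustration written for this note, not the library's actual implementation): collect every n-gram already present in a sequence, then forbid any token that would turn the current (ngram_size - 1)-token suffix into a repeated n-gram.

import torch

def ban_repeated_ngram_tokens(input_ids: torch.LongTensor, scores: torch.FloatTensor, ngram_size: int) -> torch.FloatTensor:
    # For each sequence in the batch, record which token followed every
    # (ngram_size - 1)-token prefix seen so far.
    for batch_idx in range(input_ids.shape[0]):
        tokens = input_ids[batch_idx].tolist()
        seen = {}  # prefix tuple -> set of tokens that completed it
        for i in range(len(tokens) - ngram_size + 1):
            prefix = tuple(tokens[i : i + ngram_size - 1])
            seen.setdefault(prefix, set()).add(tokens[i + ngram_size - 1])
        # Any token that would complete the current suffix into an already-seen
        # n-gram gets a score of -inf and can no longer be selected.
        current_prefix = tuple(tokens[-(ngram_size - 1):]) if ngram_size > 1 else ()
        for banned_token in seen.get(current_prefix, ()):
            scores[batch_idx, banned_token] = float("-inf")
    return scores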
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer(["Today I"], return_tensors="pt")

>>> output = model.generate(**inputs)
>>> print(tokenizer.decode(output[0], skip_special_tokens=True))
Today I’m not sure if I’m going to be able to do it.

>>> # Now let's add ngram size using `no_repeat_ngram_size`. This stops the repetitions ("I’m") in the output.
>>> output = model.generate(**inputs, no_repeat_ngram_size=2)
>>> print(tokenizer.decode(output[0], skip_special_tokens=True))
Today I’m not sure if I can get a better understanding of the nature of this issue
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.PrefixConstrainedLogitsProcessor
( prefix_allowed_tokens_fn: Callable num_beams: int )
Parameters
prefix_allowed_tokens_fn (Callable[[int, torch.Tensor], List[int]]) — This function constrains the beam search to allowed tokens only at each step. It takes 2 arguments, inputs_ids and the batch ID batch_id, and has to return a list with the allowed tokens for the next generation step, conditioned on the previously generated tokens inputs_ids and the batch ID batch_id.
LogitsProcessor that enforces constrained generation and is useful for prefix-conditioned constrained generation. See Autoregressive Entity Retrieval for more information.
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")

>>> inputs = tokenizer("Alice and Bob", return_tensors="pt")

>>> # By default, it continues generating according to the model's logits
>>> outputs = model.generate(**inputs, max_new_tokens=5)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
Alice and Bob are friends

>>> # We can constrain it with `prefix_allowed_tokens_fn` to force a certain behavior based on a prefix.
>>> # For instance, we can force an entire entity to be generated when its beginning is detected.
>>> entity = tokenizer(" Bob Marley", return_tensors="pt").input_ids[0]  # 3 tokens
>>> def prefix_allowed_tokens_fn(batch_id, input_ids):
...     '''
...     Attempts to generate 'Bob Marley' when 'Bob' is detected.
...     In this case, `batch_id` is not used, but you can set rules for each batch member.
...     '''
...     if input_ids[-1] == entity[0]:
...         return entity[1]
...     elif input_ids[-2] == entity[0] and input_ids[-1] == entity[1]:
...         return entity[2]
...     return list(range(tokenizer.vocab_size))  # If no match, allow all tokens

>>> outputs = model.generate(**inputs, max_new_tokens=5, prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
Alice and Bob Marley
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.RepetitionPenaltyLogitsProcessor
( penalty: float )
Parameters
penalty (float) — The parameter for repetition penalty. 1.0 means no penalty. Values above 1.0 penalize previously generated tokens. Values between 0.0 and 1.0 reward previously generated tokens.
LogitsProcessor that prevents the repetition of previous tokens through a penalty. This penalty is applied at most once per token. Note that, for decoder-only models like most LLMs, the considered tokens include the prompt.
In the original paper, the authors suggest a penalty of around 1.2 to achieve a good balance between truthful generation and lack of repetition. To penalize and reduce repetition, use penalty values above 1.0, where a higher value penalizes more strongly. To reward and encourage repetition, use penalty values between 0.0 and 1.0, where a lower value rewards more strongly.
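The arithmetic behind the penalty can be sketched as follows (a simplified illustration of the scheme described above, not necessarily the library's exact code): logits of tokens that already appear in input_ids are divided by penalty when positive and multiplied by it when negative, so any penalty above 1.0 makes them less likely.

import torch

def apply_repetition_penalty(input_ids: torch.LongTensor, scores: torch.FloatTensor, penalty: float) -> torch.FloatTensor:
    # Gather the logits of every token that already occurs in the sequence.
    score = torch.gather(scores, 1, input_ids)
    # Dividing a positive logit (or multiplying a negative one) by penalty > 1.0
    # always lowers it; penalty < 1.0 does the opposite and rewards repetition.
    score = torch.where(score < 0, score * penalty, score / penalty)
    # Write the penalized values back into the full score matrix.
    scores.scatter_(1, input_ids, score)
    return scores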
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> # Initializing the model and tokenizer for it
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer(["I'm not going to"], return_tensors="pt")

>>> # This shows a normal generate without any specific parameters
>>> summary_ids = model.generate(**inputs)
>>> print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
I'm not going to be able to do that. I'm going to be able to do that

>>> # This generates a penalty for repeated tokens
>>> penalized_ids = model.generate(**inputs, repetition_penalty=1.1)
>>> print(tokenizer.batch_decode(penalized_ids, skip_special_tokens=True)[0])
I'm not going to be able to do that. I'll just have to go out and play
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.SequenceBiasLogitsProcessor
( sequence_bias: Dict )
Parameters
sequence_bias (Dict[Tuple[int], float]) — Dictionary that maps a sequence of tokens to its bias term. Positive biases increase the odds of the sequence being selected, while negative biases do the opposite. If a sequence has a length of 1, its bias will always be applied. Otherwise, the bias will only be applied if the sequence in question is about to be completed (in the token selection step after this processor is applied).
LogitsProcessor that applies an additive bias on sequences. The bias is applied to the last token of a sequence when the next generated token can complete it. Consequently, to take the most advantage of biasing sequences with more than one token, consider using beam methods (to gracefully work around partially completed sequences that have a negative bias) and applying the bias to their prefixes (to ensure the bias is applied earlier).
In order to get the token ids of the sequences that you want to bias, make sure to set add_prefix_space=True when initializing the tokenizer, and use tokenizer(bad_words, add_special_tokens=False).input_ids. The add_prefix_space argument is only supported for some slow tokenizers, as fast tokenizers' prefixing behaviour comes from pre tokenizers. Read more here.
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> inputs = tokenizer(["The full name of Donald is Donald"], return_tensors="pt")

>>> summary_ids = model.generate(inputs["input_ids"], max_new_tokens=4)
>>> print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald J. Trump Jr

>>> # Now let's control generation through a bias. Please note that the tokenizer is initialized differently!
>>> tokenizer_with_prefix_space = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)


>>> def get_tokens_as_tuple(word):
...     return tuple(tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0])


>>> # If we add a negative bias without beam search, it may become "stuck" in a prefix without good continuations
>>> sequence_bias = {get_tokens_as_tuple("Trump"): -10.0}
>>> biased_ids = model.generate(inputs["input_ids"], max_new_tokens=4, sequence_bias=sequence_bias)
>>> print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald J. Donald,

>>> biased_ids = model.generate(inputs["input_ids"], max_new_tokens=4, num_beams=4, sequence_bias=sequence_bias)
>>> print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald Rumsfeld,

>>> # We can also add a positive bias to nudge the model towards specific tokens or continuations
>>> sequence_bias = {get_tokens_as_tuple("Donald Duck"): 10.0}
>>> biased_ids = model.generate(inputs["input_ids"], max_new_tokens=4, num_beams=4, sequence_bias=sequence_bias)
>>> print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
The full name of Donald is Donald Duck.
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.SuppressTokensAtBeginLogitsProcessor
( begin_suppress_tokens begin_index )
SuppressTokensAtBeginLogitsProcessor suppresses a list of tokens as soon as the generate function starts generating, using begin_index tokens. This should ensure that the tokens defined by begin_suppress_tokens are not generated at the beginning. Originally created for Whisper.
Example:
>>> from transformers import AutoProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset

>>> processor = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> inputs = processor(ds[0]["audio"]["array"], return_tensors="pt")

>>> # Whisper has `begin_suppress_tokens` set by default (= `[220, 50256]`). 50256 is the EOS token, so this means
>>> # it can't generate an EOS token in the first iteration, but it can in the others.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
>>> print(outputs.scores[1][0, 50256])  # 1 (and not 0) is the first freely generated token
tensor(-inf)
>>> print(outputs.scores[-1][0, 50256])  # in other places we can see some probability mass for EOS
tensor(29.9010)

>>> # If we disable `begin_suppress_tokens`, we can generate EOS in the first iteration.
>>> outputs = model.generate(
...     **inputs, return_dict_in_generate=True, output_scores=True, begin_suppress_tokens=None
... )
>>> print(outputs.scores[1][0, 50256])
tensor(11.2027)
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.SuppressTokensLogitsProcessor
( suppress_tokens )
This processor can be used to suppress a list of tokens. The processor will set their log probs to -inf so that they are not generated. Originally created for Whisper.
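The mechanism itself is tiny; a hedged sketch of the idea (illustrative only, with a hypothetical helper name) looks like this:

import torch

def suppress(scores: torch.FloatTensor, suppress_token_ids: list) -> torch.FloatTensor:
    # A logit of -inf becomes a probability of exactly zero after softmax,
    # so the suppressed tokens can never be sampled or picked by beam search.
    scores[:, suppress_token_ids] = float("-inf")
    return scores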
Example:
>>> from transformers import AutoProcessor, WhisperForConditionalGeneration
>>> from datasets import load_dataset

>>> processor = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
>>> model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> inputs = processor(ds[0]["audio"]["array"], return_tensors="pt")

>>> # Whisper has a long list of suppressed tokens. For instance, in this case, the token 1 is suppressed by default.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
>>> print(outputs.scores[1][0, 1])  # 1 (and not 0) is the first freely generated token
tensor(-inf)

>>> # If we disable `suppress_tokens`, we can generate it.
>>> outputs = model.generate(**inputs, return_dict_in_generate=True, output_scores=True, suppress_tokens=None)
>>> print(outputs.scores[1][0, 1])
tensor(5.7738)
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.TemperatureLogitsWarper
( temperature: float )
Parameters
temperature (float) — Strictly positive float value used to modulate the logits distribution. A value smaller than 1 decreases randomness (and vice versa), with 0 being equivalent to shifting all probability mass to the most likely token.
LogitsWarper for temperature (exponential scaling of the output probability distribution), which effectively means that it can control the randomness of the predicted tokens. Often used together with TopPLogitsWarper and TopKLogitsWarper.
Make sure that do_sample=True is included in the generate arguments, otherwise the temperature value won't have any effect.
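A small worked example of the scaling itself (toy numbers, independent of any model): dividing the logits by the temperature before the softmax sharpens the distribution when temperature < 1 and flattens it when temperature > 1.

import torch

# Toy logits for a three-token vocabulary.
logits = torch.tensor([[2.0, 1.0, 0.5]])

for temperature in (1.0, 0.5, 2.0):
    probs = torch.softmax(logits / temperature, dim=-1)
    print(f"temperature={temperature}: {probs.tolist()}")

# temperature=1.0 keeps the original distribution (~[0.63, 0.23, 0.14]);
# temperature=0.5 concentrates mass on the top token (~[0.84, 0.11, 0.04]);
# temperature=2.0 spreads it out (~[0.48, 0.29, 0.23]).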
Example:
>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

>>> set_seed(0)  # for reproducibility

>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> model.config.pad_token_id = model.config.eos_token_id
>>> inputs = tokenizer(["Hugging Face Company is"], return_tensors="pt")

>>> # With temperature=1.0, the default, we consistently get random outputs due to random sampling.
>>> generate_kwargs = {"max_new_tokens": 10, "do_sample": True, "temperature": 1.0, "num_return_sequences": 2}
>>> outputs = model.generate(**inputs, **generate_kwargs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Hugging Face Company is a joint venture between GEO Group, one of', 'Hugging Face Company is not an exact science – but what we believe does']

>>> # However, with temperature close to 0, it approximates greedy decoding strategies (invariant)
>>> generate_kwargs["temperature"] = 0.0001
>>> outputs = model.generate(**inputs, **generate_kwargs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Hugging Face Company is a company that has been around for over 20 years', 'Hugging Face Company is a company that has been around for over 20 years']
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.TopKLogitsWarper
( top_k: int filter_value: float = -inf min_tokens_to_keep: int = 1 )
Parameters
top_k (int) — The number of highest probability vocabulary tokens to keep.
filter_value (float, optional, defaults to -inf) — All filtered values will be set to this float value.
min_tokens_to_keep (int, optional, defaults to 1) — Minimum number of tokens that cannot be filtered.
LogitsWarper that performs top-k, i.e. restricting to the k highest probability elements. Often used together with TemperatureLogitsWarper and TopPLogitsWarper.
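The filtering step itself can be sketched as follows (a minimal illustration of the behaviour described above, not a quote of the library source): every logit below the k-th highest one is replaced by filter_value.

import torch

def top_k_filter(scores: torch.FloatTensor, top_k: int, filter_value: float = float("-inf")) -> torch.FloatTensor:
    # Guard against asking for more tokens than the vocabulary contains.
    top_k = min(top_k, scores.size(-1))
    # The k-th largest logit per row is the cutoff; everything below it is filtered
    # out and can no longer be sampled.
    cutoff = torch.topk(scores, top_k).values[..., -1, None]
    return scores.masked_fill(scores < cutoff, filter_value)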
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

>>> set_seed(0)

>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer("A sequence: A, B, C, D", return_tensors="pt")

>>> # With sampling, the output is unexpected -- sometimes too unexpected.
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: A, B, C, D, G, H, I. A, M

>>> # With `top_k` sampling, the output gets restricted to the k most likely tokens.
>>> # Pro tip: In practice, LLMs use `top_k` in the 5-50 range.
>>> outputs = model.generate(**inputs, do_sample=True, top_k=2)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: A, B, C, D, E, F, G, H, I
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.TopPLogitsWarper
( top_p: float filter_value: float = -inf min_tokens_to_keep: int = 1 )
Parameters
top_p (float) — If set to < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation.
filter_value (float, optional, defaults to -inf) — All filtered values will be set to this float value.
min_tokens_to_keep (int, optional, defaults to 1) — Minimum number of tokens that cannot be filtered.
LogitsWarper that performs top-p, i.e. restricting to the top tokens summing to prob_cut_off or less. Often used together with TemperatureLogitsWarper and TopKLogitsWarper.
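A sketch of the nucleus step (illustrative only; min_tokens_to_keep handling is omitted): sort the tokens, accumulate their probabilities, and drop the low-probability tail whose cumulative mass stays at or below 1 - top_p, so that the kept tokens together cover at least top_p of the mass.

import torch

def top_p_filter(scores: torch.FloatTensor, top_p: float, filter_value: float = float("-inf")) -> torch.FloatTensor:
    # Sort logits in ascending order and accumulate the corresponding probabilities.
    sorted_logits, sorted_indices = torch.sort(scores, descending=False)
    cumulative_probs = sorted_logits.softmax(dim=-1).cumsum(dim=-1)
    # The low-probability tail (cumulative mass <= 1 - top_p) is removed; the
    # remaining tokens cover at least top_p of the probability mass.
    sorted_to_remove = cumulative_probs <= (1 - top_p)
    # Map the mask from sorted order back to the original token positions.
    indices_to_remove = sorted_to_remove.scatter(1, sorted_indices, sorted_to_remove)
    return scores.masked_fill(indices_to_remove, filter_value)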
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

>>> set_seed(0)

>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> inputs = tokenizer("A sequence: 1, 2", return_tensors="pt")

>>> # With sampling, the output is unexpected -- sometimes too unexpected.
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: 1, 2, 0, 2, 2. 2, 2, 2, 2

>>> # With `top_p` sampling, the output gets restricted to high-probability tokens.
>>> # Pro tip: In practice, LLMs use `top_p` in the 0.9-0.95 range.
>>> outputs = model.generate(**inputs, do_sample=True, top_p=0.1)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
A sequence: 1, 2, 3, 4, 5, 6, 7, 8, 9
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.
class transformers.TypicalLogitsWarper
( mass: float = 0.9 filter_value: float = -inf min_tokens_to_keep: int = 1 )
Parameters
mass (float, optional, defaults to 0.9) — Value of typical_p between 0 and 1 inclusive, defaults to 0.9.
filter_value (float, optional, defaults to -inf) — All filtered values will be set to this float value.
min_tokens_to_keep (int, optional, defaults to 1) — Minimum number of tokens that cannot be filtered.
LogitsWarper that performs typical decoding. Inspired by how humans use language, it prioritizes tokens whose log probability is close to the entropy of the token probability distribution. This means that the most likely tokens may be discarded in the process.
See Typical Decoding for Natural Language Generation for more information.
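A rough sketch of the selection rule (simplified for illustration; min_tokens_to_keep handling is omitted and this is not the library's exact code): compute the entropy of the distribution, rank tokens by how far their surprisal is from it, and keep the most typical tokens until their cumulative probability reaches mass.

import torch

def typical_filter(scores: torch.FloatTensor, mass: float = 0.9, filter_value: float = float("-inf")) -> torch.FloatTensor:
    log_probs = torch.log_softmax(scores, dim=-1)
    probs = log_probs.exp()
    # Entropy of the predicted distribution, one value per row.
    entropy = -(probs * log_probs).sum(dim=-1, keepdim=True)

    # A token is "typical" when its surprisal (-log p) is close to the entropy.
    deviation = torch.abs(-log_probs - entropy)
    _, sorted_indices = torch.sort(deviation, dim=-1)

    # Keep the most typical tokens until their cumulative probability reaches `mass`.
    sorted_probs = probs.gather(-1, sorted_indices)
    cumulative = sorted_probs.cumsum(dim=-1)
    sorted_to_remove = cumulative >= mass
    sorted_to_remove[..., 1:] = sorted_to_remove[..., :-1].clone()  # keep the token that crosses `mass`
    sorted_to_remove[..., 0] = False  # always keep at least one token

    # Map the mask from sorted order back to the original token positions.
    indices_to_remove = sorted_to_remove.scatter(-1, sorted_indices, sorted_to_remove)
    return scores.masked_fill(indices_to_remove, filter_value)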
Example:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

>>> model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")
>>> tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
>>> inputs = tokenizer("1, 2, 3", return_tensors="pt")

>>> # We can see that greedy decoding produces a sequence of numbers
>>> outputs = model.generate(**inputs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
1, 2, 3, 4, 5, 6, 7, 8, 9, 10,

>>> # For this particular seed, we can see that sampling produces nearly the same low-information (= low entropy)
>>> # sequence
>>> set_seed(18)
>>> outputs = model.generate(**inputs, do_sample=True)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
1, 2, 3, 4, 5, 6, 7, 8, 9 and 10

>>> # With `typical_p` set, the most obvious sequence is no longer produced, which may be good for your problem
>>> set_seed(18)
>>> outputs = model.generate(
...     **inputs, do_sample=True, typical_p=0.1, return_dict_in_generate=True, output_scores=True
... )
>>> print(tokenizer.batch_decode(outputs.sequences, skip_special_tokens=True)[0])
1, 2, 3 and 5

>>> # We can see that the token corresponding to "4" (token 934) in the second position, the most likely token
>>> # as seen with greedy decoding, was entirely blocked out
>>> print(outputs.scores[1][0, 934])
tensor(-inf)
__call__
( input_ids: LongTensor scores: FloatTensor ) → torch.FloatTensor of shape (batch_size, config.vocab_size)
Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. What are input IDs?
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) — Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or log softmax for each vocabulary token when using beam search.
Returns
torch.FloatTensor of shape (batch_size, config.vocab_size)
The processed prediction scores.