CTCLoss
CTC loss 理解_代码款款的博客-CSDN博客_ctc loss
计算连续(未分段)时间序列和目标序列之间的损失。CTCLoss 对输入与目标可能对齐的概率求和,生成一个相对于每个输入节点可微分的损失值。假定输入与目标的对齐方式为"多对一"
torch.nn.CTCLoss(blank=0, reduction='mean', zero_infinity=False)
参数:
Log_probs: Tensor of size (T, N, C)(T,N,C) or (T, C)(T,C), where T = \text{input length}T=input length, N = \text{batch size}N=batch size, and C = \text{number of classes (including blank)}C=number of classes (including blank). The logarithmized probabilities of the outputs (e.g. obtained with torch.nn.functional.log_softmax()).
Targets: Tensor of size (N, S)(N,S) or (\operatorname{sum}(\text{target_lengths}))(sum(target_lengths)), where N = \text{batch size}N=batch size and S = \text{max target length, if shape is } (N, S)S=max target length, if shape is (N,S). It represent the target sequences. Each element in the target sequence is a class index. And the target index cannot be blank (default=0). In the (N, S)(N,S) form, targets are padded to the length of the longest sequence, and stacked. In the (\operatorname{sum}(\text{target_lengths}))(sum(target_lengths)) form, the targets are assumed to be un-padded and concatenated within 1 dimension.
Input_lengths: Tuple or tensor of size (N)(N) or ()(), where N = \text{batch size}N=batch size. It represent the lengths of the inputs (must each be \leq T≤T). And the lengths are specified for each sequence to achieve masking under the assumption that sequences are padded to equal lengths.
Target_lengths: Tuple or tensor of size (N)(N) or ()(), where N = \text{batch size}N=batch size. It represent lengths of the targets. Lengths are specified for each sequence to achieve masking under the assumption that sequences are padded to equal lengths. If target shape is (N,S)(N,S), target_lengths are effectively the stop index s_ns**n for each target sequence, such that target_n = targets[n,0:s_n] for each target in a batch. Lengths must each be \leq S≤S If the targets are given as a 1d tensor that is the concatenation of individual targets, the target_lengths must add up to the total length of the tensor.
Output: scalar. If reduction is 'none', then (N)(N) if input is batched or ()() if input is unbatched, where N = \text{batch size}N=batch size.
使用:
# Target are to be padded T = 50 # Input sequence length C = 20 # Number of classes (including blank) N = 16 # Batch size S = 30 # Target sequence length of longest target in batch (padding length) S_min = 10 # Minimum target length, for demonstration purposes # Initialize random batch of input vectors, for *size = (T,N,C) input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_() # Initialize random batch of targets (0 = blank, 1:C = classes) target = torch.randint(low=1, high=C, size=(N, S), dtype=torch.long) input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long) target_lengths = torch.randint(low=S_min, high=S, size=(N,), dtype=torch.long) ctc_loss = nn.CTCLoss() loss = ctc_loss(input, target, input_lengths, target_lengths) loss.backward() # Target are to be un-padded T = 50 # Input sequence length C = 20 # Number of classes (including blank) N = 16 # Batch size # Initialize random batch of input vectors, for *size = (T,N,C) input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_() input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long) # Initialize random batch of targets (0 = blank, 1:C = classes) target_lengths = torch.randint(low=1, high=T, size=(N,), dtype=torch.long) target = torch.randint(low=1, high=C, size=(sum(target_lengths),), dtype=torch.long) ctc_loss = nn.CTCLoss() loss = ctc_loss(input, target, input_lengths, target_lengths) loss.backward() # Target are to be un-padded and unbatched (effectively N=1) T = 50 # Input sequence length C = 20 # Number of classes (including blank) # Initialize random batch of input vectors, for *size = (T,C) input = torch.randn(T, C).log_softmax(2).detach().requires_grad_() input_lengths = torch.tensor(T, dtype=torch.long) # Initialize random batch of targets (0 = blank, 1:C = classes) target_lengths = torch.randint(low=1, high=T, size=(), dtype=torch.long) target = torch.randint(low=1, high=C, size=(target_lengths,), dtype=torch.long) ctc_loss = nn.CTCLoss() loss = ctc_loss(input, target, input_lengths, target_lengths) loss.backward()
NLLLoss
详解torch.nn.NLLLOSS - 知乎 (zhihu.com)
log_softmax与softmax的区别在哪里? - 知乎 (zhihu.com)
使用:
m = nn.LogSoftmax(dim=1) loss = nn.NLLLoss() # input is of size N x C = 3 x 5 input = torch.randn(3, 5, requires_grad=True) # each element in target has to have 0 <= value < C target = torch.tensor([1, 0, 4]) output = loss(m(input), target) output.backward() # 2D loss example (used, for example, with image inputs) N, C = 5, 4 loss = nn.NLLLoss() # input is of size N x C x height x width data = torch.randn(N, 16, 10, 10) conv = nn.Conv2d(16, C, (3, 3)) m = nn.LogSoftmax(dim=1) # each element in target has to have 0 <= value < C target = torch.empty(N, 8, 8, dtype=torch.long).random_(0, C) output = loss(m(conv(data)), target) output.backward()
PoissonNLLLoss
目标泊松分布的负对数似然损失。
使用:
loss = nn.PoissonNLLLoss() log_input = torch.randn(5, 2, requires_grad=True) target = torch.randn(5, 2) output = loss(log_input, target) output.backward()
GAUSSIANNLLLOSS
真实标签服从高斯分布的负对数似然损失,神经网络的输出作为高斯分布的均值和方差。
对于包含[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-q4kc1uHT-1664029582934)(https://math.jianshu.com/math?formula=N)]个样本的batch数据 [外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-h83QFnzx-1664029582935)(https://math.jianshu.com/math?formula=D(x%2C%20var%2C%20y)]),[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-HqpRvEU0-1664029582936)(https://math.jianshu.com/math?formula=x)]神经网络的输出,作为高斯分布的均值,[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-rx7lIhr4-1664029582937)(https://math.jianshu.com/math?formula=var)]神经网络的输出,作为高斯分布的方差,[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-j8SxYO5A-1664029582939)(https://math.jianshu.com/math?formula=y)]是样本对应的标签,服从高斯分布。[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-5TpBJomt-1664029582940)(https://math.jianshu.com/math?formula=x)]与[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-i9xZLG0R-1664029582941)(https://math.jianshu.com/math?formula=y)]的维度相同,[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-J9stE5rL-1664029582942)(https://math.jianshu.com/math?formula=var)]和[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-xavLHkx4-1664029582943)(https://math.jianshu.com/math?formula=x)]的维度相同,或者最后一个维度不同且最后一个维度为1,可以进行broadcast。
参考链接:
loss函数之PoissonNLLLoss,GaussianNLLLoss - 简书 (jianshu.com)
使用:
loss = nn.GaussianNLLLoss() input = torch.randn(5, 2, requires_grad=True) target = torch.randn(5, 2) var = torch.ones(5, 2, requires_grad=True) #heteroscedastic output = loss(input, target, var) output.backward()
KLDIVLOSS
KL散度,又叫相对熵,用于衡量两个分布(离散分布和连续分布)之间的距离。
设[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-4ZRB6KpE-1664029582944)(https://math.jianshu.com/math?formula=p(x)]) 、[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-WFmovzOi-1664029582945)(https://math.jianshu.com/math?formula=q(x)]) 是离散随机变量[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-Xle4sPs0-1664029582947)(https://math.jianshu.com/math?formula=X)]的两个概率分布,则[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-A3Ob8twF-1664029582948)(https://math.jianshu.com/math?formula=p)] 对[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-DsIYvCCP-1664029582948)(https://math.jianshu.com/math?formula=q)] 的KL散度是:
参考链接:
loss函数之KLDivLoss - 简书 (jianshu.com)
使用:
kl_loss = nn.KLDivLoss(reduction="batchmean") # input should be a distribution in the log space input = F.log_softmax(torch.randn(3, 5, requires_grad=True)) # Sample a batch of distributions. Usually this would come from the dataset target = F.softmax(torch.rand(3, 5)) output = kl_loss(input, target) kl_loss = nn.KLDivLoss(reduction="batchmean", log_target=True) log_target = F.log_softmax(torch.rand(3, 5)) output = kl_loss(input, log_target)
BCELOSS
loss函数之BCELoss - 简书 (jianshu.com)
使用:
m = nn.Sigmoid() loss = nn.BCELoss() input = torch.randn(3, requires_grad=True) target = torch.empty(3).random_(2) output = loss(m(input), target) output.backward()
BCEWITHLOGITSLOSS
这个东西,本质上和nn.BCELoss()没有区别,只是在BCELoss上加了个logits函数(也就是sigmoid函数)
使用:
loss = nn.BCEWithLogitsLoss() input = torch.randn(3, requires_grad=True) target = torch.empty(3).random_(2) output = loss(input, target) output.backward()
MARGINRANKINGLOSS
loss函数之MarginRankingLoss - 简书 (jianshu.com)
使用:
loss = nn.MarginRankingLoss() input1 = torch.randn(3, requires_grad=True) input2 = torch.randn(3, requires_grad=True) target = torch.randn(3).sign() output = loss(input1, input2, target) output.backward()
HingeEmbeddingLoss
用于判断两个向量是否相似,输入是两个向量之间的距离。 常用于非线性词向量学习以及半监督学习。
loss函数之CosineEmbeddingLoss,HingeEmbeddingLoss_ltochange的博客-CSDN博客_余弦相似度损失函数