Implementing a GAN (Generative Adversarial Network) in PyTorch: MNIST Images (cs231n assignment 3)


Future posts will be published at https://oldpan.me

Generative Adversarial Networks

In 2014, Goodfellow et al. presented a method for training generative models called Generative Adversarial Networks (GANs for short). In a GAN, we build two different neural networks. Our first network is a traditional classification network, called the discriminator. We will train the discriminator to take images, and classify them as being real (belonging to the training set) or fake (not present in the training set). Our other network, called the generator, will take random noise as input and transform it using a neural network to produce images. The goal of the generator is to fool the discriminator into thinking the images it produced are real.

We can think of this back and forth process of the generator ($G$) trying to fool the discriminator ($D$), and the discriminator trying to correctly classify real vs. fake as a minimax game:

$$\underset{G}{\text{minimize}}\; \underset{D}{\text{maximize}}\; \mathbb{E}_{x \sim p_\text{data}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p(z)}\left[\log \left(1-D(G(z))\right)\right]$$

where $z \sim p(z)$ are the random noise samples, $G(z)$ are the generated images using the neural network generator $G$, and $D$ is the output of the discriminator, specifying the probability of an input being real. In Goodfellow et al., they analyze this minimax game and show how it relates to minimizing the Jensen-Shannon divergence between the training data distribution and the generated samples from $G$.

To optimize this minimax game, we will alternate between taking gradient descent steps on the objective for $G$, and gradient ascent steps on the objective for $D$:
1. update the generator ($G$) to minimize the probability of the discriminator making the correct choice.
2. update the discriminator ($D$) to maximize the probability of the discriminator making the correct choice.

While these updates are useful for analysis, they do not perform well in practice. Instead, we will use a different objective when we update the generator: maximize the probability of the discriminator making the incorrect choice. This small change helps to alleviate problems with the generator gradient vanishing when the discriminator is confident. This is the standard update used in most GAN papers, and was used in the original paper from Goodfellow et al.
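
To see why the changed objective helps, it can be useful to compare the gradient each generator objective sends back through the discriminator's logit. The sketch below is only an illustration in plain Python, not part of the assignment code. It assumes the discriminator output is D = sigmoid(s) for a logit s, and the function names are hypothetical.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def grad_minimax(s):
    """d/ds of log(1 - sigmoid(s)), the term G minimizes in the minimax game."""
    return -sigmoid(s)

def grad_non_saturating(s):
    """d/ds of -log(sigmoid(s)), the loss G minimizes after the change."""
    return -(1.0 - sigmoid(s))

# A confident discriminator assigns a very negative logit to a fake image.
s = -10.0
print(abs(grad_minimax(s)))         # tiny: the generator barely gets a signal
print(abs(grad_non_saturating(s)))  # close to 1: a usable learning signal
```

When the discriminator is confident that a sample is fake, the minimax term saturates while the alternative objective still provides a gradient of roughly unit size.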

In this assignment, we will alternate the following updates:
1. Update the generator ($G$) to maximize the probability of the discriminator making the incorrect choice on generated data:

$$\underset{G}{\text{maximize}}\; \mathbb{E}_{z \sim p(z)}\left[\log D(G(z))\right]$$

2. Update the discriminator ($D$), to maximize the probability of the discriminator making the correct choice on real and generated data:

$$\underset{D}{\text{maximize}}\; \mathbb{E}_{x \sim p_\text{data}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p(z)}\left[\log \left(1-D(G(z))\right)\right]$$

What else is there?

Since 2014, GANs have exploded into a huge research area, with massive workshops, and hundreds of new papers. Compared to other approaches for generative models, they often produce the highest quality samples but are some of the most difficult and finicky models to train (see this GitHub repo that contains a set of 17 hacks that are useful for getting models working). Improving the stability and robustness of GAN training is an open research question, with new papers coming out every day! For a more recent tutorial on GANs, see here. There is also some even more recent exciting work that changes the objective function to Wasserstein distance and yields much more stable results across model architectures: WGAN, WGAN-GP.

GANs are not the only way to train a generative model! For other approaches to generative modeling check out the deep generative model chapter of the Deep Learning book. Another popular way of training neural networks as generative models is Variational Autoencoders (co-discovered here and here). Variational autoencoders combine neural networks with variational inference to train deep generative models. These models tend to be far more stable and easier to train but currently don’t produce samples that are as pretty as GANs.

Here’s an example of what your outputs from the 3 different models you’re going to train should look like… note that GANs are sometimes finicky, so your outputs might not look exactly like this… this is just meant to be a rough guideline of the kind of quality you can expect:

[Image: example outputs from the three models]

Code walkthrough:

1. Load the required modules and libraries, and define the image display function and other image preprocessing helpers

import torch
import torch.nn as nn
from torch.nn import init
from torch.autograd import Variable
import torchvision
import torchvision.transforms as T
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.utils.data import sampler
import torchvision.datasets as dset

import numpy as np

import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

def show_images(images):
    images = np.reshape(images, [images.shape[0], -1])  # images reshape to (batch_size, D)
    sqrtn = int(np.ceil(np.sqrt(images.shape[0])))
    sqrtimg = int(np.ceil(np.sqrt(images.shape[1])))

    fig = plt.figure(figsize=(sqrtn, sqrtn))
    gs = gridspec.GridSpec(sqrtn, sqrtn)
    gs.update(wspace=0.05, hspace=0.05)

    for i, img in enumerate(images):
        ax = plt.subplot(gs[i])
        plt.axis('off')
        ax.set_xticklabels([])
        ax.set_yticklabels([])
        ax.set_aspect('equal')
        plt.imshow(img.reshape([sqrtimg,sqrtimg]))
    return 

def preprocess_img(x):
    return 2 * x - 1.0

def deprocess_img(x):
    return (x + 1.0) / 2.0

def rel_error(x,y):
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

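def count_params(model):
    """Count the number of parameters in a PyTorch model."""
    param_count = np.sum([np.prod(p.size()) for p in model.parameters()])
    return param_count

# dtype is used throughout the code below but is never defined in this post.
# Assumption: CPU float tensors; use torch.cuda.FloatTensor to run on a GPU.
dtype = torch.FloatTensor

answers = np.load('gan-checks-tf.npz')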
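
A quick check of the two scaling helpers may help here. T.ToTensor() yields pixel values in [0, 1], while the generator defined later ends in a Tanh and so produces values in [-1, 1]; preprocess_img and deprocess_img map between those two ranges. The sketch below repeats the two helpers and runs them on plain Python floats, which is an assumption made so it runs without any tensors.

```python
def preprocess_img(x):
    return 2 * x - 1.0

def deprocess_img(x):
    return (x + 1.0) / 2.0

# Map a few pixel intensities to the Tanh range and back again.
for pixel in (0.0, 0.25, 1.0):
    scaled = preprocess_img(pixel)
    print(pixel, "->", scaled, "->", deprocess_img(scaled))
```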

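Dataset

GANs are notoriously tricky to tune and need many training epochs. To keep training fast, we use the MNIST dataset, which has 60,000 training images and 10,000 test images. Each image shows a single white digit (0-9) on a black background. Standard neural networks already reach over 99% accuracy on this dataset.

We load the data with PyTorch's built-in dataset utilities:

# Custom sequential sampler (samples indices in order)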
class ChunkSampler(sampler.Sampler): 
    """Samples elements sequentially from some offset. 
    Arguments:
        num_samples: # of desired datapoints
        start: offset where we should start selecting from
    """
    def __init__(self, num_samples, start=0):
        self.num_samples = num_samples
        self.start = start

    def __iter__(self):
        return iter(range(self.start, self.start + self.num_samples))

    def __len__(self):
        return self.num_samples

NUM_TRAIN = 50000   # number of training samples
NUM_VAL = 5000      # number of validation samples

NOISE_DIM = 96      
batch_size = 128

mnist_train = dset.MNIST('./cs231n/datasets/MNIST_data', train=True, download=True,
                           transform=T.ToTensor())
loader_train = DataLoader(mnist_train, batch_size=batch_size,
                          sampler=ChunkSampler(NUM_TRAIN, 0)) # sample NUM_TRAIN items starting at index 0

mnist_val = dset.MNIST('./cs231n/datasets/MNIST_data', train=True, download=True,
                           transform=T.ToTensor())
loader_val = DataLoader(mnist_val, batch_size=batch_size,
                        sampler=ChunkSampler(NUM_VAL, NUM_TRAIN)) # sample NUM_VAL items starting at index NUM_TRAIN


# next(iter(...)) replaces the Python 2 style .next() call
imgs = next(iter(loader_train))[0].view(batch_size, 784).numpy().squeeze()
show_images(imgs)

[Image: sample MNIST training images]
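
To illustrate how the two ChunkSampler instances split the 60,000 MNIST training images, the sketch below reproduces the sampler as a plain Python class (without the torch base class, which is an assumption made so it runs on its own) and checks which indices each loader would visit.

```python
class ChunkSampler:
    """Plain-Python stand-in for the sampler above: sequential indices from an offset."""
    def __init__(self, num_samples, start=0):
        self.num_samples = num_samples
        self.start = start

    def __iter__(self):
        return iter(range(self.start, self.start + self.num_samples))

    def __len__(self):
        return self.num_samples

NUM_TRAIN, NUM_VAL = 50000, 5000
train_idx = list(ChunkSampler(NUM_TRAIN, 0))
val_idx = list(ChunkSampler(NUM_VAL, NUM_TRAIN))

print(train_idx[0], train_idx[-1])         # → 0 49999
print(val_idx[0], val_idx[-1])             # → 50000 54999
print(set(train_idx).isdisjoint(val_idx))  # → True
```

Because both loaders read from the same train=True split, the offset is what keeps the validation images separate from the training images.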

Random Noise

Generate uniform noise from -1 to 1 with shape [batch_size, dim].

def sample_noise(batch_size, dim):
    """
    Generate a PyTorch Tensor of uniform random noise.

    Input:
    - batch_size: Integer giving the batch size of noise to generate.
    - dim: Integer giving the dimension of noise to generate.

    Output:
    - A PyTorch Tensor of shape (batch_size, dim) containing uniform
      random noise in the range (-1, 1).
    """
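    # Rescale U(0, 1) samples to U(-1, 1). Subtracting two independent
    # uniform draws would give a triangular distribution, not a uniform one.
    temp = 2 * torch.rand(batch_size, dim) - 1

    return temp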
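
One quick way to check that a sampler really is uniform on (-1, 1) is to look at its sample variance, which should be close to 1/3. The sketch below uses Python's random module rather than torch (an assumption made so it runs without PyTorch) and compares rescaling a single uniform draw against subtracting two independent uniform draws. The latter follows a triangular distribution with variance 1/6.

```python
import random

random.seed(0)
n = 100_000

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# Rescaled single draw: uniform on (-1, 1).
scaled = [2.0 * random.random() - 1.0 for _ in range(n)]
# Difference of two draws: values in (-1, 1) but concentrated near 0.
difference = [random.random() - random.random() for _ in range(n)]

print(round(variance(scaled), 2))      # near 1/3
print(round(variance(difference), 2))  # near 1/6
```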

Next, define the Flatten and Unflatten modules, which reshape image data, along with a weight initialization helper:

class Flatten(nn.Module):
    def forward(self, x):
        N, C, H, W = x.size() # read in N, C, H, W
        return x.view(N, -1)  # "flatten" the C * H * W values into a single vector per image

class Unflatten(nn.Module):
    """
    An Unflatten module receives an input of shape (N, C*H*W) and reshapes it
    to produce an output of shape (N, C, H, W).
    """
    def __init__(self, N=-1, C=128, H=7, W=7):
        super(Unflatten, self).__init__()
        self.N = N
        self.C = C
        self.H = H
        self.W = W
    def forward(self, x):
        return x.view(self.N, self.C, self.H, self.W)

def initialize_weights(m):
    if isinstance(m, nn.Linear) or isinstance(m, nn.ConvTranspose2d):
        init.xavier_uniform_(m.weight.data)  # in-place variant; init.xavier_uniform is deprecated

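With the groundwork in place, we can write the discriminator:

The discriminator network judges whether images produced by the generator are fake and whether real images are real. It consists of the following layers: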

Fully connected layer from size 784 to 256
LeakyReLU with alpha 0.01
Fully connected layer from 256 to 256
LeakyReLU with alpha 0.01
Fully connected layer from 256 to 1

We use LeakyReLU with its alpha parameter set to 0.01.
The discriminator output should have shape [batch_size, 1], with one real-vs-fake score for each input in the batch.

def discriminator():
    """
    Build and return a PyTorch model implementing the architecture above.
    """
    model = nn.Sequential(
        Flatten(),
        nn.Linear(784,256),
        nn.LeakyReLU(0.01, inplace=True),
        nn.Linear(256,256),
        nn.LeakyReLU(0.01, inplace=True),
        nn.Linear(256,1)
    )
    return model

Generator

Build the generator network:
Fully connected layer from noise_dim to 1024
ReLU
Fully connected layer with size 1024
ReLU
Fully connected layer with size 784
TanH (to clip the image to be in [-1, 1])

def generator(noise_dim=NOISE_DIM):
    """
    Build and return a PyTorch model implementing the architecture above.
    """
    model = nn.Sequential(
        nn.Linear(noise_dim, 1024),
        nn.ReLU(inplace=True),
        nn.Linear(1024, 1024),
        nn.ReLU(inplace=True),
        nn.Linear(1024, 784),
        nn.Tanh(),
    )
    return model
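
As a sanity check on the two fully connected architectures, the parameter counts can be worked out by hand: each Linear layer contributes n_in * n_out weights plus n_out biases. The helper below is a hypothetical stand-in written in plain Python; with the real models, count_params(discriminator()) and count_params(generator()) should give the same totals.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(mlp_param_count([784, 256, 256, 1]))     # discriminator → 267009
print(mlp_param_count([96, 1024, 1024, 784]))  # generator → 1952528
```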

GAN Loss

Compute the generator and discriminator loss. The generator loss is:

$$\ell_G = -\mathbb{E}_{z \sim p(z)}\left[\log D(G(z))\right]$$

and the discriminator loss is:

$$\ell_D = -\mathbb{E}_{x \sim p_\text{data}}\left[\log D(x)\right] - \mathbb{E}_{z \sim p(z)}\left[\log \left(1-D(G(z))\right)\right]$$

Note that these are negated from the equations presented earlier as we will be minimizing these losses.

HINTS: You should use the bce_loss function defined below to compute the binary cross entropy loss which is needed to compute the log probability of the true label given the logits output from the discriminator. Given a score $s \in \mathbb{R}$ and a label $y \in \{0, 1\}$, the binary cross entropy loss is

$$\mathrm{bce}(s, y) = -y \log \sigma(s) - (1 - y) \log\left(1 - \sigma(s)\right)$$

where $\sigma$ is the sigmoid function that maps the logit $s$ to a probability.

A naive implementation of this formula can be numerically unstable, so we have provided a numerically stable implementation for you below.

You will also need to compute labels corresponding to real or fake and use the logit arguments to determine their size. Make sure you cast these labels to the correct data type using the global dtype variable, for example:

true_labels = Variable(torch.ones(size)).type(dtype)

Instead of computing the expectation, we will be averaging over elements of the minibatch, so make sure to combine the loss by averaging instead of summing.

As the hints above note, a naive BCE implementation can be numerically unstable. PyTorch's BCEWithLogitsLoss() takes care of this:

Bce_loss = nn.BCEWithLogitsLoss()
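
To make the stability issue concrete, here is a minimal plain-Python sketch of a commonly used stable formulation of binary cross entropy on logits, next to a naive version. The function names are hypothetical, and this is an illustration of the idea rather than PyTorch's actual implementation.

```python
import math

def bce_with_logits(s, y):
    """Stable form of -y*log(sigmoid(s)) - (1-y)*log(1-sigmoid(s))."""
    return max(s, 0.0) - s * y + math.log1p(math.exp(-abs(s)))

def naive_bce(s, y):
    """Direct translation of the formula; breaks when sigmoid(s) rounds to 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-s))
    return -y * math.log(p) - (1 - y) * math.log(1.0 - p)

print(bce_with_logits(0.0, 1))    # log(2), about 0.693
print(bce_with_logits(100.0, 0))  # about 100, no overflow

try:
    naive_bce(100.0, 0)
except ValueError as err:
    # sigmoid(100) rounds to exactly 1.0, so log(1 - p) is log(0).
    print("naive version failed:", err)
```

The stable form never evaluates the logarithm of a value that can round to zero, so it stays finite even for very confident logits.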

Following the description above, write the loss functions for the generator and the discriminator:

def discriminator_loss(logits_real, logits_fake):
    """
    Computes the discriminator loss described above.

    Inputs:
    - logits_real: PyTorch Variable of shape (N,) giving scores for the real data.
    - logits_fake: PyTorch Variable of shape (N,) giving scores for the fake data.

    Returns:
    - loss: PyTorch Variable containing (scalar) the loss for the discriminator.
    """
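    # Shape of the logits, used to build a matching label tensor.
    N = logits_real.size()

    # Target labels are all 1 for real images; fake images use 1 - true_labels = 0,
    # so the discriminator is pushed to call real images real and fake images fake.
    true_labels = Variable(torch.ones(N)).type(dtype)

    real_image_loss = Bce_loss(logits_real, true_labels)      # real images should be classified as real
    fake_image_loss = Bce_loss(logits_fake, 1 - true_labels)  # fake images should be classified as fake

    loss = real_image_loss + fake_image_loss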

    return loss

def generator_loss(logits_fake):
    """
    Computes the generator loss described above.

    Inputs:
    - logits_fake: PyTorch Variable of shape (N,) giving scores for the fake data.

    Returns:
    - loss: PyTorch Variable containing the (scalar) loss for the generator.
    """
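    # Shape of the logits, used to build a matching label tensor.
    N = logits_fake.size()

    # The generator wants every fake image to be scored as real (label 1).
    true_labels = Variable(torch.ones(N)).type(dtype)

    # Compute the generator loss.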
    loss = Bce_loss(logits_fake, true_labels)

    return loss
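
A handy sanity check for these losses is the value they take when the discriminator is completely undecided, that is, when every logit is 0. The sketch below mirrors the two loss functions in plain Python with hypothetical helper names. It assumes the batch-mean reduction described in the hints above.

```python
import math

def bce_with_logits(s, y):
    # Stable binary cross entropy on a single logit s with label y.
    return max(s, 0.0) - s * y + math.log1p(math.exp(-abs(s)))

def mean(xs):
    return sum(xs) / len(xs)

def d_loss(logits_real, logits_fake):
    # Real images are labelled 1, fake images 0; each term is a batch mean.
    return (mean([bce_with_logits(s, 1) for s in logits_real])
            + mean([bce_with_logits(s, 0) for s in logits_fake]))

def g_loss(logits_fake):
    # The generator wants its fakes to be labelled 1.
    return mean([bce_with_logits(s, 1) for s in logits_fake])

undecided = [0.0] * 4
print(round(d_loss(undecided, undecided), 4))  # 2*log(2) → 1.3863
print(round(g_loss(undecided), 4))             # log(2)   → 0.6931
```

Values near 1.39 for D and 0.69 for G in the training log therefore indicate a discriminator that is close to guessing.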

Optimizing our loss

Use the optim.Adam optimizer with a 1e-3 learning rate, beta1=0.5, beta2=0.999.

def get_optimizer(model):
    """
    Construct and return an Adam optimizer for the model with learning rate 1e-3,
    beta1=0.5, and beta2=0.999.

    Input:
    - model: A PyTorch model that we want to optimize.

    Returns:
    - An Adam optimizer for the model with the desired hyperparameters.
    """
    optimizer = optim.Adam(model.parameters(), lr=0.001, betas=(0.5, 0.999))
    return optimizer
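
For intuition about what these hyperparameters do, the sketch below writes out one textbook Adam update for a single scalar parameter. It is a simplified illustration with hypothetical names, not PyTorch's implementation. With bias correction, the first step has a size close to the learning rate regardless of the gradient's magnitude.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

for grad in (0.01, 100.0):
    new_param, _, _ = adam_step(param=0.0, grad=grad, m=0.0, v=0.0, t=1)
    print(grad, new_param)  # both steps are close to -lr = -0.001
```

A lower beta1 such as 0.5 shortens the memory of the gradient average, so the update reacts faster as the opposing network changes.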

Define the training function

def run_a_gan(D, G, D_solver, G_solver, discriminator_loss, generator_loss, show_every=250, 
              batch_size=128, noise_size=96, num_epochs=10):
    """
    Train a GAN!

    Inputs:
    - D, G: PyTorch models for the discriminator and generator.
    - D_solver, G_solver: torch.optim Optimizers to use for training the
      discriminator and generator.
    - discriminator_loss, generator_loss: Functions to use for computing the discriminator and
      generator loss, respectively.
    - show_every: Show samples after every show_every iterations.
    - batch_size: Batch size to use for training.
    - noise_size: Dimension of the noise to use as input to the generator.
    - num_epochs: Number of epochs over the training dataset to use for training.
    """
    iter_count = 0
    for epoch in range(num_epochs):
        for x, _ in loader_train:
            if len(x) != batch_size:
                continue
            D_solver.zero_grad()
            real_data = Variable(x).type(dtype)
            logits_real = D(2* (real_data - 0.5)).type(dtype)

            g_fake_seed = Variable(sample_noise(batch_size, noise_size)).type(dtype)
            fake_images = G(g_fake_seed).detach()
            logits_fake = D(fake_images.view(batch_size, 1, 28, 28))

            d_total_error = discriminator_loss(logits_real, logits_fake)
            d_total_error.backward()        
            D_solver.step()

            G_solver.zero_grad()
            g_fake_seed = Variable(sample_noise(batch_size, noise_size)).type(dtype)
            fake_images = G(g_fake_seed)

            gen_logits_fake = D(fake_images.view(batch_size, 1, 28, 28))
            g_error = generator_loss(gen_logits_fake)
            g_error.backward()
            G_solver.step()

            if (iter_count % show_every == 0):
                print('Iter: {}, D: {:.4}, G:{:.4}'.format(iter_count, d_total_error.item(), g_error.item()))
                imgs_numpy = fake_images.data.cpu().numpy()
                show_images(imgs_numpy[0:16])
                plt.show()
                print()
            iter_count += 1
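
It can help to work out how many updates this loop performs with the default arguments. The loop skips any batch that is not full, so the last partial batch of each epoch is dropped and does not advance iter_count. The arithmetic below assumes NUM_TRAIN = 50000 as set earlier.

```python
NUM_TRAIN, batch_size, num_epochs, show_every = 50000, 128, 10, 250

# Full batches per epoch, and the size of the skipped partial batch.
full_batches, leftover = divmod(NUM_TRAIN, batch_size)
total_iters = full_batches * num_epochs
# Previews are shown whenever iter_count is a multiple of show_every.
num_previews = len(range(0, total_iters, show_every))

print(full_batches, leftover)  # → 390 80
print(total_iters)             # → 3900
print(num_previews)            # → 16
```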

Start training

# Make the discriminator
D = discriminator().type(dtype)

# Make the generator
G = generator().type(dtype)

# Use the function you wrote earlier to get optimizers for the Discriminator and the Generator
D_solver = get_optimizer(D)
G_solver = get_optimizer(G)
# Run it!
run_a_gan(D, G, D_solver, G_solver, discriminator_loss, generator_loss)

[Image: generated digits from the final training iteration]
The image above shows the result of the final iteration.

Least Squares GAN

We’ll now look at Least Squares GAN, a newer, more stable alternative to the original GAN loss function. For this part, all we have to do is change the loss function and retrain the model. We’ll implement equation (9) in the paper, with the generator loss:

$$\ell_G = \frac{1}{2}\mathbb{E}_{z \sim p(z)}\left[\left(D(G(z))-1\right)^2\right]$$

and the discriminator loss:

$$\ell_D = \frac{1}{2}\mathbb{E}_{x \sim p_\text{data}}\left[\left(D(x)-1\right)^2\right] + \frac{1}{2}\mathbb{E}_{z \sim p(z)}\left[\left(D(G(z))\right)^2\right]$$

HINTS: Instead of computing the expectation, we will be averaging over elements of the minibatch, so make sure to combine the loss by averaging instead of summing. When plugging in for $D(x)$ and $D(G(z))$ use the direct output from the discriminator (scores_real and scores_fake).

Next, we train a GAN with a least-squares loss.
Change the loss functions to least squares:

def ls_discriminator_loss(scores_real, scores_fake):
    """
    Compute the Least-Squares GAN loss for the discriminator.

    Inputs:
    - scores_real: PyTorch Variable of shape (N,) giving scores for the real data.
    - scores_fake: PyTorch Variable of shape (N,) giving scores for the fake data.

    Outputs:
    - loss: A PyTorch Variable containing the loss.
    """
    N = scores_real.size()

    true_labels = Variable(torch.ones(N)).type(dtype)

    # Real images should score 1, fake images should score 0.
    real_image_loss = torch.mean((scores_real - true_labels) ** 2)
    fake_image_loss = torch.mean(scores_fake ** 2)

    loss = 0.5 * real_image_loss + 0.5 * fake_image_loss

    return loss

def ls_generator_loss(scores_fake):
    """
    Computes the Least-Squares GAN loss for the generator.

    Inputs:
    - scores_fake: PyTorch Variable of shape (N,) giving scores for the fake data.

    Outputs:
    - loss: A PyTorch Variable containing the loss.
    """
    N = scores_fake.size()

    true_labels = Variable(torch.ones(N)).type(dtype)

    loss = 0.5 * ((torch.mean((scores_fake - true_labels)**2)))

    return loss
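
The least-squares losses are easy to check by hand on a few scores. The sketch below mirrors the two functions above on plain Python lists. The names ls_d_loss and ls_g_loss are hypothetical stand-ins, and the scores are made-up illustrative values.

```python
def mean(xs):
    return sum(xs) / len(xs)

def ls_d_loss(scores_real, scores_fake):
    # Push real scores toward 1 and fake scores toward 0.
    return (0.5 * mean([(s - 1.0) ** 2 for s in scores_real])
            + 0.5 * mean([s ** 2 for s in scores_fake]))

def ls_g_loss(scores_fake):
    # Push the scores of generated images toward 1.
    return 0.5 * mean([(s - 1.0) ** 2 for s in scores_fake])

print(ls_d_loss([1.0, 1.0], [0.0, 0.0]))  # perfect discriminator → 0.0
print(ls_g_loss([0.0, 0.0]))              # every fake is caught → 0.5
print(ls_g_loss([1.0, 1.0]))              # every fake passes as real → 0.0
```

Unlike the cross-entropy version, these losses act directly on the raw discriminator scores, with no sigmoid in between.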

Train:

D_LS = discriminator().type(dtype)
G_LS = generator().type(dtype)

D_LS_solver = get_optimizer(D_LS)
G_LS_solver = get_optimizer(G_LS)

run_a_gan(D_LS, G_LS, D_LS_solver, G_LS_solver, ls_discriminator_loss, ls_generator_loss)

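[Image: digits generated by the least-squares GAN]

The image above shows the final iteration. The results are somewhat better than before, but there is still noise.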

Deeply Convolutional GANs

In the first part of the notebook, we implemented an almost direct copy of the original GAN network from Ian Goodfellow. However, this network architecture allows no real spatial reasoning. It is unable to reason about things like “sharp edges” in general because it lacks any convolutional layers. Thus, in this section, we will implement some of the ideas from DCGAN, where we use convolutional networks.

Discriminator

We will use a discriminator inspired by the TensorFlow MNIST classification tutorial, which is able to get above 99% accuracy on the MNIST dataset fairly quickly.
* Reshape into image tensor (Use Unflatten!)
* 32 Filters, 5x5, Stride 1, Leaky ReLU(alpha=0.01)
* Max Pool 2x2, Stride 2
* 64 Filters, 5x5, Stride 1, Leaky ReLU(alpha=0.01)
* Max Pool 2x2, Stride 2
* Flatten
* Fully Connected size 4 x 4 x 64, Leaky ReLU(alpha=0.01)
* Fully Connected size 1

Generator

For the generator, we will copy the architecture exactly from the InfoGAN paper. See Appendix C.1 MNIST. See the documentation for tf.nn.conv2d_transpose (the PyTorch code below uses nn.ConvTranspose2d instead). We are always “training” in GAN mode.
* Fully connected of size 1024, ReLU
* BatchNorm
* Fully connected of size 7 x 7 x 128, ReLU
* BatchNorm
* Reshape into Image Tensor
* 64 conv2d^T filters of 4x4, stride 2, ‘same’ padding, ReLU
* BatchNorm
* 1 conv2d^T filter of 4x4, stride 2, ‘same’ padding, TanH
* Should have a 28x28x1 image, reshape back into 784 vector

def build_dc_classifier():
    """
    Build and return a PyTorch model for the DCGAN discriminator implementing
    the architecture above.
    """
    return nn.Sequential(
        Unflatten(batch_size, 1, 28, 28),
        nn.Conv2d(1, 32,kernel_size=5, stride=1),
        nn.LeakyReLU(negative_slope=0.01),
        nn.MaxPool2d(2, stride=2),
        nn.Conv2d(32, 64,kernel_size=5, stride=1),
        nn.LeakyReLU(negative_slope=0.01),
        nn.MaxPool2d(kernel_size=2, stride=2),   
        Flatten(),
        nn.Linear(4*4*64, 4*4*64),
        nn.LeakyReLU(negative_slope=0.01),
        nn.Linear(4*4*64,1)
    )

data = Variable(next(iter(loader_train))[0]).type(dtype)
b = build_dc_classifier().type(dtype)
out = b(data)
print(out.size())
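
The 4*4*64 input size of the first Linear layer follows from how each unpadded convolution and pooling layer shrinks the 28x28 image. A minimal sketch of that arithmetic, using the standard output-size formula and a hypothetical helper name:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28
size = conv_out(size, kernel=5)            # 5x5 conv, stride 1 -> 24
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pool       -> 12
size = conv_out(size, kernel=5)            # 5x5 conv, stride 1 -> 8
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pool       -> 4

print(size, size * size * 64)  # → 4 1024
```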

def build_dc_generator(noise_dim=NOISE_DIM):
    """
    Build and return a PyTorch model implementing the DCGAN generator using
    the architecture described above.
    """
    return nn.Sequential(
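        nn.Linear(noise_dim, 1024),
        nn.ReLU(),
        nn.BatchNorm1d(1024),       # inputs are (N, 1024) here, so BatchNorm1d rather than BatchNorm2d
        nn.Linear(1024, 7*7*128),
        nn.ReLU(),                  # the architecture above lists a ReLU after this layer
        nn.BatchNorm1d(7*7*128),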
        Unflatten(batch_size, 128, 7, 7),
        nn.ConvTranspose2d(in_channels=128, out_channels=64, kernel_size=4, stride=2, padding=1),
        nn.ReLU(inplace=True),
        nn.BatchNorm2d(num_features=64),
        nn.ConvTranspose2d(in_channels=64, out_channels=1, kernel_size=4, stride=2, padding=1),
        nn.Tanh(),
        Flatten(),
    )
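
The two transposed convolutions take the 7x7 feature map back up to 28x28. The sketch below checks this with the usual output-size formula for a transposed convolution, assuming dilation 1 and no output padding. The helper name is hypothetical.

```python
def conv_transpose_out(size, kernel, stride, padding, output_padding=0):
    """Output width of a transposed convolution along one dimension (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

size = 7
size = conv_transpose_out(size, kernel=4, stride=2, padding=1)  # 7  -> 14
size = conv_transpose_out(size, kernel=4, stride=2, padding=1)  # 14 -> 28

print(size, size * size)  # → 28 784
```

With kernel 4, stride 2, and padding 1, each layer exactly doubles the spatial size, which plays the role of the ‘same’ padding named in the architecture list.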

test_g_gan = build_dc_generator().type(dtype)
test_g_gan.apply(initialize_weights)

fake_seed = Variable(torch.randn(batch_size, NOISE_DIM)).type(dtype)
fake_images = test_g_gan(fake_seed)
print(fake_images.size())

D_DC = build_dc_classifier().type(dtype) 
D_DC.apply(initialize_weights)
G_DC = build_dc_generator().type(dtype)
G_DC.apply(initialize_weights)

D_DC_solver = get_optimizer(D_DC)
G_DC_solver = get_optimizer(G_DC)

run_a_gan(D_DC, G_DC, D_DC_solver, G_DC_solver, discriminator_loss, generator_loss, num_epochs=5)

[Image: digits generated by the DCGAN]
Networks with convolutional layers clearly achieve better results.
