Step3 DCGAN (Deep Convolutional GAN)
Deep Convolutional Generative Adversarial Network
Alec Radford, Luke Metz, Soumith Chintala
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.
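DCGAN comes from the paper "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks".
Why was it proposed? As the abstract explains, CNNs have achieved remarkable success in supervised learning (large-scale image classification, object detection, and so on) but have made comparatively little progress in unsupervised learning. The authors therefore set out to bridge the gap between CNNs for supervised and unsupervised learning. They propose DCGAN, which combines CNNs with GANs, and show that it performs well on unsupervised learning. By training on a variety of image datasets, they demonstrate that both the generator and the discriminator learn a rich hierarchy of representations, from object parts to scenes. They also apply the learned features to new tasks (such as image classification), and the results indicate that these features work well as general-purpose image representations, that is, they generalize well.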
Discriminator architecture:
32 Filters, 5x5, Stride 1, Leaky ReLU(alpha=0.01)
Max Pool 2x2, Stride 2
64 Filters, 5x5, Stride 1, Leaky ReLU(alpha=0.01)
Max Pool 2x2, Stride 2
Fully Connected size 4 x 4 x 64, Leaky ReLU(alpha=0.01)
Fully Connected size 1
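import torch
import torch.nn as nn


class build_dc_classifier(nn.Module):
    """Convolutional discriminator for 1x28x28 images."""

    def __init__(self):
        super(build_dc_classifier, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 5, 1),    # 1x28x28 -> 32x24x24
            nn.LeakyReLU(0.01),
            nn.MaxPool2d(2, 2),        # -> 32x12x12
            nn.Conv2d(32, 64, 5, 1),   # -> 64x8x8
            nn.LeakyReLU(0.01),
            nn.MaxPool2d(2, 2)         # -> 64x4x4
        )
        self.fc = nn.Sequential(
            nn.Linear(1024, 1024),     # 4 * 4 * 64 = 1024 flattened features
            nn.LeakyReLU(0.01),
            nn.Linear(1024, 1)         # one real/fake logit per image
        )

    def forward(self, x):
        x = self.conv(x)
        x = x.view(x.shape[0], -1)     # flatten to (batch, 1024)
        x = self.fc(x)
        return x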
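The 1024 input features of `nn.Linear(1024, 1024)` follow from the layer list above: each unpadded 5x5 convolution shrinks the feature map by 4 pixels and each 2x2 max pool halves it. The following is a minimal sketch in plain Python (no PyTorch required) that traces the spatial size, assuming a 28x28 single-channel input; the helper name `conv_out` is hypothetical and not part of the tutorial.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                   # assumed input height/width
size = conv_out(size, kernel=5, stride=1)   # Conv2d(1, 32, 5, 1)  -> 24
size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2, 2)      -> 12
size = conv_out(size, kernel=5, stride=1)   # Conv2d(32, 64, 5, 1) -> 8
size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2, 2)      -> 4

flat_features = size * size * 64            # 64 channels after the second conv
print(size, flat_features)                  # prints: 4 1024
```

If the input resolution changes, the first `nn.Linear` layer has to be resized to match the new value of `flat_features`.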
Generator architecture:
Fully connected of size 1024, ReLU, BatchNorm
Fully connected of size 7 x 7 x 128, ReLU, BatchNorm
Reshape into Image Tensor
64 transposed-convolution (conv2d^T) filters of 4x4, stride 2, padding 1, ReLU, BatchNorm
1 transposed-convolution (conv2d^T) filter of 4x4, stride 2, padding 1, TanH
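The generator turns the noise vector z into an image by passing it through fully connected layers followed by convolution plus upsampling. The DCGAN paper uses a 100-dimensional z; the code here takes the size from NOISE_DIM. In PyTorch, the upsampling is done with transposed convolutions (nn.ConvTranspose2d), often called deconvolutions.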
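To see why two transposed convolutions with a 4x4 kernel, stride 2, and padding 1 take a 7x7 feature map to a 28x28 image, the output-size formula can be evaluated directly. This is a small sketch in plain Python; the helper name `conv_transpose_out` is hypothetical, and the formula assumes dilation 1.

```python
def conv_transpose_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Output size of a transposed convolution along one dimension (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


sizes = [7]  # spatial size after reshaping the fully connected output
for _ in range(2):  # the generator stacks two transposed convolutions
    sizes.append(conv_transpose_out(sizes[-1], kernel=4, stride=2, padding=1))

print(sizes)  # prints: [7, 14, 28]
```

With this kernel, stride, and padding combination, each layer exactly doubles the spatial size.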
# NOISE_DIM is expected to be defined beforehand and must equal the
# noise_size passed to train_dc_gan (96 by default).
class build_dc_generator(nn.Module):
    def __init__(self, noise_dim=NOISE_DIM):
        super(build_dc_generator, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(noise_dim, 1024),
            nn.ReLU(True),
            nn.BatchNorm1d(1024),
            nn.Linear(1024, 7 * 7 * 128),
            nn.ReLU(True),
            nn.BatchNorm1d(7 * 7 * 128)
        )
        self.conv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, padding=1),  # 128x7x7 -> 64x14x14
            nn.ReLU(True),
            nn.BatchNorm2d(64),
            nn.ConvTranspose2d(64, 1, 4, 2, padding=1),    # -> 1x28x28
            nn.Tanh()                                      # outputs in [-1, 1]
        )

    def forward(self, x):
        x = self.fc(x)
        x = x.view(x.shape[0], 128, 7, 7)  # reshape to 128 channels of size 7x7
        x = self.conv(x)
        return x
def train_dc_gan(D_net, G_net, D_optimizer, G_optimizer, discriminator_loss,
                 generator_loss, show_every=250, noise_size=96, num_epochs=10):
    iter_count = 0
    for epoch in range(num_epochs):
        for x, _ in train_data:
            bs = x.shape[0]

            # Discriminator step
            real_data = x.cuda()  # real images
            logits_real = D_net(real_data)  # discriminator scores for real images

            sample_noise = (torch.rand(bs, noise_size) - 0.5) / 0.5  # uniform in [-1, 1)
            g_fake_seed = sample_noise.cuda()
            fake_images = G_net(g_fake_seed)  # generated (fake) images
            logits_fake = D_net(fake_images)  # discriminator scores for fake images

            d_total_error = discriminator_loss(logits_real, logits_fake)  # discriminator loss
            D_optimizer.zero_grad()
            d_total_error.backward()
            D_optimizer.step()  # update the discriminator

            # Generator step
            g_fake_seed = sample_noise.cuda()
            fake_images = G_net(g_fake_seed)  # generated (fake) images
            gen_logits_fake = D_net(fake_images)
            g_error = generator_loss(gen_logits_fake)  # generator loss
            G_optimizer.zero_grad()
            g_error.backward()
            G_optimizer.step()  # update the generator

            if iter_count % show_every == 0:
                # The arguments to format() and deprocess_img() are truncated in
                # the source; the two lines below are reconstructed from context.
                print('Iter: {}, D: {:.4}, G:{:.4}'.format(
                    iter_count, d_total_error.item(), g_error.item()))
                imgs_numpy = deprocess_img(fake_images.detach().cpu().numpy())
                show_images(imgs_numpy[0:16])
                print()
            iter_count += 1
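The expression `(torch.rand(bs, noise_size) - 0.5) / 0.5` in the training loop rescales uniform samples from [0, 1) to [-1, 1). The sketch below reproduces the same arithmetic in plain Python, using `random.random()` as a stand-in for `torch.rand`; `sample_noise_row` is a hypothetical helper used only for illustration.

```python
import random


def sample_noise_row(noise_size, rng=random):
    """One row of generator noise: uniform samples mapped from [0, 1) to [-1, 1)."""
    return [(rng.random() - 0.5) / 0.5 for _ in range(noise_size)]


random.seed(0)  # fixed seed so repeated runs give the same row
row = sample_noise_row(96)  # 96 matches the noise_size default in train_dc_gan

# Every entry stays inside the half-open interval [-1, 1).
print(len(row), min(row) >= -1.0, max(row) < 1.0)  # prints: 96 True True
```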
D_DC = build_dc_classifier().cuda()
G_DC = build_dc_generator().cuda()

D_DC_optim = get_optimizer(D_DC)
G_DC_optim = get_optimizer(G_DC)

train_dc_gan(D_DC, G_DC, D_DC_optim, G_DC_optim, discriminator_loss, generator_loss, num_epochs=5)
Iter: 0, D: 1.387, G:0.6381
Iter: 250, D: 0.7821, G:1.807
Iter: 1500, D: 1.216, G:0.7218
Iter: 1750, D: 1.143, G:1.092
As the results show, the DCGAN produces clearer images and also converges faster.
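The first logged line (Iter: 0, D: 1.387, G:0.6381) is close to what a standard binary cross-entropy GAN loss gives when the discriminator is still undecided and outputs logits near zero. The sketch below works through that arithmetic in plain Python. It assumes `discriminator_loss` and `generator_loss` are cross-entropy losses on raw logits, which this section does not show; `d_loss_sketch` and `g_loss_sketch` are hypothetical stand-ins, not the tutorial's definitions.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy on a raw logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def d_loss_sketch(logits_real, logits_fake):
    """Assumed discriminator loss: real images labeled 1, fake images labeled 0."""
    real = sum(bce_with_logits(l, 1.0) for l in logits_real) / len(logits_real)
    fake = sum(bce_with_logits(l, 0.0) for l in logits_fake) / len(logits_fake)
    return real + fake


def g_loss_sketch(logits_fake):
    """Assumed generator loss: fake images should be scored as real (label 1)."""
    return sum(bce_with_logits(l, 1.0) for l in logits_fake) / len(logits_fake)


# An untrained discriminator that outputs logit 0 (probability 0.5) everywhere.
d = d_loss_sketch([0.0, 0.0], [0.0, 0.0])
g = g_loss_sketch([0.0, 0.0])
print(round(d, 4), round(g, 4))  # prints: 1.3863 0.6931
```

Under this assumption the starting values are 2 ln 2 for the discriminator and ln 2 for the generator, so movement away from them in later iterations reflects the two networks starting to learn.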