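Preface

A follower recently asked me three questions about deep learning, which were effectively small homework exercises. After working through them I wrote this article to share the results. It suits beginners, or readers with some background who want practice. My answers are for reference only; questions and suggestions are welcome!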

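Problem 1

Design a multilayer perceptron (MLP) network.

1. The network performs the following operations:

- Flatten the 32×32 input image to 1×1024;

- Pass the flattened data to the first hidden layer, a fully connected layer with 2048 hidden units and a Sigmoid activation function;

- Pass the output of the first hidden layer to the second hidden layer, a fully connected layer with 512 hidden units and a ReLU activation function;

- Pass the output of the second hidden layer to the last layer, also a fully connected layer, which outputs 20-dimensional features and uses no activation function.

2. The fully connected layers are initialized as follows: the weights follow a uniform distribution on the interval [0,1], and the biases are the constant 0.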

Required final output:

# The output of the code block must match the result below:
"""
Flatten output shape:    torch.Size([1, 1024])
Linear output shape:   torch.Size([1, 2048])
   Linear weight's mean:   tensor(0.8631)
   Linear bias's mean:   tensor(0.)
Sigmoid output shape:    torch.Size([1, 2048])
Linear output shape:   torch.Size([1, 512])
   Linear weight's mean:   tensor(0.0675)
   Linear bias's mean:   tensor(0.)
ReLU output shape:   torch.Size([1, 512])
Linear output shape:   torch.Size([1, 20])
   Linear weight's mean:   tensor(0.2539)
   Linear bias's mean:   tensor(0.)
"""

Reference answer:

# Import torch and the related libraries
import torch
from torch import nn
# Helper that prints size information for each layer: print_layer_info()
def print_layer_info(net,X):
    for layer in net:
        X = layer(X)
        print(layer.__class__.__name__,'output shape: \t',X.shape)
        if type(layer) == nn.Linear:
            # weight[0][0] and bias[0] are single elements, so the printed
            # "mean" is simply the first weight and the first bias of the layer
            print('\t',layer.__class__.__name__,'weight\'s mean: \t',torch.mean(layer.weight[0][0].data))
            print('\t',layer.__class__.__name__,'bias\'s mean: \t',torch.mean(layer.bias[0].data))
# The network you design, named net
net = nn.Sequential(
        # Flatten the 32×32 input image to 1×1024, then pass it to the first hidden layer:
        # a fully connected layer with 2048 hidden units and a Sigmoid activation
        nn.Flatten(),
        nn.Linear(in_features=1024, out_features=2048),
        nn.Sigmoid(),
        # Second hidden layer: fully connected, 512 hidden units, ReLU activation
        nn.Linear(in_features=2048, out_features=512),
        nn.ReLU(),
        # Last layer: fully connected, 20-dimensional output, no activation
        nn.Linear(in_features=512, out_features=20)
)
# Initialize the network weights as required
for layer in net:
    if isinstance(layer, nn.Linear):
        nn.init.uniform_(layer.weight, 0, 1)  # weights ~ U[0, 1]
        nn.init.constant_(layer.bias, 0)      # biases = 0
# Test whether net is defined as required
X = torch.rand(size=(1, 1, 32, 32), dtype=torch.float32)
print_layer_info(net,X)

Output:

Flatten output shape:    torch.Size([1, 1024])
Linear output shape:   torch.Size([1, 2048])
   Linear weight's mean:   tensor(0.8667)
   Linear bias's mean:   tensor(0.)
Sigmoid output shape:    torch.Size([1, 2048])
Linear output shape:   torch.Size([1, 512])
   Linear weight's mean:   tensor(0.9883)
   Linear bias's mean:   tensor(0.)
ReLU output shape:   torch.Size([1, 512])
Linear output shape:   torch.Size([1, 20])
   Linear weight's mean:   tensor(0.7355)
   Linear bias's mean:   tensor(0.)

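Note: the weights are drawn at random without a fixed seed, and the helper prints a single weight (weight[0][0]) rather than the mean over the whole layer, so these values cannot be expected to match the required output exactly. The output shapes and the zero biases do match.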
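To illustrate the note above, here is a minimal sketch that fixes the random seed and compares the single element printed by print_layer_info with the mean over all weights of one layer. The seed value 0 is an arbitrary choice for illustration and is not part of the assignment; the variable names are likewise made up here.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, assumed only to make runs repeatable

layer = nn.Linear(in_features=1024, out_features=2048)
nn.init.uniform_(layer.weight, 0, 1)  # weights ~ U[0, 1]
nn.init.constant_(layer.bias, 0)      # biases = 0

first_element = layer.weight[0][0].item()  # the value print_layer_info reports
full_mean = layer.weight.mean().item()     # mean over all 2048 * 1024 weights

print(f"weight[0][0]:        {first_element:.4f}")
print(f"mean of all weights: {full_mean:.4f}")
```

With about two million samples from U[0, 1], the full mean should land very close to 0.5, while weight[0][0] can be any value in the interval.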

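Problem 2

Design a convolutional neural network (CNN).

1. The network performs the following operations:

- Layer 1: use 96 convolution kernels of size 11×11 with stride 4 and padding 2 to map the 3×224×224 input image to a 96×55×55 output, with ReLU as the activation function;

- Layer 2: a max-pooling layer of size 3×3 with stride 2 and no padding, mapping the 96×55×55 input to 96×27×27;

- Layer 3: use 256 convolution kernels of size 5×5 with stride 1 and padding 2 to map the 96×27×27 input to a 256×27×27 output, with ReLU as the activation function;

- Layer 4: a max-pooling layer of size 3×3 with stride 2 and no padding, mapping the 256×27×27 input to 256×13×13;

- Layer 5: use 384 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 256×13×13 input to a 384×13×13 output, with ReLU as the activation function;

- Layer 6: use 256 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 384×13×13 input to a 256×13×13 output, with ReLU as the activation function;

- Layer 7: a max-pooling layer of size 3×3 with stride 1 and padding 1, mapping the 256×13×13 input to 256×13×13;

- Layer 8: flatten the output of the previous layer into a row vector;

- Layer 9: a fully connected layer that maps the flattened vector to a 4096-dimensional vector, with ReLU as the activation function;

- Layer 10: a fully connected layer that maps the output of the previous layer to a 1000-dimensional vector.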
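The shapes in the list above follow from the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, s the stride and p the padding. The sketch below applies it layer by layer; `out_size` and `layers` are names invented for this illustration.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or max-pool layer (dilation 1, floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1

# (description, kernel, stride, padding) for layers 1-7 of the specification
layers = [
    ("conv 11x11", 11, 4, 2),
    ("pool 3x3",    3, 2, 0),
    ("conv 5x5",    5, 1, 2),
    ("pool 3x3",    3, 2, 0),
    ("conv 3x3",    3, 1, 1),
    ("conv 3x3",    3, 1, 1),
    ("pool 3x3",    3, 1, 1),
]

n = 224
sizes = []
for name, k, s, p in layers:
    n = out_size(n, k, s, p)
    sizes.append(n)
    print(f"{name}: {n}x{n}")

flat = 256 * n * n  # 256 channels after layer 7
print("flattened length:", flat)
```

The sizes come out as 55, 27, 27, 13, 13, 13, 13, and the flattened length is 256 × 13 × 13 = 43264, which is the input width the first fully connected layer needs.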

Required final output:

# The output of the code block must match the result below:
"""
Conv2d output shape:   torch.Size([1, 96, 55, 55])
ReLU output shape:   torch.Size([1, 96, 55, 55])
MaxPool2d output shape:    torch.Size([1, 96, 27, 27])
Conv2d output shape:   torch.Size([1, 256, 27, 27])
ReLU output shape:   torch.Size([1, 256, 27, 27])
MaxPool2d output shape:    torch.Size([1, 256, 13, 13])
Conv2d output shape:   torch.Size([1, 384, 13, 13])
ReLU output shape:   torch.Size([1, 384, 13, 13])
Conv2d output shape:   torch.Size([1, 256, 13, 13])
ReLU output shape:   torch.Size([1, 256, 13, 13])
MaxPool2d output shape:    torch.Size([1, 256, 13, 13])
Flatten output shape:    torch.Size([1, 43264])
Linear output shape:   torch.Size([1, 4096])
ReLU output shape:   torch.Size([1, 4096])
Linear output shape:   torch.Size([1, 1000])
"""

Reference answer:

import torch
import torch.nn as nn
# The network you design, named net
net = nn.Sequential(
    # layer 1
    nn.Conv2d(in_channels=3, out_channels=96, kernel_size=11, stride=4, padding=2),
    nn.ReLU(),
    # layer 2
    nn.MaxPool2d(kernel_size=3, stride=2),
    # layer 3
    nn.Conv2d(in_channels=96, out_channels=256, kernel_size=5, stride=1, padding=2),
    nn.ReLU(),
    # layer 4
    nn.MaxPool2d(kernel_size=3, stride=2),
    # layer 5
    nn.Conv2d(in_channels=256, out_channels=384, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    # layer 6
    nn.Conv2d(in_channels=384, out_channels=256, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
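    # layer 7 (stride 1 and padding 1 keep the 13×13 size, as the specification requires)
    nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
    # layer 8
    nn.Flatten(),
    # layer 9 (256 * 13 * 13 = 43264 input features)
    nn.Linear(in_features=43264, out_features=4096),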
    nn.ReLU(),
    # layer 10
    nn.Linear(in_features=4096, out_features=1000)
)
# Check whether the designed network net meets the requirements:
X = torch.randn(1, 3, 224, 224)
for layer in net:
    X = layer(X)
    print(layer.__class__.__name__,'output shape: \t',X.shape)
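The input width of the first fully connected layer depends on how the last pooling layer is configured, so it can help to measure it rather than compute it by hand. The sketch below runs a dummy tensor through the convolutional part only; `flattened_size` is a helper name invented here. It compares the pooling layer from the specification (stride 1, padding 1) with the classic AlexNet-style setting (stride 2, no padding).

```python
import torch
from torch import nn

def flattened_size(last_pool):
    """Return the flattened feature length for a given final pooling layer."""
    features = nn.Sequential(
        nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2), nn.ReLU(),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(),
        nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1), nn.ReLU(),
        last_pool,
        nn.Flatten(),
    )
    with torch.no_grad():  # shapes only, no gradients needed
        return features(torch.zeros(1, 3, 224, 224)).shape[1]

spec_dim = flattened_size(nn.MaxPool2d(kernel_size=3, stride=1, padding=1))
alexnet_dim = flattened_size(nn.MaxPool2d(kernel_size=3, stride=2))
print("stride 1, padding 1:", spec_dim)     # 256 * 13 * 13
print("stride 2, no padding:", alexnet_dim)  # 256 * 6 * 6
```

The first setting gives 43264 features and matches the required Flatten output; the second gives 9216.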

Problem 3

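nn.MaxPool2d() downsamples an image by taking the maximum over each window, which makes the image smaller. Its counterpart nn.MaxUnpool2d() upsamples an image, which makes the image larger.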
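To make the pairing concrete, here is a minimal sketch on a 4×4 tensor, following the pattern used in the PyTorch documentation. nn.MaxPool2d is created with return_indices=True so that it also returns the positions of the maxima; nn.MaxUnpool2d needs those indices to put each value back in place and fills every other position with zero. The variable names are illustrative only.

```python
import torch
from torch import nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.arange(1., 17.).reshape(1, 1, 4, 4)  # values 1..16 in a 4x4 grid

pooled, indices = pool(x)            # 2x2 result: [[6, 8], [14, 16]]
restored = unpool(pooled, indices)   # 4x4 again, zeros except at the maxima

print(pooled)
print(restored)
```

Unpooling restores the size but not the discarded values, so it is only an approximate inverse of max pooling.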

Look up the documentation for nn.MaxUnpool2d() and design a convolutional neural network that first downsamples and then upsamples (a U-Net).

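1. The network performs the following **downsampling** operations:

- Layer 1: use 8 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 1×224×224 input image to an 8×224×224 output, with ReLU as the activation function;

- Layer 2: a max-pooling layer of size 2×2 with stride 2 and no padding, mapping the 8×224×224 input to 8×112×112;

- Layer 3: use 16 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 8×112×112 input to a 16×112×112 output, with ReLU as the activation function;

- Layer 4: a max-pooling layer of size 2×2 with stride 2 and no padding, mapping the 16×112×112 input to 16×56×56;

- Layer 5: use 32 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 16×56×56 input to a 32×56×56 output, with ReLU as the activation function;

- Layer 6: a max-pooling layer of size 2×2 with stride 2 and padding 1, mapping the 32×56×56 input to 32×28×28;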

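2. The network then performs the following **upsampling** operations:

- Layer 7: a **max-unpooling layer** of size 2×2 with stride 2 and padding 1, mapping the 32×28×28 input to 32×56×56;

- Layer 8: use 16 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 32×56×56 input to a 16×56×56 output, with ReLU as the activation function;

- Layer 9: a **max-unpooling layer** of size 2×2 with stride 2 and padding 1, mapping the 16×56×56 input to 16×112×112;

- Layer 10: use 8 convolution kernels of size 3×3 with stride 1 and padding 1 to map the 16×112×112 input to an 8×112×112 output, with ReLU as the activation function;

- Layer 11: a **max-unpooling layer** of size 2×2 with stride 2 and padding 1, mapping the 8×112×112 input to 8×224×224;

- Layer 12: use 1 convolution kernel of size 3×3 with stride 1 and padding 1 to map the 8×224×224 input to a 1×224×224 output, with ReLU as the activation function.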
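The padding values stated for layers 6, 7, 9 and 11 can be checked against the stated shapes with the size formulas from the PyTorch documentation: floor((n + 2p - k) / s) + 1 for max pooling, and (n - 1)·s - 2p + k for max unpooling. The sketch below is plain arithmetic; `pool_out` and `unpool_out` are names invented for this illustration.

```python
def pool_out(n, kernel, stride, padding=0):
    """Output size of a max-pooling layer (dilation 1, floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1

def unpool_out(n, kernel, stride, padding=0):
    """Default output size of a max-unpooling layer."""
    return (n - 1) * stride - 2 * padding + kernel

# Layer 6: 2x2 pooling, stride 2, on a 56x56 input
print("pool,   padding 0:", pool_out(56, 2, 2, padding=0))
print("pool,   padding 1:", pool_out(56, 2, 2, padding=1))

# Layer 7: 2x2 unpooling, stride 2, on a 28x28 input
print("unpool, padding 0:", unpool_out(28, 2, 2, padding=0))
print("unpool, padding 1:", unpool_out(28, 2, 2, padding=1))
```

With padding 0 the sizes are 28 and 56, which agrees with the shapes in the list; with padding 1 they would be 29 and 54. The listed shapes are therefore the ones obtained with no padding on the pooling and unpooling layers.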

Required final output:

# The output of the code block must match the result below:
"""
Downsampling:
Conv2d output shape:   torch.Size([1, 8, 224, 224])
ReLU output shape:   torch.Size([1, 8, 224, 224])
MaxPool2d_Indices output shape:    torch.Size([1, 8, 112, 112])
Conv2d output shape:   torch.Size([1, 16, 112, 112])
ReLU output shape:   torch.Size([1, 16, 112, 112])
MaxPool2d_Indices output shape:    torch.Size([1, 16, 56, 56])
Conv2d output shape:   torch.Size([1, 32, 56, 56])
ReLU output shape:   torch.Size([1, 32, 56, 56])
MaxPool2d_Indices output shape:    torch.Size([1, 32, 28, 28])
Upsampling:
MaxUnpool2d_Indices output shape:    torch.Size([1, 32, 56, 56])
Conv2d output shape:   torch.Size([1, 16, 56, 56])
ReLU output shape:   torch.Size([1, 16, 56, 56])
MaxUnpool2d_Indices output shape:    torch.Size([1, 16, 112, 112])
Conv2d output shape:   torch.Size([1, 8, 112, 112])
ReLU output shape:   torch.Size([1, 8, 112, 112])
MaxUnpool2d_Indices output shape:    torch.Size([1, 8, 224, 224])
Conv2d output shape:   torch.Size([1, 1, 224, 224])
ReLU output shape:   torch.Size([1, 1, 224, 224])
"""

Reference answer:

import torch
import torch.nn as nn
# Define down_net to perform the required convolutions and downsampling
down_net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2)
)
# Define up_net to perform the required upsampling and convolutions.
# Note: this answer upsamples with nn.ConvTranspose2d (a learned upsampling layer)
# in place of nn.MaxUnpool2d, so the layer names differ from the required output.
up_net = nn.Sequential(
    nn.ConvTranspose2d(32, 32, kernel_size=2, stride=2),
    nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.ConvTranspose2d(16, 16, kernel_size=2, stride=2),
    nn.Conv2d(16, 8, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.ConvTranspose2d(8, 8, kernel_size=2, stride=2),
    nn.Conv2d(8, 1, kernel_size=3, stride=1, padding=1),
    nn.ReLU()
)
# Test whether down_net downsamples the image as required:
X = torch.randn(1, 1, 224, 224)  # test input
print('Downsampling:')
for layer in down_net:
    X = layer(X)
    print(layer.__class__.__name__,'output shape: \t',X.shape)
# Test whether up_net upsamples the image as required:
print('Upsampling:')
for layer in up_net:
    X = layer(X)
    print(layer.__class__.__name__,'output shape: \t',X.shape)

Output:

Downsampling:
Conv2d output shape:   torch.Size([1, 8, 224, 224])
ReLU output shape:   torch.Size([1, 8, 224, 224])
MaxPool2d output shape:    torch.Size([1, 8, 112, 112])
Conv2d output shape:   torch.Size([1, 16, 112, 112])
ReLU output shape:   torch.Size([1, 16, 112, 112])
MaxPool2d output shape:    torch.Size([1, 16, 56, 56])
Conv2d output shape:   torch.Size([1, 32, 56, 56])
ReLU output shape:   torch.Size([1, 32, 56, 56])
MaxPool2d output shape:    torch.Size([1, 32, 28, 28])
Upsampling:
ConvTranspose2d output shape:    torch.Size([1, 32, 56, 56])
Conv2d output shape:   torch.Size([1, 16, 56, 56])
ReLU output shape:   torch.Size([1, 16, 56, 56])
ConvTranspose2d output shape:    torch.Size([1, 16, 112, 112])
Conv2d output shape:   torch.Size([1, 8, 112, 112])
ReLU output shape:   torch.Size([1, 8, 112, 112])
ConvTranspose2d output shape:    torch.Size([1, 8, 224, 224])
Conv2d output shape:   torch.Size([1, 1, 224, 224])
ReLU output shape:   torch.Size([1, 1, 224, 224])
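The reference answer above upsamples with nn.ConvTranspose2d, whereas the exercise asks for nn.MaxUnpool2d. Because unpooling needs the indices produced by the matching pooling step, the layers cannot simply be chained in nn.Sequential. The following is one possible sketch, not the official solution: the class name `UnpoolNet` and its attribute names are hypothetical, and pooling and unpooling use no padding so that the shapes match the required output.

```python
import torch
from torch import nn

class UnpoolNet(nn.Module):  # hypothetical name, chosen for this sketch
    def __init__(self):
        super().__init__()
        self.down_convs = nn.ModuleList([
            nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
        ])
        self.up_convs = nn.ModuleList([
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(16, 8, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(8, 1, kernel_size=3, stride=1, padding=1),
        ])
        # Pooling and unpooling have no parameters, so one instance of each is reused
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)
        self.relu = nn.ReLU()

    def forward(self, x):
        saved_indices = []
        for conv in self.down_convs:          # conv -> ReLU -> pool
            x = self.relu(conv(x))
            x, indices = self.pool(x)
            saved_indices.append(indices)
        bottleneck_shape = x.shape
        for conv in self.up_convs:            # unpool -> conv -> ReLU
            # Use the indices of the matching pooling step, last saved first
            x = self.unpool(x, saved_indices.pop())
            x = self.relu(conv(x))
        return x, bottleneck_shape

unpool_net = UnpoolNet()
with torch.no_grad():
    out, bottleneck_shape = unpool_net(torch.randn(1, 1, 224, 224))
print("bottleneck shape:", bottleneck_shape)
print("output shape:", out.shape)
```

Each unpooling step receives a tensor with the same channel count and spatial size as the indices saved by the corresponding pooling step (32×28×28, then 16×56×56, then 8×112×112), which is what nn.MaxUnpool2d requires.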
