1. Convolution Layers
1.1 nn.Conv2d
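Purpose: applies a 2D convolution over an input made up of several 2D signal planes (channels)
Main parameters:
• in_channels: number of input channels
• out_channels: number of output channels, equal to the number of convolution kernels
• kernel_size: size of the convolution kernel
• stride: step size of the sliding window
• padding: amount of padding added to each side of the input
• dilation: spacing between kernel elements (dilated, or atrous, convolution)
• groups: number of groups for grouped convolution
• bias: whether to add a learnable bias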
nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
Example:
import torch
import torch.nn as nn

# Square kernel, equal stride
m = nn.Conv2d(16, 33, 3, stride=2)
# Non-square kernel, unequal stride, with padding
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
# Non-square kernel, unequal stride, with padding and dilation
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
input = torch.randn(20, 16, 50, 100)  # (batch, channels, height, width)
output = m(input)
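The stride, padding, and dilation settings together determine the spatial size of the output. The following is a minimal plain-Python sketch of the usual output-size formula for one spatial dimension. The helper name `conv2d_output_size` is hypothetical (it is not part of PyTorch), and the sketch assumes integer arguments applied to one dimension at a time.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Hypothetical helper for illustration; not a PyTorch function.
    """
    # A dilated kernel covers dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Floor division counts how many full strides fit into the padded input.
    return (size + 2 * padding - effective_kernel) // stride + 1


# Input of height 50 and width 100 with the three layer configurations above.
plain = (conv2d_output_size(50, 3, stride=2),
         conv2d_output_size(100, 3, stride=2))
padded = (conv2d_output_size(50, 3, stride=2, padding=4),
          conv2d_output_size(100, 5, stride=1, padding=2))
dilated = (conv2d_output_size(50, 3, stride=2, padding=4, dilation=3),
           conv2d_output_size(100, 5, stride=1, padding=2, dilation=1))
print(plain, padded, dilated)
```

Under these assumptions, the last configuration maps a 50 x 100 input to a 26 x 100 output, so the full output tensor would have shape (20, 33, 26, 100).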
1.2 nn.ConvTranspose2d (Transposed Convolution)
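Purpose: upsamples the input by means of a transposed convolution
Main parameters:
• in_channels: number of input channels
• out_channels: number of output channels
• kernel_size: size of the convolution kernel
• stride: step size
• padding: amount of padding
• dilation: spacing between kernel elements
• groups: number of groups for grouped convolution
• bias: whether to add a learnable bias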
nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros')
Example:
import torch
import torch.nn as nn

# Square kernel, equal stride
m = nn.ConvTranspose2d(16, 33, 3, stride=2)
# Non-square kernel, unequal stride, with padding
m = nn.ConvTranspose2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
input = torch.randn(20, 16, 50, 100)
output = m(input)

# The exact output size can also be passed as an argument
input = torch.randn(1, 16, 12, 12)
downsample = nn.Conv2d(16, 16, 3, stride=2, padding=1)
upsample = nn.ConvTranspose2d(16, 16, 3, stride=2, padding=1)
h = downsample(input)
print(h.size())       # torch.Size([1, 16, 6, 6])
output = upsample(h, output_size=input.size())
print(output.size())  # torch.Size([1, 16, 12, 12])
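The reason `output_size` is useful can be seen from the size arithmetic: with a stride greater than 1, several input sizes collapse to the same downsampled size, so the inverse is ambiguous. The plain-Python sketch below illustrates this with two hypothetical helpers (`conv2d_output_size` and `conv_transpose2d_output_size` are names chosen here, not PyTorch functions), each applied to one spatial dimension.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one dimension (illustrative helper)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_transpose2d_output_size(size, kernel_size, stride=1, padding=0,
                                 output_padding=0, dilation=1):
    """Output length of a transposed convolution along one dimension."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Inputs of length 11 and 12 both shrink to 6 with kernel 3, stride 2, padding 1.
down_from_11 = conv2d_output_size(11, 3, stride=2, padding=1)
down_from_12 = conv2d_output_size(12, 3, stride=2, padding=1)

# Going back up from 6, output_padding selects which of the two sizes is restored.
up_default = conv_transpose2d_output_size(6, 3, stride=2, padding=1)
up_padded = conv_transpose2d_output_size(6, 3, stride=2, padding=1, output_padding=1)
print(down_from_11, down_from_12, up_default, up_padded)
```

In this sketch, requesting a length of 12 corresponds to an extra `output_padding` of 1; passing `output_size` to the layer is a way of asking for that size without working out the value by hand.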
2. Pooling Layers
2.1 nn.MaxPool2d
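Purpose: applies max pooling over a 2D signal (image)
Main parameters:
• kernel_size: size of the pooling window
• stride: step size of the window (defaults to kernel_size)
• padding: amount of padding added to each side
• dilation: spacing between elements within the pooling window
• ceil_mode: round the output size up instead of down
• return_indices: also return the indices of the maximum values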
nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Example:
import torch
import torch.nn as nn

# Square pooling window of size 3, stride 2
m = nn.MaxPool2d(3, stride=2)
# Non-square pooling window
m = nn.MaxPool2d((3, 2), stride=(2, 1))
input = torch.randn(20, 16, 50, 32)
output = m(input)
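To make the effect of `ceil_mode` concrete, here is a small plain-Python sketch of the pooling output-size calculation for one spatial dimension. The helper `pool2d_output_size` is a hypothetical name, and the adjustment in ceil mode (dropping a last window that would start beyond the input and its leading padding) reflects my understanding of PyTorch's behaviour rather than a quoted specification.

```python
import math


def pool2d_output_size(size, kernel_size, stride=None, padding=0,
                       dilation=1, ceil_mode=False):
    """Output length of a pooling layer along one dimension (illustrative)."""
    # As in nn.MaxPool2d, a missing stride defaults to the kernel size.
    stride = kernel_size if stride is None else stride
    span = size + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        out = math.ceil(span / stride) + 1
        # Assumption: a last window that starts past the input plus its
        # leading padding is discarded.
        if (out - 1) * stride >= size + padding:
            out -= 1
    else:
        out = span // stride + 1
    return out


# Height 50 with a window of 3 and stride 2, as in the example above.
floor_height = pool2d_output_size(50, 3, stride=2)
ceil_height = pool2d_output_size(50, 3, stride=2, ceil_mode=True)
# Width 32 with a window of 2 and stride 1.
width = pool2d_output_size(32, 2, stride=1)
print(floor_height, ceil_height, width)
```

With the default floor rounding, the last partial window along the height is dropped; with `ceil_mode=True` it is kept, which adds one output row in this case.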
2.2 nn.AvgPool2d
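Purpose: applies average pooling over a 2D signal (image)
Main parameters:
• kernel_size: size of the pooling window
• stride: step size of the window (defaults to kernel_size)
• padding: amount of padding added to each side
• ceil_mode: round the output size up instead of down
• count_include_pad: whether the padded zeros are counted when averaging
• divisor_override: if set, used as the divisor in place of the window size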
nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
Example:
import torch
import torch.nn as nn

# Square pooling window of size 3, stride 2
m = nn.AvgPool2d(3, stride=2)
# Non-square pooling window
m = nn.AvgPool2d((3, 2), stride=(2, 1))
input = torch.randn(20, 16, 50, 32)
output = m(input)
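The difference between `count_include_pad` and `divisor_override` is easiest to see on a tiny input. The following plain-Python sketch averages square windows over a single-channel image stored as nested lists. The function `avg_pool2d` here is a hypothetical reference implementation written for illustration (it assumes `ceil_mode=False` and a square kernel) and is not the PyTorch function of the same name.

```python
def avg_pool2d(x, kernel_size, stride=None, padding=0,
               count_include_pad=True, divisor_override=None):
    """Average pooling over a 2D list of numbers with zero padding (sketch)."""
    stride = kernel_size if stride is None else stride
    h, w = len(x), len(x[0])
    out_h = (h + 2 * padding - kernel_size) // stride + 1
    out_w = (w + 2 * padding - kernel_size) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride - padding, j * stride - padding
            total, real = 0.0, 0
            for r in range(top, top + kernel_size):
                for c in range(left, left + kernel_size):
                    # Positions outside the image are padding and contribute zero.
                    if 0 <= r < h and 0 <= c < w:
                        total += x[r][c]
                        real += 1
            if divisor_override is not None:
                divisor = divisor_override
            elif count_include_pad:
                divisor = kernel_size * kernel_size
            else:
                divisor = real
            row.append(total / divisor)
        out.append(row)
    return out


image = [[1.0, 2.0],
         [3.0, 4.0]]
with_pad = avg_pool2d(image, 2, stride=2, padding=1)
without_pad = avg_pool2d(image, 2, stride=2, padding=1, count_include_pad=False)
overridden = avg_pool2d(image, 2, stride=2, padding=1, divisor_override=2)
print(with_pad, without_pad, overridden)
```

Each window in this example contains one real pixel and three padded zeros, so counting the padding divides by 4, excluding it divides by 1, and the override divides by the value supplied.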
2.3 nn.MaxUnpool2d
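Purpose: upsamples a 2D signal (image) by max unpooling, a partial inverse of nn.MaxPool2d: each value is written back to the position recorded in indices, and every other position is set to zero
Main parameters:
• kernel_size: size of the pooling window
• stride: step size of the window
• padding: amount of padding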
nn.MaxUnpool2d(kernel_size, stride=None, padding=0)
forward(self, input, indices, output_size=None)
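Example:
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)
# The source shows only the first row of each tensor; the remaining rows
# below are assumed to continue the same counting sequence.
input = torch.tensor([[[[ 1.,  2.,  3.,  4.],
                        [ 5.,  6.,  7.,  8.],
                        [ 9., 10., 11., 12.],
                        [13., 14., 15., 16.]]]])
output, indices = pool(input)
unpool(output, indices)

# Now using output_size to resolve an ambiguous size for the inverse
input = torch.tensor([[[[ 1.,  2.,  3.,  4.,  5.],
                        [ 6.,  7.,  8.,  9., 10.],
                        [11., 12., 13., 14., 15.],
                        [16., 17., 18., 19., 20.]]]])
output, indices = pool(input)
# This call will not work without specifying output_size
unpool(output, indices, output_size=input.size())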
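How the indices tie pooling and unpooling together can be sketched in plain Python on a single-channel image. Both functions below are hypothetical reference implementations written for illustration (no padding, square window), and the indices are flat positions into the row-major input, which is an assumption modelled on how the PyTorch layers behave.

```python
def max_pool2d_with_indices(x, kernel_size, stride):
    """Max pooling that also records the flat index of each maximum (sketch)."""
    h, w = len(x), len(x[0])
    out_h = (h - kernel_size) // stride + 1
    out_w = (w - kernel_size) // stride + 1
    values, indices = [], []
    for i in range(out_h):
        value_row, index_row = [], []
        for j in range(out_w):
            best_value, best_index = None, None
            for r in range(i * stride, i * stride + kernel_size):
                for c in range(j * stride, j * stride + kernel_size):
                    if best_value is None or x[r][c] > best_value:
                        best_value, best_index = x[r][c], r * w + c
            value_row.append(best_value)
            index_row.append(best_index)
        values.append(value_row)
        indices.append(index_row)
    return values, indices


def max_unpool2d(values, indices, output_size):
    """Scatter pooled values back to their recorded positions; zeros elsewhere."""
    out_h, out_w = output_size
    flat = [0.0] * (out_h * out_w)
    for value_row, index_row in zip(values, indices):
        for value, index in zip(value_row, index_row):
            flat[index] = value
    return [flat[r * out_w:(r + 1) * out_w] for r in range(out_h)]


image = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
pooled, positions = max_pool2d_with_indices(image, 2, 2)
restored = max_unpool2d(pooled, positions, (4, 4))
print(pooled, positions)
print(restored)
```

Only the maxima survive the round trip; everything else comes back as zero. For an input that is 5 columns wide, the recorded flat indices refer to a 5-column layout, which is why the target size has to be supplied explicitly in that case.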
3. Linear Layer
3.1 nn.Linear
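Purpose: applies a linear combination to a 1D signal (vector)
Main parameters:
• in_features: number of input nodes
• out_features: number of output nodes
• bias: whether to add a bias term
Formula: $y = xW^{T} + \mathit{bias}$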
nn.Linear(in_features, out_features, bias=True)
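Example:
import torch
import torch.nn as nn

m = nn.Linear(20, 30)
input = torch.randn(128, 20)  # batch of 128 vectors with 20 features each
output = m(input)
print(output.size())  # torch.Size([128, 30])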
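The formula above can be spelled out in a few lines of plain Python. This is a minimal sketch, not the PyTorch implementation: the function `linear` is a hypothetical helper, and it assumes the weight is stored with shape (out_features, in_features), so each weight row produces one output value.

```python
def linear(x, weight, bias=None):
    """Compute y = x W^T + bias for a batch of row vectors (sketch)."""
    out = []
    for row in x:
        # One dot product per weight row gives one output feature.
        y = [sum(xi * wi for xi, wi in zip(row, weight_row))
             for weight_row in weight]
        if bias is not None:
            y = [yi + bi for yi, bi in zip(y, bias)]
        out.append(y)
    return out


x = [[1.0, 2.0]]                    # batch of 1, in_features = 2
weight = [[1.0, 0.0],
          [0.0, 1.0],
          [1.0, 1.0]]               # out_features = 3, in_features = 2
bias = [0.5, 0.5, 0.5]
y = linear(x, weight, bias)
print(y)

# Learnable parameter count for a layer with 20 inputs and 30 outputs.
num_parameters = 20 * 30 + 30
print(num_parameters)
```

Under this layout, a layer with 20 inputs and 30 outputs holds 600 weights plus 30 biases.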
4. Activation Layers
4.1 nn.Sigmoid
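Formula: $\mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}$
Graph: an S-shaped curve that rises from 0 to 1 and passes through 0.5 at x = 0
Example:
import torch
import torch.nn as nn

m = nn.Sigmoid()
input = torch.randn(2)
output = m(input)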
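For reference, the sigmoid can be written directly with the standard library. The sketch below is a hypothetical scalar helper (not PyTorch code) that branches on the sign of the input so that the exponential never overflows for large negative values.

```python
import math


def sigmoid(x):
    """Scalar sigmoid 1 / (1 + exp(-x)), written to avoid overflow (sketch)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, so this algebraically equal form is safe.
    z = math.exp(x)
    return z / (1.0 + z)


print(sigmoid(0.0))                   # midpoint of the curve
print(sigmoid(2.0) + sigmoid(-2.0))   # symmetry: sigmoid(x) + sigmoid(-x) = 1
print(sigmoid(-1000.0), sigmoid(1000.0))
```

The outputs always lie between 0 and 1, which is why the function is often used to produce probabilities.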
4.2 nn.Tanh
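Formula: $\mathrm{Tanh}(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
Graph: an S-shaped curve that rises from -1 to 1 and passes through the origin
Example:
import torch
import torch.nn as nn

m = nn.Tanh()
input = torch.randn(2)
output = m(input)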
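Tanh is closely related to the sigmoid: it is a sigmoid rescaled to the range (-1, 1) and centred on zero, via the identity tanh(x) = 2 * sigmoid(2x) - 1. The plain-Python sketch below checks this identity against `math.tanh`; both helper functions are hypothetical names used only for illustration.

```python
import math


def sigmoid(x):
    """Scalar sigmoid, written to avoid overflow for negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh_from_sigmoid(x):
    """Tanh expressed as a rescaled, zero-centred sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


samples = [-3.0, -0.5, 0.0, 0.5, 3.0]
largest_gap = max(abs(tanh_from_sigmoid(x) - math.tanh(x)) for x in samples)
print(largest_gap)
```

The gap between the two computations stays at the level of floating-point rounding for these sample points.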
4.3 nn.ReLU
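Formula: $\mathrm{ReLU}(x) = \max(0, x)$
Graph: zero for negative inputs and the identity line for non-negative inputs
Example:
import torch
import torch.nn as nn

m = nn.ReLU()
input = torch.randn(2)
output = m(input)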
4.4 nn.LeakyReLU
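Formula: $\mathrm{LeakyReLU}(x) = \max(0, x) + \mathit{negative\_slope} \cdot \min(0, x)$
Example:
import torch
import torch.nn as nn

m = nn.LeakyReLU(0.1)  # negative_slope = 0.1
input = torch.randn(2)
output = m(input)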
4.5 nn.ELU
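Formula: $\mathrm{ELU}(x) = x$ if $x > 0$, and $\alpha\,(e^{x} - 1)$ if $x \le 0$
Graph: the identity line for positive inputs and a smooth curve that saturates toward $-\alpha$ for negative inputs
Example:
import torch
import torch.nn as nn

m = nn.ELU()
input = torch.randn(2)
output = m(input)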
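ReLU, LeakyReLU, and ELU agree on positive inputs and differ only in how they treat negative ones. The following plain-Python sketch puts the three side by side. The scalar helpers are hypothetical names written for illustration; their default values for `negative_slope` (0.01) and `alpha` (1.0) mirror the defaults of the corresponding PyTorch modules.

```python
import math


def relu(x):
    """Zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Keeps a small linear slope for negative inputs."""
    return x if x >= 0 else negative_slope * x


def elu(x, alpha=1.0):
    """Smoothly saturates toward -alpha for negative inputs."""
    # expm1(x) computes exp(x) - 1 accurately for x close to zero.
    return x if x > 0 else alpha * math.expm1(x)


for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), leaky_relu(x, 0.1), elu(x))
```

For a negative input such as -2, ReLU returns zero, LeakyReLU with a slope of 0.1 returns a small negative value, and ELU returns a value between -1 and 0.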