AlexNet is explained in detail in my earlier blog post:
https://blog.csdn.net/muye_IT/article/details/123602605?spm=1001.2014.3001.5501
The code is on GitHub (a Star would be appreciated!):
https://github.com/Jasper0420/Deep-Learning-Practice-AlexNet
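1. Dataset
Download the flower classification dataset (flower_data): http://download.tensorflow.org/example_images/flower_photos.tgz (copy the link into your browser)
- flower_photos (the extracted dataset folder, 3670 samples)
- train (the generated training set, 3306 samples)
- val (the generated validation set, 364 samples)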
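How do we split the dataset into a training set and a validation set?
The steps are:
(1) Create a new folder "flower_data" under the data_set folder
(2) Download the flower classification dataset from http://download.tensorflow.org/example_images/flower_photos.tgz (copy the link into your browser)
(3) Extract the dataset into the flower_data folder
(4) Run the "split_data.py" script to split the dataset into the training set train and the validation set val automatically
The code of split_data.py is shown below. Its paths are relative, so run it from the data_set folder. To use it on your own dataset, only the file paths in the code need to change.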
    import os
    import random
    from shutil import copy


    def mkfile(file):
        # Create the folder if it does not exist yet
        if not os.path.exists(file):
            os.makedirs(file)


    # Names of all sub-folders of flower_photos, skipping .txt files (the 5 flower classes)
    file_path = 'flower_data/flower_photos'
    flower_class = [cla for cla in os.listdir(file_path) if ".txt" not in cla]

    # Create the train folder with one sub-folder per class
    mkfile('flower_data/train')
    for cla in flower_class:
        mkfile('flower_data/train/' + cla)

    # Create the val folder with one sub-folder per class
    mkfile('flower_data/val')
    for cla in flower_class:
        mkfile('flower_data/val/' + cla)

    # Split ratio, train : val = 9 : 1
    split_rate = 0.1

    # Walk through all images of the 5 classes and split them by the ratio
    for cla in flower_class:
        cla_path = file_path + '/' + cla + '/'  # sub-folder of one flower class
        images = os.listdir(cla_path)           # names of all images in that folder
        num = len(images)
        # Randomly pick k image names for the validation set (a set makes lookups fast)
        eval_index = set(random.sample(images, k=int(num * split_rate)))
        for index, image in enumerate(images):
            image_path = cla_path + image
            if image in eval_index:
                new_path = 'flower_data/val/' + cla    # selected images go to val
            else:
                new_path = 'flower_data/train/' + cla  # all other images go to train
            copy(image_path, new_path)
            print("\r[{}] processing [{}/{}]".format(cla, index + 1, num), end="")  # progress bar
        print()

    print("processing done!")
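The 3306 / 364 figures quoted earlier can be checked by hand, because the script takes int(0.1 × n) validation images from each class separately rather than from the dataset as a whole. The sketch below repeats that arithmetic. The per-class image counts are an assumption (the figures commonly reported for flower_photos.tgz), not something stated in this post.

```python
# Per-class image counts: ASSUMED values commonly reported for flower_photos.tgz.
class_counts = {
    "daisy": 633,
    "dandelion": 898,
    "roses": 641,
    "sunflowers": 699,
    "tulips": 799,
}
split_rate = 0.1  # same ratio as in split_data.py

# split_data.py truncates per class: k = int(num * split_rate)
val_counts = {name: int(n * split_rate) for name, n in class_counts.items()}

total = sum(class_counts.values())
val_total = sum(val_counts.values())
train_total = total - val_total

print("total:", total, "train:", train_total, "val:", val_total)
```

Because of the per-class truncation, the validation set is slightly smaller than 10% of the whole dataset.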
2. Introduction to AlexNet
AlexNet is explained in detail in my earlier blog post: https://blog.csdn.net/muye_IT/article/details/123602605?spm=1001.2014.3001.5501
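AlexNet deepens the network structure of LeNet so that it can learn richer, higher-dimensional image features. Its main characteristics:
1. It uses a convolutional neural network structure of stacked convolutional layers followed by fully connected layers.
2. It was among the first large convolutional networks to use ReLU as the activation function.
3. It was among the first to apply Dropout regularization to control overfitting.
4. It uses mini-batch gradient descent with momentum to speed up convergence during training.
5. It uses data augmentation to strongly suppress overfitting during training.
6. It exploits the parallel computing power of GPUs to speed up training and inference.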
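AlexNet contains 5 convolutional layers, 3 pooling layers, 2 local response normalization (LRN) layers, and 3 fully connected layers.
How the layers are counted:
AlexNet has 8 layers in total: 5 convolutional layers (CONV1-CONV5) and 3 fully connected layers (FC6-FC8)
- Only convolutional and fully connected layers are counted;
- Pooling and normalization layers post-process the feature maps output by the preceding convolutional layer, so they are not counted as separate layers.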
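The counting rule can be written out as a small sketch. The list below is an illustrative description of the layer order in the original AlexNet (plain strings, not PyTorch modules), and the names `layers` and `depth` are only used for this example.

```python
from collections import Counter

# Layer types of the original AlexNet in order (illustrative spec, not real modules).
layers = [
    "conv", "lrn", "pool",   # CONV1 + LRN + max pooling
    "conv", "lrn", "pool",   # CONV2 + LRN + max pooling
    "conv", "conv", "conv",  # CONV3, CONV4, CONV5
    "pool",                  # max pooling after CONV5
    "fc", "fc", "fc",        # FC6, FC7, FC8
]

counts = Counter(layers)

# Only layers with learnable weights (conv and fc) count toward the depth.
depth = counts["conv"] + counts["fc"]

print(dict(counts))
print("depth:", depth)
```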
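3. model.py implementation
Note:
The original paper trained on two GPUs. My computer has only one GPU, so the code uses only half of the network parameters (half of the channels in each layer), which corresponds to the lower half of the network structure in the paper. The LRN layers are also left out. When I ran the full network as well, its accuracy was almost the same as with half the parameters.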
    import torch
    import torch.nn as nn


    class AlexNet(nn.Module):
        def __init__(self, num_classes=1000, init_weights=False):
            super().__init__()
            # nn.Sequential packs the layers into one module and keeps the code compact
            self.features = nn.Sequential(  # convolutional layers extract image features
                nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # input[3, 224, 224] output[48, 55, 55]
                nn.ReLU(inplace=True),                                  # in-place: overwrites the input to save memory
                nn.MaxPool2d(kernel_size=3, stride=2),                  # output[48, 27, 27]
                nn.Conv2d(48, 128, kernel_size=5, padding=2),           # output[128, 27, 27]
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 13, 13]
                nn.Conv2d(128, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
                nn.ReLU(inplace=True),
                nn.Conv2d(192, 128, kernel_size=3, padding=1),          # output[128, 13, 13]
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 6, 6]
            )
            self.classifier = nn.Sequential(  # fully connected layers classify the image
                nn.Dropout(p=0.5),            # randomly deactivates neurons; 0.5 is the default ratio
                nn.Linear(128 * 6 * 6, 2048),
                nn.ReLU(inplace=True),
                nn.Dropout(p=0.5),
                nn.Linear(2048, 2048),
                nn.ReLU(inplace=True),
                nn.Linear(2048, num_classes),
            )
            if init_weights:
                self._initialize_weights()

        # Forward pass
        def forward(self, x):
            x = self.features(x)
            x = torch.flatten(x, start_dim=1)  # flatten before the fully connected layers
            x = self.classifier(x)
            return x

        # Weight initialization; PyTorch already initializes the weights by default when the layers are built
        def _initialize_weights(self):
            for m in self.modules():
                if isinstance(m, nn.Conv2d):    # convolutional layer
                    # Kaiming (He) normal initialization
                    nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                    if m.bias is not None:
                        nn.init.constant_(m.bias, 0)    # set the bias to 0
                elif isinstance(m, nn.Linear):  # fully connected layer
                    nn.init.normal_(m.weight, 0, 0.01)  # normal distribution
                    nn.init.constant_(m.bias, 0)        # set the bias to 0
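The output sizes in the comments of model.py follow from the usual output-size formula for convolution and pooling, floor((size + 2 × padding - kernel) / stride) + 1. The sketch below traces a 224 × 224 input through the feature extractor in plain Python, as a quick way to check where the `128 * 6 * 6` input size of the first fully connected layer comes from. The helper `out_size` and the `layers` table are illustrative names, not part of the original code.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv / max-pool layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding), copied from the layers in model.py
layers = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]

size = 224
sizes = []
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f"{name}: {size} x {size}")

# The last conv layer has 128 channels, so the flattened feature length is:
flat_features = 128 * size * size
print("flattened features:", flat_features)
```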
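4. train.py implementation
train.py loads the dataset and trains the network: it computes the loss on the training set and the accuracy on the validation set, then saves the trained network parameters
4.1 Importing the required packages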
    import os
    import sys
    import json
    import time  # used to time each training epoch

    import torch
    import torch.nn as nn
    import torch.optim as optim
    import matplotlib.pyplot as plt
    import numpy as np
    from torchvision import transforms, datasets, utils
    from tqdm import tqdm

    from model import AlexNet
4.2 Data preprocessing
    data_transform = {
        "train": transforms.Compose([
            transforms.RandomResizedCrop(224),       # random crop, then resize to 224x224
            transforms.RandomHorizontalFlip(p=0.5),  # flip horizontally with probability 0.5
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([
            transforms.Resize((224, 224)),           # cannot 224, must (224, 224)
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}
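`ToTensor()` scales pixel values to [0, 1], and `Normalize` with mean 0.5 and std 0.5 then computes (x - 0.5) / 0.5 per channel, which maps that range to [-1, 1]. The plain-Python sketch below shows the mapping and its inverse on single values. The function names are illustrative only.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-value version of transforms.Normalize: (x - mean) / std."""
    return (x - mean) / std


def denormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying an image: y * std + mean."""
    return y * std + mean


for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

With mean = std = 0.5, the inverse simplifies to `img / 2 + 0.5`, which is the expression to use when un-normalizing a tensor for plotting.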
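4.3 Loading the training set
The flower classification dataset is not part of PyTorch's torchvision.datasets, so we cannot load it the way the previous LeNet post loaded CIFAR-10 with torchvision.datasets.CIFAR10 and torch.utils.data.DataLoader(). Instead we use datasets.ImageFolder(). It returns a dataset object whose items are (image, label) tuples covering every image in the dataset. The object supports indexing and iteration and can be passed to torch.utils.data.DataLoader. For details, see the post "pytorch ImageFolder和Dataloader加载自制图像数据集" (loading a custom image dataset with ImageFolder and DataLoader).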
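`ImageFolder` expects one sub-folder per class and assigns label indices by sorting the folder names. The sketch below imitates that naming rule with the standard library only, on a temporary directory. `find_classes` here is a simplified stand-in written for this example, not the torchvision implementation.

```python
import os
import tempfile


def find_classes(root):
    """Map each sub-folder name to an index, in sorted order (simplified stand-in)."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    # Create the five class folders in arbitrary order
    for name in ["tulips", "daisy", "roses", "sunflowers", "dandelion"]:
        os.makedirs(os.path.join(root, name))
    class_to_idx = find_classes(root)

print(class_to_idx)
```

Since the indices depend only on the folder names, the train and val folders get the same mapping as long as they contain the same class sub-folders.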
    # Locate the image dataset
    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)

    # Load the training set and apply the preprocessing
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # Store the index -> label mapping in a json file so predict.py can read it later
    # class -> index: {'daisy':0, 'dandelion':1, 'roses':2, 'sunflowers':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    # Swap the keys and values of flower_list
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # Write cla_dict to a json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 64
    nw = 0  # number of workers
    print('Using {} dataloader workers every process'.format(nw))

    # Load the training set in batches of batch_size
    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)
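One detail of the json step is easy to miss: JSON object keys are always strings, so the integer indices written to class_indices.json come back as "0", "1", and so on. That is why predict.py has to look up `class_indict[str(...)]`. A minimal round trip, using the class names of this dataset:

```python
import json

class_to_idx = {"daisy": 0, "dandelion": 1, "roses": 2, "sunflowers": 3, "tulips": 4}

# Invert the mapping: index -> class name (integer keys at this point)
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

# Serialize and parse again, as train.py and predict.py do through the file
json_str = json.dumps(idx_to_class, indent=4)
loaded = json.loads(json_str)

print(list(loaded.keys()))  # the keys are now strings

predicted_index = 4
print(loaded[str(predicted_index)])  # look up with a string key
```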
4.4 Loading the validation set
    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=4, shuffle=False,
                                                  num_workers=nw)
4.5 Training and validating the network
    # Pick the device (the complete script in 4.6 does this at the start of main())
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    net = AlexNet(num_classes=5, init_weights=True)  # 5 output classes, with weight initialization
    net.to(device)                                   # move the network to the chosen device (GPU/CPU)
    loss_function = nn.CrossEntropyLoss()            # cross-entropy loss
    # pata = list(net.parameters())
    optimizer = optim.Adam(net.parameters(), lr=0.0002)  # optimizer (parameters to train, learning rate)

    epochs = 10
    save_path = './AlexNet.pth'
    best_acc = 0.0
    train_steps = len(train_loader)

    for epoch in range(epochs):
        # train
        net.train()          # enable Dropout during training
        running_loss = 0.0   # reset at the start of every epoch
        time_start = time.perf_counter()                 # time one training epoch
        train_bar = tqdm(train_loader, file=sys.stdout)  # progress bar over the training batches
        for step, data in enumerate(train_bar):          # step starts at 0
            images, labels = data                        # images and labels of one batch
            optimizer.zero_grad()                        # clear the old gradients
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

            # Print the training progress
            rate = (step + 1) / len(train_loader)  # progress = current step / steps per epoch
            a = "*" * int(rate * 50)
            b = "." * int((1 - rate) * 50)
            print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate * 100), a, b, loss.item()), end="")
        print()
        print('%f s' % (time.perf_counter() - time_start))

        # validate
        net.eval()  # disable Dropout during validation
        acc = 0.0   # number of correct predictions in this epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]  # index of the largest output is the predicted label
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        # Keep the parameters of the epoch with the best validation accuracy
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')
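The validation step boils down to three operations: take the index of the largest output per image, count how many indices equal the labels, and divide the running count by the size of the validation set. The sketch below performs the same computation on plain Python lists so that it can be followed by hand. The logits and labels are made-up toy values, and `argmax` and `count_correct` are helper names introduced for this example.

```python
def argmax(row):
    """Index of the largest value in a list (the role of torch.max(..., dim=1)[1])."""
    return max(range(len(row)), key=row.__getitem__)


def count_correct(logits, labels):
    """Number of rows whose argmax equals the label (the role of torch.eq(...).sum())."""
    return sum(1 for row, label in zip(logits, labels) if argmax(row) == label)


# Two toy validation batches of (logits, labels); the values are invented.
val_batches = [
    ([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1]], [0, 1]),
    ([[0.1, 0.2, 0.9], [1.2, 0.3, 0.4]], [2, 1]),
]
val_num = sum(len(labels) for _, labels in val_batches)

acc = 0  # running count of correct predictions
for logits, labels in val_batches:
    acc += count_correct(logits, labels)

val_accurate = acc / val_num
print("val_accuracy:", val_accurate)
```

Dividing by the number of images rather than the number of batches keeps the result correct even when the last batch is smaller than the others.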
4.6 Complete code
    import os
    import sys
    import json
    import time

    import torch
    import torch.nn as nn
    import torch.optim as optim
    import matplotlib.pyplot as plt
    import numpy as np
    from torchvision import transforms, datasets, utils
    from tqdm import tqdm

    from model import AlexNet


    def main():
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        print("using {} device.".format(device))

        data_transform = {
            "train": transforms.Compose([
                transforms.RandomResizedCrop(224),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
            "val": transforms.Compose([
                transforms.Resize((224, 224)),  # cannot 224, must (224, 224)
                transforms.ToTensor(),
                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

        data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
        image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
        assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
        train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                             transform=data_transform["train"])
        train_num = len(train_dataset)

        # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflowers':3, 'tulips':4}
        flower_list = train_dataset.class_to_idx
        cla_dict = dict((val, key) for key, val in flower_list.items())
        # write dict into json file
        json_str = json.dumps(cla_dict, indent=4)
        with open('class_indices.json', 'w') as json_file:
            json_file.write(json_str)

        batch_size = 64
        nw = 0  # number of workers
        print('Using {} dataloader workers every process'.format(nw))

        train_loader = torch.utils.data.DataLoader(train_dataset,
                                                   batch_size=batch_size, shuffle=True,
                                                   num_workers=nw)

        validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                                transform=data_transform["val"])
        val_num = len(validate_dataset)
        validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                      batch_size=4, shuffle=False,
                                                      num_workers=nw)

        print("using {} images for training, {} images for validation.".format(train_num, val_num))

        # test_data_iter = iter(validate_loader)
        # test_image, test_label = next(test_data_iter)
        #
        # def imshow(img):
        #     img = img / 2 + 0.5  # unnormalize
        #     npimg = img.numpy()
        #     plt.imshow(np.transpose(npimg, (1, 2, 0)))
        #     plt.show()
        #
        # print(' '.join('%5s' % cla_dict[test_label[j].item()] for j in range(4)))
        # imshow(utils.make_grid(test_image))

        net = AlexNet(num_classes=5, init_weights=True)  # 5 output classes, with weight initialization
        net.to(device)                                   # move the network to the chosen device (GPU/CPU)
        loss_function = nn.CrossEntropyLoss()            # cross-entropy loss
        # pata = list(net.parameters())
        optimizer = optim.Adam(net.parameters(), lr=0.0002)  # optimizer (parameters to train, learning rate)

        epochs = 10
        save_path = './AlexNet.pth'
        best_acc = 0.0
        train_steps = len(train_loader)

        for epoch in range(epochs):
            # train
            net.train()          # enable Dropout during training
            running_loss = 0.0   # reset at the start of every epoch
            time_start = time.perf_counter()                 # time one training epoch
            train_bar = tqdm(train_loader, file=sys.stdout)  # progress bar over the training batches
            for step, data in enumerate(train_bar):          # step starts at 0
                images, labels = data                        # images and labels of one batch
                optimizer.zero_grad()                        # clear the old gradients
                outputs = net(images.to(device))
                loss = loss_function(outputs, labels.to(device))
                loss.backward()
                optimizer.step()
                running_loss += loss.item()

                # Print the training progress
                rate = (step + 1) / len(train_loader)  # progress = current step / steps per epoch
                a = "*" * int(rate * 50)
                b = "." * int((1 - rate) * 50)
                print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate * 100), a, b, loss.item()), end="")
            print()
            print('%f s' % (time.perf_counter() - time_start))

            # validate
            net.eval()  # disable Dropout during validation
            acc = 0.0   # number of correct predictions in this epoch
            with torch.no_grad():
                val_bar = tqdm(validate_loader, file=sys.stdout)
                for val_data in val_bar:
                    val_images, val_labels = val_data
                    outputs = net(val_images.to(device))
                    predict_y = torch.max(outputs, dim=1)[1]  # index of the largest output is the predicted label
                    acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

            val_accurate = acc / val_num
            print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
                  (epoch + 1, running_loss / train_steps, val_accurate))

            # Keep the parameters of the epoch with the best validation accuracy
            if val_accurate > best_acc:
                best_acc = val_accurate
                torch.save(net.state_dict(), save_path)

        print('Finished Training')


    if __name__ == '__main__':
        main()
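4.7 Fixing a common bug
Many people run into this error during training:
**OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading “E:\Anaconda3\lib\site-packages\torch\lib\shm.dll” or one of its dependencies.**
There are usually three ways to fix it:
1. Restart PyCharm
2. Set num_workers to 0
3. Increase the size of the page file and adjust batch_size
I used the second one because I train on Windows, where num_workers is usually set to 0.
When training on Linux, num_workers can be set to
nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])
After that, training runs normally.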
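The two platform cases can be folded into one helper so that the same train.py runs unchanged on Windows and Linux. This is only a sketch of that idea: the function name `pick_num_workers` and its `platform` parameter are assumptions made for this example, and the Linux branch reuses the expression quoted above.

```python
import os
import sys


def pick_num_workers(batch_size, max_workers=8, platform=sys.platform):
    """Choose a DataLoader num_workers value: 0 on Windows, capped elsewhere."""
    if platform.startswith("win"):
        # Avoids the WinError 1455 paging-file problem described above
        return 0
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return min(cpu_count, batch_size if batch_size > 1 else 0, max_workers)


nw = pick_num_workers(batch_size=64)
print("Using {} dataloader workers every process".format(nw))
```

The returned value can be passed directly as `num_workers=nw` to both DataLoader calls.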
5. predict.py implementation
    import os
    import json

    import torch
    from PIL import Image
    from torchvision import transforms
    import matplotlib.pyplot as plt

    from model import AlexNet


    def main():
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

        data_transform = transforms.Compose(
            [transforms.Resize((224, 224)),
             transforms.ToTensor(),
             transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

        # load image
        img_path = "../tulip.jpg"
        assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
        img = Image.open(img_path)
        plt.imshow(img)
        # [C, H, W] after the transform
        img = data_transform(img)
        # expand batch dimension -> [N, C, H, W]
        img = torch.unsqueeze(img, dim=0)

        # read class_indict
        json_path = './class_indices.json'
        assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)
        with open(json_path, "r") as json_file:
            class_indict = json.load(json_file)

        # create model
        model = AlexNet(num_classes=5).to(device)

        # load model weights (map_location lets GPU-trained weights load on a CPU-only machine)
        weights_path = "./AlexNet.pth"
        assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
        model.load_state_dict(torch.load(weights_path, map_location=device))

        # disable Dropout
        model.eval()
        with torch.no_grad():
            # predict class
            output = torch.squeeze(model(img.to(device))).cpu()
            predict = torch.softmax(output, dim=0)
            predict_cla = torch.argmax(predict).item()

        # the json keys are strings, so convert the index with str()
        print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                     predict[predict_cla].item())
        plt.title(print_res)
        for i in range(len(predict)):
            print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                      predict[i].item()))
        plt.show()


    if __name__ == '__main__':
        main()
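The last steps of predict.py turn the raw network outputs into probabilities with softmax and then pick the most likely class. The sketch below does the same on a plain list, using the numerically stable form that subtracts the maximum before exponentiating. The logits are made-up values for illustration, and the class order is the one stored in class_indices.json.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
logits = [0.2, -1.0, 0.5, 0.1, 3.0]  # invented network outputs for one image

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # argmax

for name, p in zip(classes, probs):
    print("class: {:10}   prob: {:.3}".format(name, p))
print("predicted:", classes[best])
```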
Download flower pictures from the internet and test the model on them
Training with the free GPU on Google Colab