(6: Simulated CBIR Problems) Do It Yourself: Write a Neural Network Program, Solve the MNIST Problem, and Deploy It over the Network

I. A Brief Introduction to CBIR

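In traditional image retrieval, images are first annotated manually with text and then retrieved by keyword. This approach ranks results by how well the query matches the textual description ("finding images by text"), and it is both time-consuming and subjective. Content-based image retrieval (CBIR) overcomes these shortcomings: it starts directly from the visual features of the query image and finds similar images in the image library (the search scope). Ranking results by visual similarity is called "finding images by image". CBIR can be divided into three levels:

(1) retrieval based on low-level features extracted from the image itself, such as color, shape, and texture;

(2) retrieval that builds on low-level features by recognizing the object categories in the image and the spatial topological relationships between objects;

(3) retrieval based on inference and learning over abstract image attributes (scene semantics, behavior semantics, emotional semantics, and so on).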

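Research hotspots in CBIR fall into four areas:

The earliest research focused on how to choose suitable global features to describe image content and which similarity measure to use for image matching.

The second hotspot is region-based image retrieval. Its main idea is to use image segmentation to extract the objects in an image, describe each region with local features, and combine the region features into a description of the whole image. Both of these directions are image-centered and lack any analysis of the user's needs.

The third hotspot addresses exactly this problem. Using the idea of relevance feedback, the system adjusts the features and similarity measures used in retrieval according to the user's needs, thereby narrowing the gap between low-level features and high-level semantics.

The fourth hotspot studies how to obtain image semantics from multiple sources, and how to combine low-level image features with image keywords for automatic image annotation in order to improve retrieval accuracy.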

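From the general working principle of image retrieval, feature-based retrieval depends on three key points:

(1) choosing appropriate image features;

(2) using an effective feature extraction method;

(3) using an accurate feature matching algorithm.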

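Retrieval based on individual features has made considerable progress. Many retrieval experiments show that combining several features matches human visual perception better than any single feature and gives better results, but finding suitable weights for combining multiple features is very difficult. The features most commonly used in CBIR today are color, shape, and texture.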
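The weighting problem can be made concrete with a small sketch. It assumes that each image is described by a dictionary of named feature vectors and that the per-feature distance is Euclidean; the function names, the toy feature values, and the weights are all hypothetical and only illustrate how the choice of weights changes the ranking.

```python
import numpy as np

def combined_distance(query_feats, image_feats, weights):
    """Weighted sum of per-feature Euclidean distances (assumed distance measure)."""
    total = 0.0
    for name, weight in weights.items():
        q = np.asarray(query_feats[name], dtype=float)
        x = np.asarray(image_feats[name], dtype=float)
        total += weight * float(np.linalg.norm(q - x))
    return total

def rank_images(query_feats, library, weights):
    """Return the library keys ordered from most to least similar to the query."""
    scored = [(combined_distance(query_feats, feats, weights), key)
              for key, feats in library.items()]
    return [key for _, key in sorted(scored)]

# Toy features (made up for illustration)
query = {"color": [0.2, 0.8], "texture": [0.5]}
library = {
    "a.jpg": {"color": [0.2, 0.7], "texture": [0.9]},
    "b.jpg": {"color": [0.9, 0.1], "texture": [0.5]},
}

color_heavy = rank_images(query, library, {"color": 0.7, "texture": 0.3})
texture_only = rank_images(query, library, {"color": 0.0, "texture": 1.0})
print(color_heavy, texture_only)
```

With these toy values the two weightings produce opposite rankings, which is the difficulty the paragraph describes: the result depends strongly on weights that have no obvious correct value.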

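Color was the first feature used in image retrieval. The main methods are:

(1) color histogram

(2) color coherence vector (CCV)

(3) color correlogram

(4) color moments

Color moments are a simple yet effective color representation. Their mathematical basis is that any color distribution in an image can be characterized by its moments. Since most of the information is concentrated in the low-order moments, the first moment (mean), second moment (variance), and third moment (skewness) of each color channel are enough to approximate the overall color distribution of the image.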
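A minimal sketch of a color-moment descriptor follows. It assumes the common variant that takes the square root of the second central moment and the cube root of the third, so that all three values share the units of the pixel intensities; the text above names variance and skewness directly, so treat this normalization and the function name as assumptions.

```python
import numpy as np

def color_moments(image):
    """Return [means, standard deviations, cube-rooted third moments] per channel.

    image: array of shape (H, W, C). The result has length 3 * C.
    """
    pixels = np.asarray(image, dtype=np.float64)
    pixels = pixels.reshape(-1, pixels.shape[-1])   # one row per pixel
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    std = np.sqrt((centered ** 2).mean(axis=0))     # root of the second central moment
    skew = np.cbrt((centered ** 3).mean(axis=0))    # signed cube root of the third
    return np.concatenate([mean, std, skew])

# A uniform image has zero spread and zero skew
flat = np.full((4, 4, 3), 10.0)
print(color_moments(flat))
```

Two images can then be compared by a distance between their 3 * C vectors, which is cheap enough for the coarse first-pass filtering mentioned later in this section.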

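The color coherence vector (CCV) is a refinement of the image histogram. Its core idea is that when the connected region occupied by pixels of similar color is larger than a given threshold, the pixels in that region are coherent pixels; otherwise they are incoherent pixels. The proportions of coherent and incoherent pixels for each color in the image form its color coherence vector. During retrieval, the CCV of the query image is matched against the CCV of each candidate image; the coherence information retains some of the spatial information of the image's colors. Because of this added spatial information, CCV retrieves better than a plain color histogram, especially for images with large uniform regions or images that are mostly texture, at the cost of more computation.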
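The coherent/incoherent split can be sketched on a grayscale image as below. This is a simplified illustration, not a reference implementation: it assumes 4-connectivity, quantizes intensities into a small number of bins instead of real colors, skips any pre-smoothing, and returns raw pixel counts rather than proportions. The function name and the parameters `n_bins` and `tau` are hypothetical.

```python
from collections import deque
import numpy as np

def color_coherence_vector(gray, n_bins=4, tau=4):
    """Return a list of (coherent, incoherent) pixel counts, one pair per bin."""
    gray = np.asarray(gray)
    # Quantize 0..255 intensities into n_bins "colors"
    bins = (gray.astype(np.int64) * n_bins) // 256
    height, width = bins.shape
    seen = np.zeros((height, width), dtype=bool)
    coherent = [0] * n_bins
    incoherent = [0] * n_bins
    for start_y in range(height):
        for start_x in range(width):
            if seen[start_y, start_x]:
                continue
            color = int(bins[start_y, start_x])
            # Flood fill the 4-connected region of this color and measure its size
            queue = deque([(start_y, start_x)])
            seen[start_y, start_x] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny, nx] and bins[ny, nx] == color:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            # Regions larger than the threshold contribute coherent pixels
            if size > tau:
                coherent[color] += size
            else:
                incoherent[color] += size
    return list(zip(coherent, incoherent))

sample = np.array([
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 0, 200, 10],
    [0, 0, 200, 200],
], dtype=np.uint8)
ccv = color_coherence_vector(sample, n_bins=4, tau=4)
print(ccv)
```

In the sample, the single dark pixel on the right falls into the same bin as the large dark block on the left but is counted as incoherent, which is exactly the spatial information a plain histogram would lose.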

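Among these, the color histogram is the most common and most basic method, but it carries no spatial information. CCV counts, for each color, not only the pixels in the whole image but also the pixels in that color's largest regions, which works well, but CCV does not capture the shape of those regions or their relationship to the background. For this reason, edge information was later taken into account, giving the CCV-TEV (threshold edge vector) algorithm. The color correlogram emphasizes the spatial-distance correlation of the same color within the image; its retrieval quality is better than the methods above, but it is computationally expensive. Color moments mainly compare the mean and variance of each color in the image; they are simple to compute and can serve as a coarse first pass that narrows the search range for a finer second pass.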
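The missing spatial information of a histogram is easy to demonstrate. The sketch below assumes a normalized grayscale histogram and histogram intersection as the similarity measure (one common choice, not something the text prescribes); the function names and the bin count are hypothetical.

```python
import numpy as np

def gray_histogram(image, n_bins=8):
    """Normalized intensity histogram of a grayscale image with values in 0..255."""
    hist, _ = np.histogram(np.asarray(image).ravel(), bins=n_bins, range=(0, 256))
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two normalized histograms are identical."""
    return float(np.minimum(h1, h2).sum())

# An image and the same image rotated by 180 degrees
original = (np.arange(16, dtype=np.uint8) * 16).reshape(4, 4)
rotated = original[::-1, ::-1]

same_pixels = histogram_intersection(gray_histogram(original), gray_histogram(rotated))
black_vs_white = histogram_intersection(
    gray_histogram(np.zeros((4, 4), dtype=np.uint8)),
    gray_histogram(np.full((4, 4), 255, dtype=np.uint8)),
)
print(same_pixels, black_vs_white)
```

The rotated image scores as a perfect match because only the pixel counts matter, not where the pixels are.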

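Shape is an important feature of an object, but because extracting object shapes automatically is difficult, shape-based retrieval is generally limited to objects that are very easy to recognize. Shape can be described with global and local features such as area, eccentricity, circularity, form factor, curvature, and fractal dimension. The main analysis methods include invariant moments, Fourier descriptors, autoregressive models, centripetal chain codes, CSS (Curvature Scale Space), and VSW (Variable Scale Wavelet). The centripetal chain code method both encodes shapes and supports retrieval: it first encodes the shape with a centripetal chain code and then extracts the shape's "relative convex number" and "convexity" directly from the code stream as the basis for shape retrieval. Because the centripetal chain code is invariant to rotation, translation, and scale, this retrieval algorithm has some robustness to similarity deformations. For centripetal chain code retrieval, see: 黄祥林、宋磊、沈兰荪, "一种基于向心链码的形状检索方法" (A shape retrieval method based on centripetal chain codes), 信号采集与处理, 2001.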
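As a small illustration of a global shape feature, the sketch below computes circularity for a closed polygon, assuming the usual definition 4πA / P² (area A from the shoelace formula, perimeter P from the edge lengths). The function names are hypothetical, and a real system would first have to extract the contour from the image.

```python
import math

def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of the edge lengths of the closed polygon."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))

def circularity(points):
    """4 * pi * area / perimeter**2: close to 1 for a circle, smaller for other shapes."""
    return 4.0 * math.pi * polygon_area(points) / polygon_perimeter(points) ** 2

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
near_circle = [(math.cos(2 * math.pi * k / 720), math.sin(2 * math.pi * k / 720))
               for k in range(720)]
print(circularity(square), circularity(near_circle))
```

Because the ratio uses area and squared perimeter, it does not change under translation, rotation, or uniform scaling, which is the kind of invariance the paragraph asks of shape descriptors.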

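Texture is an important but hard-to-describe image feature. Aerial and remote-sensing images, fabric patterns, complex natural scenery, and animals and plants all contain texture. Generally speaking, texture is the property of being locally irregular yet regular at the macroscopic scale. An image dominated by texture is called a texture image, and a region dominated by texture is called a texture region. Texture is usually regarded as a regular arrangement of texture elements, where a texture element is a repeated region of simple shape and uniform intensity. The main visual texture features are coarseness, contrast, directionality, line-likeness, regularity, and roughness. Texture representations used in image retrieval include the Tamura method (which reflects coarseness, contrast, directionality, and so on), MRSAR (multi-resolution simultaneous autoregressive model), the Canny angle histogram, the Gabor method, the pyramid wavelet transform (PWT), and the tree wavelet transform (TWT). For a comparison of these texture features, see: Ma Weiying, Zhang Hongjiang, "Benchmarking of image features for content based retrieval", The Thirty-Second Asilomar Conference on Signals, Systems & Computers, 1998.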

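All of the methods above actively search for hand-designed "features" of the image. With the continuing development of deep learning (DL), there is a new way to approach this problem.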

II. Using VGG16 as the Pretrained Model Structure and Applying It to Handwritten Digit Recognition

import numpy as np
from keras.datasets import mnist
import gc

from keras.models import Sequential, Model
from keras.layers import Input, Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.applications.vgg16 import VGG16
from keras.optimizers import SGD

import cv2
import h5py

# One-hot encode a single integer label into a length-10 vector
def tran_y(y): 
    y_ohe = np.zeros(10) 
    y_ohe[y] = 1 
    return y_ohe


# With better hardware (e.g. 32 GB+ of host RAM and a GPU with 8 GB+ of memory) this value can be increased. VGG requires at least 48 pixels
ishape=48
(X_train, y_train), (X_test, y_test) = mnist.load_data() 

X_train = [cv2.cvtColor(cv2.resize(i, (ishape, ishape)), cv2.COLOR_GRAY2BGR) for i in X_train] 
X_train = np.concatenate([arr[np.newaxis] for arr in X_train]).astype('float32') 
X_train /= 255.0

X_test = [cv2.cvtColor(cv2.resize(i, (ishape, ishape)), cv2.COLOR_GRAY2BGR) for i in X_test] 
X_test = np.concatenate([arr[np.newaxis] for arr in X_test]).astype('float32')
X_test /= 255.0

y_train_ohe = np.array([tran_y(y_train[i]) for i in range(len(y_train))]) 
y_test_ohe = np.array([tran_y(y_test[i]) for i in range(len(y_test))])
y_train_ohe = y_train_ohe.astype('float32')
y_test_ohe = y_test_ohe.astype('float32')


model_vgg = VGG16(include_top = False, weights = 'imagenet', input_shape = (ishape, ishape, 3)) 
# Freeze every pretrained VGG16 layer so that only the new classifier head is trained
#for i, layer in enumerate(model_vgg.layers): 
#    if i<20:
for layer in model_vgg.layers:
    layer.trainable = False
model = Flatten()(model_vgg.output) 
model = Dense(4096, activation='relu', name='fc1')(model)
model = Dense(4096, activation='relu', name='fc2')(model)
model = Dropout(0.5)(model)
model = Dense(10, activation = 'softmax', name='prediction')(model) 
model_vgg_mnist_pretrain = Model(model_vgg.input, model, name = 'vgg16_pretrain')
model_vgg_mnist_pretrain.summary()
sgd = SGD(lr = 0.05, decay = 1e-5) 
model_vgg_mnist_pretrain.compile(loss = 'categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])
model_vgg_mnist_pretrain.fit(X_train, y_train_ohe, validation_data = (X_test, y_test_ohe), epochs = 10, batch_size = 64)
#del(model_vgg_mnist_pretrain, model_vgg, model)
for i in range(100):
    gc.collect()
    

The result is:

Test loss: 0.11260580219365657

Test accuracy: 0.9626

This result is fairly representative: it applies transfer learning by reusing the VGG network.
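The label preparation in the script above builds one-hot vectors one label at a time with `tran_y`. A vectorized alternative is sketched below under the assumption that labels are integers in the range 0 to num_classes - 1; `one_hot` is a hypothetical helper name, and the loop-based version is reproduced only to check that both give the same array.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Vectorized one-hot encoding: row i has a 1 in column labels[i]."""
    labels = np.asarray(labels, dtype=np.int64)
    return np.eye(num_classes, dtype=np.float32)[labels]

def tran_y_loop(labels, num_classes):
    """Loop-based encoding equivalent to calling tran_y once per label."""
    rows = []
    for y in labels:
        row = np.zeros(num_classes)
        row[y] = 1
        rows.append(row)
    return np.array(rows).astype('float32')

labels = [3, 0, 9, 3]
encoded = one_hot(labels, 10)
print(encoded.shape, encoded.dtype)
```

Indexing an identity matrix with the label array selects the matching rows in one step, so no Python loop over the 60,000 MNIST labels is needed.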

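III. Solving a Simulated CBIR Problem (Cats vs. Dogs)

Because there are still quite a few restrictions on the system side, I start by solving a simulated CBIR problem to find out what works. A key point here is that the training data is supplied by us (uploaded to GitHub, then run locally when deploying on the real machine).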

import numpy as np
from keras.datasets import mnist
import gc

from keras.models import Sequential, Model
from keras.layers import Input, Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.applications.vgg16 import VGG16
from keras.optimizers import SGD
from keras.utils.data_utils import get_file
import cv2
import h5py

import os
import math
from matplotlib import pyplot as plt

# Global variables
RATIO = 0.2
train_dir = 'e:/template/dogvscat1K/'

# Width of the one-hot vector, set by the number of classes
NUM_DENSE = 2
# Number of training epochs
epochs = 10

def tran_y(y): 
    y_ohe = np.zeros(NUM_DENSE) 
    y_ohe[y] = 1 
    return y_ohe

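# Build the training and test sets (image arrays and labels) according to ratio
# Prebuilt dataset: https://github.com/jsxyhelu/GOCW/raw/master/dogvscat.npz
def get_files(file_dir, ratio):
    '''
    Args:
        file_dir: file directory (must end with a path separator)
        ratio: fraction of the samples held out for validation
    Returns:
        training images, training labels, validation images, validation labels (NumPy arrays)
    '''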
    cats = []
    label_cats = []
    dogs = []
    label_dogs = []
    for file in os.listdir(file_dir):
        name = file.split(sep='.')
        if name[0]=='cat':
            cats.append(file_dir + file)
            label_cats.append(0)
        else:
            dogs.append(file_dir + file)
            label_dogs.append(1)
    print('数据集中有 %d cats\n以及 %d dogs' %(len(cats), len(dogs)))  # prints how many cats and dogs were found
    # Image list and label list
    # hstack stacks arrays horizontally (column-wise)
    image_list = np.hstack((cats, dogs))
    label_list = np.hstack((label_cats, label_dogs))
    
    temp = np.array([image_list, label_list])
    temp = temp.transpose()
    np.random.shuffle(temp)   
    
    all_image_list = temp[:, 0]
    all_label_list = temp[:, 1]
    
    n_sample = len(all_label_list)
    # Use ratio to decide how many samples go to training and to validation
    n_val = math.ceil(n_sample*ratio) # number of validation samples
    n_train = n_sample - n_val # number of training samples
    tra_images = []
    val_images = []
    # Indices 0..n_train-1 become tra_images; the rest become val_images
    
    for index in range(n_train):
        image = cv2.imread(all_image_list[index])
        # Convert to grayscale, then resize (cv2.imread returns BGR, hence COLOR_BGR2GRAY)
        image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image,(48,48))  # whether to resize here or later still needs an experiment
        tra_images.append(image)

    tra_labels = all_label_list[:n_train]
    tra_labels = [int(float(i)) for i in tra_labels]

    for index in range(n_val):
        image = cv2.imread(all_image_list[n_train+index])
        # Convert to grayscale, then resize (cv2.imread returns BGR, hence COLOR_BGR2GRAY)
        image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
        # NOTE: validation images are resized to 32x32 while training images use 48x48;
        # both are resized to ishape again later, so validation data is upsampled from less detail
        image = cv2.resize(image,(32,32))
        val_images.append(image)

    val_labels = all_label_list[n_train:]
    val_labels = [int(float(i)) for i in val_labels]
    return np.array(tra_images),np.array(tra_labels),np.array(val_images),np.array(val_labels)

# Colab + VGG requires at least 48 pixels; on the current dataset this already gives decent results
ishape=48
#(X_train, y_train), (X_test, y_test) = mnist.load_data() 
# Build the dataset
#X_train, y_train, X_test, y_test = get_files(train_dir, RATIO)
# Save the data
#np.savez("D:\\dl4cv\\datesets\\dogvscat_NPY\\dogvscat.npz",X_train=X_train,y_train=y_train,X_test=X_test,y_test=y_test)
# Load the data
path='dogvscat.npz'
path = get_file(path,origin='https://github.com/jsxyhelu/GOCW/raw/master/dogvscat.npz')
f = np.load(path)
X_train, y_train = f['X_train'], f['y_train']
X_test, y_test = f['X_test'], f['y_test']


X_train = [cv2.cvtColor(cv2.resize(i, (ishape, ishape)), cv2.COLOR_GRAY2BGR) for i in X_train] 
X_train = np.concatenate([arr[np.newaxis] for arr in X_train]).astype('float32') 
X_train /= 255.0

X_test = [cv2.cvtColor(cv2.resize(i, (ishape, ishape)), cv2.COLOR_GRAY2BGR) for i in X_test] 
X_test = np.concatenate([arr[np.newaxis] for arr in X_test]).astype('float32')
X_test /= 255.0

y_train_ohe = np.array([tran_y(y_train[i]) for i in range(len(y_train))]) 
y_test_ohe = np.array([tran_y(y_test[i]) for i in range(len(y_test))])
y_train_ohe = y_train_ohe.astype('float32')
y_test_ohe = y_test_ohe.astype('float32')


model_vgg = VGG16(include_top = False, weights = 'imagenet', input_shape = (ishape, ishape, 3)) 
# Freeze every pretrained VGG16 layer so that only the new classifier head is trained
#for i, layer in enumerate(model_vgg.layers): 
#    if i<20:
for layer in model_vgg.layers:
    layer.trainable = False
model = Flatten()(model_vgg.output) 
model = Dense(4096, activation='relu', name='fc1')(model)
model = Dense(4096, activation='relu', name='fc2')(model)
model = Dropout(0.5)(model)
model = Dense(NUM_DENSE, activation = 'softmax', name='prediction')(model) 
model_vgg_pretrain = Model(model_vgg.input, model, name = 'vgg16_pretrain')
#model_vgg_pretrain.summary()
print("vgg准备完毕\n")  # prints "VGG ready"
sgd = SGD(lr = 0.05, decay = 1e-5) 
model_vgg_pretrain.compile(loss = 'categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])
print("vgg开始训练\n")  # prints "VGG training started"
log = model_vgg_pretrain.fit(X_train, y_train_ohe, validation_data = (X_test, y_test_ohe), epochs = epochs, batch_size = 64)

score = model_vgg_pretrain.evaluate(X_test, y_test_ohe, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

plt.figure('acc')  
plt.subplot(2, 1, 1)  
plt.plot(log.history['acc'],'r--',label='Training Accuracy')  
plt.plot(log.history['val_acc'],'r-',label='Validation Accuracy')  
plt.legend(loc='best')  
plt.xlabel('Epochs')  
plt.axis([0, epochs, 0.5, 1])  
plt.figure('loss')  
plt.subplot(2, 1, 2)  
plt.plot(log.history['loss'],'b--',label='Training Loss')  
plt.plot(log.history['val_loss'],'b-',label='Validation Loss')  
plt.legend(loc='best')  
plt.xlabel('Epochs')  
plt.axis([0, epochs, 0, 1])  
  
plt.show() 
os.system("pause")  # Windows-only; on Linux/Colab the command is not found and os.system returns 32512
    

Using TensorFlow backend.

Downloading data from https://github.com/jsxyhelu/GOCW/raw/master/dogvscat.npz

4112384/4104922 [==============================] - 0s 0us/step

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

58892288/58889256 [==============================] - 2s 0us/step

vgg准备完毕

vgg开始训练

WARNING:tensorflow:Variable *= will be deprecated. Use variable.assign_mul if you want assignment to the variable value or 'x = x * y' if you want a new python Tensor object.

Train on 1600 samples, validate on 400 samples

Epoch 1/10

1600/1600 [==============================] - 3s 2ms/step - loss: 0.8348 - acc: 0.5331 - val_loss: 0.6337 - val_acc: 0.6600

Epoch 2/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.6191 - acc: 0.6794 - val_loss: 0.6068 - val_acc: 0.6975

Epoch 3/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.6611 - acc: 0.6200 - val_loss: 0.6218 - val_acc: 0.6400

Epoch 4/10

1472/1600 [==========================>...] - ETA: 0s - loss: 0.6121 - acc: 0.6488

1600/1600 [==============================] - 2s 1ms/step - loss: 0.6118 - acc: 0.6487 - val_loss: 0.6949 - val_acc: 0.5450

Epoch 5/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.5892 - acc: 0.6881 - val_loss: 0.6611 - val_acc: 0.6175

Epoch 6/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.5973 - acc: 0.6656 - val_loss: 0.6095 - val_acc: 0.6600

Epoch 7/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.5959 - acc: 0.6669 - val_loss: 0.6365 - val_acc: 0.6075

Epoch 8/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.5841 - acc: 0.6769 - val_loss: 0.5960 - val_acc: 0.6475

Epoch 9/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.5498 - acc: 0.7188 - val_loss: 0.6480 - val_acc: 0.6075

Epoch 10/10

1600/1600 [==============================] - 2s 1ms/step - loss: 0.5645 - acc: 0.6994 - val_loss: 0.6293 - val_acc: 0.6275

Test loss: 0.629267041683197

Test accuracy: 0.6275

32512

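This is a fairly serious problem: the accuracy is far too low. The data upload itself should be fine, so the cause is probably in the data preparation.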
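Since data preparation is the suspect, a quick sanity check of the loaded arrays is a reasonable first step. The sketch below is a hypothetical helper, not part of the original script: it reports shape, dtype, value range, and class balance for one split, and it is demonstrated on synthetic data standing in for `X_train` and `y_train`.

```python
import numpy as np

def summarize_split(images, labels, num_classes):
    """Return basic statistics for one data split (hypothetical helper)."""
    images = np.asarray(images)
    labels = np.asarray(labels).astype(np.int64)
    return {
        "shape": images.shape,
        "dtype": str(images.dtype),
        "min": float(images.min()),
        "max": float(images.max()),
        "class_counts": np.bincount(labels, minlength=num_classes).tolist(),
    }

# Synthetic stand-in for the arrays loaded from the .npz file
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(6, 48, 48), dtype=np.uint8)
fake_labels = np.array([0, 1, 1, 0, 1, 1])
summary = summarize_split(fake_images, fake_labels, num_classes=2)
print(summary)
```

Running the same function on the training and test splits side by side would show at a glance whether the two were prepared at different resolutions or with very different class proportions.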

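IV. Solving a Simulated CBIR Problem (5-Class Image Classification)

This is very similar to the cats-vs-dogs case, but with five classes of images.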

import numpy as np
from keras.datasets import mnist
import gc

from keras.models import Sequential, Model
from keras.layers import Input, Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.applications.vgg16 import VGG16
from keras.optimizers import SGD
from keras.utils.data_utils import get_file
import cv2
import h5py

import os
import math
from matplotlib import pyplot as plt

# Global variables
RATIO = 0.2
train_dir = 'D:/dl4cv/datesets/littleCBIR/'

# Width of the one-hot vector, set by the number of classes
NUM_DENSE = 5
# Number of training epochs
epochs = 10

def tran_y(y): 
    y_ohe = np.zeros(NUM_DENSE) 
    y_ohe[y] = 1 
    return y_ohe

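# Build the training and test sets (image arrays and labels) according to ratio
## Build the dataset. In this example the file-name prefix encodes the class: 3** car, 4** dinosaur, 5** elephant, 6** flower, 7** horse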
def get_files(file_dir, ratio):
    '''
    Args:
        file_dir: file directory
    Returns:
        list of images and labels
    '''
    image_list = []
    label_list = []
    # The first character of the file name selects the class; anything else falls into class 4
    prefix_to_label = {'3': 0, '4': 1, '5': 2, '6': 3}
    for file in os.listdir(file_dir):
        image_list.append(file_dir + file)
        label_list.append(prefix_to_label.get(file[0:1], 4))
    print('数据集导入完毕')  # prints "dataset loaded"
    # Image list and label list
    # hstack stacks arrays horizontally (column-wise)
    image_list = np.hstack(image_list)
    label_list = np.hstack(label_list)
    
    temp = np.array([image_list, label_list])
    temp = temp.transpose()
    np.random.shuffle(temp)   
    
    all_image_list = temp[:, 0]
    all_label_list = temp[:, 1]
    
    n_sample = len(all_label_list)
    # Use ratio to decide how many samples go to training and to validation
    n_val = math.ceil(n_sample*ratio) # number of validation samples
    n_train = n_sample - n_val # number of training samples
    tra_images = []
    val_images = []
    # Indices 0..n_train-1 become tra_images; the rest become val_images
    
    for index in range(n_train):
        image = cv2.imread(all_image_list[index])
        # Convert to grayscale, then resize (cv2.imread returns BGR, hence COLOR_BGR2GRAY)
        image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
        image = cv2.resize(image,(48,48))  # whether to resize here or later still needs an experiment
        tra_images.append(image)

    tra_labels = all_label_list[:n_train]
    tra_labels = [int(float(i)) for i in tra_labels]

    for index in range(n_val):
        image = cv2.imread(all_image_list[n_train+index])
        # Convert to grayscale, then resize (cv2.imread returns BGR, hence COLOR_BGR2GRAY)
        image = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
        # NOTE: validation images are resized to 32x32 while training images use 48x48;
        # both are resized to ishape again later, so validation data is upsampled from less detail
        image = cv2.resize(image,(32,32))
        val_images.append(image)

    val_labels = all_label_list[n_train:]
    val_labels = [int(float(i)) for i in val_labels]
    return np.array(tra_images),np.array(tra_labels),np.array(val_images),np.array(val_labels)

# Colab + VGG requires at least 48 pixels; on the current dataset this already gives decent results
ishape=48
#(X_train, y_train), (X_test, y_test) = mnist.load_data() 
# Build the dataset
#X_train, y_train, X_test, y_test = get_files(train_dir, RATIO)
# Save the data
##np.savez("D:\\dl4cv\\datesets\\littleCBIR.npz",X_train=X_train,y_train=y_train,X_test=X_test,y_test=y_test)
# Load the data
path='littleCBIR.npz'
#https://github.com/jsxyhelu/GOCW/raw/master/littleCBIR.npz
path = get_file(path,origin='https://github.com/jsxyhelu/GOCW/raw/master/littleCBIR.npz')
f = np.load(path)
X_train, y_train = f['X_train'], f['y_train']
X_test, y_test = f['X_test'], f['y_test']


X_train = [cv2.cvtColor(cv2.resize(i, (ishape, ishape)), cv2.COLOR_GRAY2BGR) for i in X_train] 
X_train = np.concatenate([arr[np.newaxis] for arr in X_train]).astype('float32') 
X_train /= 255.0

X_test = [cv2.cvtColor(cv2.resize(i, (ishape, ishape)), cv2.COLOR_GRAY2BGR) for i in X_test] 
X_test = np.concatenate([arr[np.newaxis] for arr in X_test]).astype('float32')
X_test /= 255.0

y_train_ohe = np.array([tran_y(y_train[i]) for i in range(len(y_train))]) 
y_test_ohe = np.array([tran_y(y_test[i]) for i in range(len(y_test))])
y_train_ohe = y_train_ohe.astype('float32')
y_test_ohe = y_test_ohe.astype('float32')


model_vgg = VGG16(include_top = False, weights = 'imagenet', input_shape = (ishape, ishape, 3)) 
# Freeze every pretrained VGG16 layer so that only the new classifier head is trained
#for i, layer in enumerate(model_vgg.layers): 
#    if i<20:
for layer in model_vgg.layers:
    layer.trainable = False
model = Flatten()(model_vgg.output) 
model = Dense(4096, activation='relu', name='fc1')(model)
model = Dense(4096, activation='relu', name='fc2')(model)
model = Dropout(0.5)(model)
model = Dense(NUM_DENSE, activation = 'softmax', name='prediction')(model) 
model_vgg_pretrain = Model(model_vgg.input, model, name = 'vgg16_pretrain')
#model_vgg_pretrain.summary()
print("vgg准备完毕\n")  # prints "VGG ready"
sgd = SGD(lr = 0.05, decay = 1e-5) 
model_vgg_pretrain.compile(loss = 'categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])
print("vgg开始训练\n")  # prints "VGG training started"
log = model_vgg_pretrain.fit(X_train, y_train_ohe, validation_data = (X_test, y_test_ohe), epochs = epochs, batch_size = 64)

score = model_vgg_pretrain.evaluate(X_test, y_test_ohe, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

plt.figure('acc')  
plt.subplot(2, 1, 1)  
plt.plot(log.history['acc'],'r--',label='Training Accuracy')  
plt.plot(log.history['val_acc'],'r-',label='Validation Accuracy')  
plt.legend(loc='best')  
plt.xlabel('Epochs')  
plt.axis([0, epochs, 0.5, 1])  
plt.figure('loss')  
plt.subplot(2, 1, 2)  
plt.plot(log.history['loss'],'b--',label='Training Loss')  
plt.plot(log.history['val_loss'],'b-',label='Validation Loss')  
plt.legend(loc='best')  
plt.xlabel('Epochs')  
plt.axis([0, epochs, 0, 1])  
  
plt.show() 
os.system("pause")  # Windows-only; on Linux/Colab the command is not found and os.system returns 32512
     

vgg准备完毕

vgg开始训练

Train on 400 samples, validate on 100 samples

Epoch 1/10

400/400 [==============================] - 1s 2ms/step - loss: 1.5373 - acc: 0.3700 - val_loss: 1.4409 - val_acc: 0.2700

Epoch 2/10

400/400 [==============================] - 0s 1ms/step - loss: 0.9020 - acc: 0.7150 - val_loss: 1.1492 - val_acc: 0.4600

Epoch 3/10

400/400 [==============================] - 0s 1ms/step - loss: 0.6484 - acc: 0.7975 - val_loss: 0.8033 - val_acc: 0.7600

Epoch 4/10

400/400 [==============================] - 0s 1ms/step - loss: 0.4853 - acc: 0.8675 - val_loss: 0.9245 - val_acc: 0.6200

Epoch 5/10

400/400 [==============================] - 0s 1ms/step - loss: 0.5074 - acc: 0.8350 - val_loss: 0.7115 - val_acc: 0.7100

Epoch 6/10

400/400 [==============================] - 1s 1ms/step - loss: 0.4721 - acc: 0.8125 - val_loss: 0.7583 - val_acc: 0.7200

Epoch 7/10

400/400 [==============================] - 0s 1ms/step - loss: 0.3700 - acc: 0.8875 - val_loss: 0.6153 - val_acc: 0.7900

Epoch 8/10

400/400 [==============================] - 0s 1ms/step - loss: 0.3849 - acc: 0.8625 - val_loss: 0.5941 - val_acc: 0.8100

Epoch 9/10

400/400 [==============================] - 0s 1ms/step - loss: 0.3253 - acc: 0.8900 - val_loss: 1.2803 - val_acc: 0.5700

Epoch 10/10

400/400 [==============================] - 1s 1ms/step - loss: 0.4965 - acc: 0.8100 - val_loss: 0.5930 - val_acc: 0.8300

Test loss: 0.5930496269464492

Test accuracy: 0.83


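At this point I am getting into actual parameter tuning. The training run oscillates heavily. In particular, what is going on around epoch 8 (presumably the spike logged as Epoch 9/10 above, where val_loss jumps to 1.2803 and val_acc drops to 0.5700)? Note that what I am solving here is a transfer learning problem, which makes parameter tuning all the more important. When I train and test more, the result is oscillation and a drop in the scores: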

Test loss: 1.254318968951702

Test accuracy: 0.68


After the fix:


Processing the original images directly gives the best result:

Test loss: 0.2156761786714196

Test accuracy: 0.93

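V. Parameter Tuning

This is clearly unacceptable and needs to be fixed:

1. Check whether the current data preparation is at fault, although the training side looks fine;

2. Otherwise it is overfitting, so how can the overfitting be removed?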
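One common answer to the overfitting question is to stop training when the validation loss stops improving and keep the best epoch. The sketch below replays that rule on the val_loss values copied from the 5-class log above; the function name is hypothetical, and the stopping rule (stop once `patience` consecutive epochs fail to improve) is an assumption in the spirit of typical early-stopping callbacks rather than any specific library's implementation.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based, for a simple patience rule."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Training ran to the end without triggering the rule
    return len(val_losses), best_epoch

# val_loss per epoch from the 5-class training log above
val_loss_log = [1.4409, 1.1492, 0.8033, 0.9245, 0.7115,
                0.7583, 0.6153, 0.5941, 1.2803, 0.5930]

strict = simulate_early_stopping(val_loss_log, patience=1)
relaxed = simulate_early_stopping(val_loss_log, patience=2)
print(strict, relaxed)
```

On this log a patience of 1 would have stopped far too early, while a patience of 2 lets the run finish, which suggests that with curves this noisy the patience value itself becomes a parameter that needs tuning.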

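Finally, when I trained on the original images, I got the best result so far, around 0.93. The curve still oscillates quite strongly, which I attribute to the small size of the dataset.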
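The optimizer settings might also contribute to the oscillation; this is only a hypothesis, not something the experiments above establish. The sketch below assumes that `SGD(lr = 0.05, decay = 1e-5)` follows the legacy Keras time-based schedule, lr_t = lr_0 / (1 + decay * t) with t counted in batch updates, and uses the sample count, batch size, and epoch count of the cats-vs-dogs run. The function name is hypothetical.

```python
import math

def legacy_time_based_lr(initial_lr, decay, iteration):
    """Learning rate after `iteration` batch updates (assumed legacy Keras rule)."""
    return initial_lr / (1.0 + decay * iteration)

# Values from the cats-vs-dogs run: 1600 training samples, batch size 64, 10 epochs
steps_per_epoch = math.ceil(1600 / 64)
total_steps = steps_per_epoch * 10
final_lr = legacy_time_based_lr(0.05, 1e-5, total_steps)
print(steps_per_epoch, total_steps, final_lr)
```

Under this assumption the learning rate is still essentially 0.05 at the end of training, so trying a smaller initial rate or a larger decay would be an inexpensive experiment alongside enlarging the dataset.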


Current focus: image stitching and blending, image recognition. Contact: jsxyhelu@foxmail.com
