A Neural Network Model for the Iris Data Set in Pure Python

2018-07-19

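Summary: This article works through a complete Iris classification task in Python code without calling any machine learning framework (the listings rely only on NumPy, pandas, and matplotlib). It is suitable for beginners who want to read along, try it hands-on, and get a feel for the general workflow of a machine learning method.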

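Have you tried the plant identification apps released by the big companies, such as 微软识花 (Microsoft's flower recognition app) or 花伴侣? When you see a flower whose name you do not know, you only need to open the app, take a photo of the plant, and upload it; the app identifies the species and shows a detailed description. It feels like carrying a knowledgeable biologist in your phone. The underlying principle is simple: it is an image classification process that matches the uploaded image against a data set stored on the phone or online and assigns it to the corresponding class. With the adoption of deep learning, image classification accuracy keeps improving and, on some data sets, has surpassed human performance.
Compared with traditional neural network methods, deep learning generally places higher demands on data set size and hardware. If you only want to understand the basic workflow of a classification task, a small data set and a traditional neural network are a better starting point. This article uses the Iris Data Set to implement a classification task. Iris is one of the oldest data sets in machine learning, far older than the now-common handwritten digit data set (MNIST), and it originates from the well-known British statistician and biologist Ronald Fisher. Without using a machine learning library, this article builds a neural network model for the Iris data from scratch, trains it, and obtains good results.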

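The Iris data set is one of the most commonly used data sets for testing machine learning algorithms. It contains four features (sepal length, sepal width, petal length, and petal width) for three Iris species (versicolor, virginica, and setosa). Each species has 50 instances (rows). Step 1 below plots how the samples are distributed.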

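We will build a classification model on this data set with a neural network. For simplicity, we use only petal length and petal width as features and keep only two species, versicolor and virginica. The steps below train a neural network on this sample data in Python: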

Step 1: Prepare the Iris data set

Import the Iris data set into Python and subset it to keep only the relevant rows:

# Import libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Set working directory and load data
os.chdir('C:\\Users\\rohan\\Documents\\Analytics\\Data')
iris = pd.read_csv('iris.csv')
# Create numeric classes for species (0, 1, 2)
iris.loc[iris['Name']=='virginica','species']=0
iris.loc[iris['Name']=='versicolor','species']=1
iris.loc[iris['Name']=='setosa','species'] = 2
# Keep only virginica (0) and versicolor (1)
iris = iris[iris['species']!=2]
# Create input and output arrays; each column is one example
X = iris[['PetalLength', 'PetalWidth']].values.T
Y = iris[['species']].values.T
Y = Y.astype('uint8')
# Make a scatter plot
plt.scatter(X[0, :], X[1, :], c=Y[0,:], s=40, cmap=plt.cm.Spectral)
plt.title("IRIS DATA | Blue - Versicolor, Red - Virginica ")
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.show()

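Blue points represent versicolor and red points represent virginica. The neural network built in this article is trained on these data so that it can classify the two species correctly.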
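
The later steps all assume that X has one row per feature and one column per example, and that Y is a single row of 0/1 labels. The sketch below illustrates that layout on a tiny hand-made frame. The names `toy`, `X_toy`, and `Y_toy` and the four rows are illustrative assumptions, not values read from iris.csv.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from iris.csv.
toy = pd.DataFrame({
    "PetalLength": [4.7, 4.5, 6.0, 5.1],
    "PetalWidth": [1.4, 1.5, 2.5, 1.9],
    "Name": ["versicolor", "versicolor", "virginica", "virginica"],
})

# Map species names to the same numeric classes used in the listing above.
toy["species"] = toy["Name"].map({"virginica": 0, "versicolor": 1})

# Transpose so that rows are features and columns are examples.
X_toy = toy[["PetalLength", "PetalWidth"]].values.T
Y_toy = toy[["species"]].values.T.astype("uint8")

print(X_toy.shape)  # (2, 4): 2 features, 4 examples
print(Y_toy.shape)  # (1, 4): 1 label row, 4 examples
```

With the real two-species subset, the same layout would give shapes (2, 100) and (1, 100), since each species has 50 rows.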

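Step 2: Initialize the parameters (weights and biases)

Next we build a neural network with a single hidden layer. The size of the hidden layer is set to 6: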

def initialize_parameters(n_x, n_h, n_y):
    
    np.random.seed(2)  # fix the seed so the results are reproducible even though the initialization is random
    
    W1 = np.random.randn(n_h, n_x) * 0.01 #weight matrix of shape (n_h, n_x)
    b1 = np.zeros(shape=(n_h, 1))  #bias vector of shape (n_h, 1)
    W2 = np.random.randn(n_y, n_h) * 0.01   #weight matrix of shape (n_y, n_h)
    b2 = np.zeros(shape=(n_y, 1))  #bias vector of shape (n_y, 1)
       
    #store parameters into a dictionary    
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    
    return parameters
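
As a quick sanity check, the shapes produced by this initialization can be printed for the sizes used in this article (2 input features, 6 hidden units, 1 output unit). The block below repeats the same scheme in compact form so that it runs on its own; `params` is an illustrative name.

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # Same scheme as the listing above: small random weights, zero biases.
    np.random.seed(2)
    return {
        "W1": np.random.randn(n_h, n_x) * 0.01,
        "b1": np.zeros((n_h, 1)),
        "W2": np.random.randn(n_y, n_h) * 0.01,
        "b2": np.zeros((n_y, 1)),
    }

params = initialize_parameters(n_x=2, n_h=6, n_y=1)
for name, value in params.items():
    print(name, value.shape)

# Total number of trainable values: 6*2 + 6 + 1*6 + 1
print(sum(value.size for value in params.values()))
```

The weights are drawn at random rather than set to zero because identical weights would make every hidden unit compute the same value and receive the same gradient, so the units could never learn different features.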

Step 3: Forward propagation

During forward propagation, tanh is used as the activation function of the first layer and sigmoid as the activation function of the second layer:

def forward_propagation(X, parameters):
    # Retrieve the initialized parameters from the dictionary
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    
    
    # Implement Forward Propagation to calculate A2 (probability)
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)  #tanh activation function
    Z2 = np.dot(W2, A1) + b2
    A2 = 1/(1+np.exp(-Z2))  #sigmoid activation function
    
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}
    
    return A2, cache
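
In equation form, the listing above computes the following for an input matrix X whose columns are examples. The bracketed layer superscripts are a notational assumption made here for readability; the code itself only uses the names W1, b1, Z1, A1, W2, b2, Z2, and A2.

```latex
\begin{aligned}
Z^{[1]} &= W^{[1]} X + b^{[1]}, & A^{[1]} &= \tanh\left(Z^{[1]}\right), \\
Z^{[2]} &= W^{[2]} A^{[1]} + b^{[2]}, & A^{[2]} &= \sigma\left(Z^{[2]}\right) = \frac{1}{1 + e^{-Z^{[2]}}}.
\end{aligned}
```

Each entry of A2 lies strictly between 0 and 1 and is read as the predicted probability that the corresponding example belongs to class 1 (versicolor in the encoding of Step 1).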

Step 4: Compute the cost function

The goal is to minimize the cost function. This article uses cross-entropy as the cost function:

def compute_cost(A2, Y, parameters):
   
    m = Y.shape[1] # number of training examples
    
    # Retrieve W1 and W2 from parameters
    W1 = parameters['W1']
    W2 = parameters['W2']
    
    # Compute the cross-entropy cost
    logprobs = np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2))
    cost = - np.sum(logprobs) / m
    
    return cost
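
One practical caveat: if the sigmoid output saturates to exactly 0 or 1 in floating point, the expression above evaluates log(0). A common workaround, sketched here as an assumption rather than as part of the original article, is to clip the predictions slightly away from 0 and 1 first. The names `compute_cost_stable` and `eps` are hypothetical.

```python
import numpy as np

def compute_cost_stable(A2, Y, eps=1e-12):
    """Cross-entropy cost with predictions clipped away from 0 and 1.

    compute_cost_stable and eps are illustrative names, not from the article.
    """
    m = Y.shape[1]                        # number of examples
    A2 = np.clip(A2, eps, 1 - eps)        # avoid log(0)
    logprobs = Y * np.log(A2) + (1 - Y) * np.log(1 - A2)
    return float(-np.sum(logprobs) / m)

Y_demo = np.array([[1, 0, 1, 0]])

# Uninformative predictions of 0.5 give a cost of ln(2).
cost_half = compute_cost_stable(np.full((1, 4), 0.5), Y_demo)

# Saturated but correct predictions stay finite thanks to the clipping.
cost_saturated = compute_cost_stable(np.array([[1.0, 0.0, 1.0, 0.0]]), Y_demo)

print(cost_half, cost_saturated)
```

A cost near ln(2), roughly 0.693, is also what to expect at the very start of training, because the small initial weights push every prediction close to 0.5.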

Step 5: Backpropagation

Backpropagation mainly computes the derivatives of the cost function with respect to the parameters:

def backward_propagation(parameters, cache, X, Y):
    # Number of training examples
    m = X.shape[1]
    
    # First, retrieve W1 and W2 from the dictionary "parameters".
    W1 = parameters['W1']
    W2 = parameters['W2']
        
    # Retrieve A1 and A2 from dictionary "cache".
    A1 = cache['A1']
    A2 = cache['A2']
    
    # Backward propagation: calculate dW1, db1, dW2, db2.
    dZ2 = A2 - Y
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    # 1 - A1**2 is the derivative of tanh evaluated at Z1
    dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.power(A1, 2))
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)
    
    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}
    
    return grads
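
A common way to gain confidence in hand-written gradient formulas is to compare them with finite differences of the cost. The sketch below is self-contained and restates the forward pass, the cost, and the same gradient formulas in compact form. The helper names, the synthetic data, the hidden size of 3, and the step `h` are all assumptions chosen for illustration; none of them come from the article.

```python
import numpy as np

def forward(X, p):
    A1 = np.tanh(p["W1"] @ X + p["b1"])
    A2 = 1 / (1 + np.exp(-(p["W2"] @ A1 + p["b2"])))
    return A1, A2

def cost(X, Y, p):
    _, A2 = forward(X, p)
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / Y.shape[1]

def analytic_grads(X, Y, p):
    # Same formulas as backward_propagation above.
    m = X.shape[1]
    A1, A2 = forward(X, p)
    dZ2 = A2 - Y
    dZ1 = (p["W2"].T @ dZ2) * (1 - A1 ** 2)
    return {"W1": dZ1 @ X.T / m, "b1": dZ1.sum(axis=1, keepdims=True) / m,
            "W2": dZ2 @ A1.T / m, "b2": dZ2.sum(axis=1, keepdims=True) / m}

def numeric_grads(X, Y, p, h=1e-6):
    # Central finite differences, perturbing one parameter entry at a time.
    grads = {}
    for key, value in p.items():
        g = np.zeros_like(value)
        for idx in np.ndindex(value.shape):
            old = value[idx]
            value[idx] = old + h
            c_plus = cost(X, Y, p)
            value[idx] = old - h
            c_minus = cost(X, Y, p)
            value[idx] = old              # restore the original entry
            g[idx] = (c_plus - c_minus) / (2 * h)
        grads[key] = g
    return grads

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(2, 5))          # synthetic inputs, not Iris data
Y_demo = np.array([[0, 1, 1, 0, 1]])
p_demo = {"W1": rng.normal(size=(3, 2)), "b1": np.zeros((3, 1)),
          "W2": rng.normal(size=(1, 3)), "b2": np.zeros((1, 1))}

ana = analytic_grads(X_demo, Y_demo, p_demo)
num = numeric_grads(X_demo, Y_demo, p_demo)
max_diff = max(float(np.max(np.abs(ana[k] - num[k]))) for k in ana)
print(max_diff)  # should be very small if the formulas agree
```

If the two sets of gradients disagreed by more than a small tolerance, that would point to a mistake in one of the derivative formulas.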

Step 6: Update the parameters

Use the gradients computed during backpropagation to update the weights and biases:

def update_parameters(parameters, grads, learning_rate=1.2):
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    
    # Retrieve each gradient from the dictionary "grads"
    dW1 = grads['dW1']
    db1 = grads['db1']
    dW2 = grads['dW2']
    db2 = grads['db2']
    
    # Update rule for each parameter
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2
    
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    
    return parameters
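
Every parameter follows the same rule: subtract the learning rate times its gradient. Because the gradient keys are simply the parameter names prefixed with "d", the update can also be written as a loop. This is a hypothetical compact variant (`update_parameters_loop` is not a name from the article), shown with made-up numbers so the arithmetic is easy to follow.

```python
import numpy as np

def update_parameters_loop(parameters, grads, learning_rate=1.2):
    # Hypothetical compact variant: theta <- theta - learning_rate * d(theta)
    return {name: value - learning_rate * grads["d" + name]
            for name, value in parameters.items()}

# Made-up values for illustration only.
params_demo = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads_demo = {"dW1": np.array([[0.1, -0.1]]), "db1": np.array([[0.5]])}

updated = update_parameters_loop(params_demo, grads_demo, learning_rate=1.0)
print(updated["W1"])  # each weight moves against its gradient
print(updated["b1"])
```

The default of 1.2 mirrors the learning rate in the listing above; a smaller value would take smaller steps per iteration.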

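Step 7: Build the neural network

Combine all the functions above to create the neural network model. In summary, the model function performs the following steps in order, repeating the last four on every iteration:

Initialize the parameters
Forward propagation
Compute the cost function
Backpropagation
Update the parameters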

def nn_model(X, Y, n_h, num_iterations=10000, print_cost=False):
    np.random.seed(3)
    # Layer sizes are read from the data: one input unit per row of X and
    # one output unit per row of Y (layer_sizes is not defined in this article).
    n_x = X.shape[0]
    n_y = Y.shape[0]
    
    # Initialize parameters. Inputs: "n_x, n_h, n_y". Outputs: "parameters".
    parameters = initialize_parameters(n_x, n_h, n_y)
    
    # Loop (gradient descent)
    for i in range(0, num_iterations):
         
        # Forward propagation. Inputs: "X, parameters". Outputs: "A2, cache".
        A2, cache = forward_propagation(X, parameters)
        
        # Cost function. Inputs: "A2, Y, parameters". Outputs: "cost".
        cost = compute_cost(A2, Y, parameters)
 
        # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
        grads = backward_propagation(parameters, cache, X, Y)
 
        # Gradient descent parameter update. Inputs: "parameters, grads". Outputs: "parameters".
        parameters = update_parameters(parameters, grads)
        
        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
    
    # Return only the learned parameters so they can be passed to predict()
    return parameters

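Step 8: Run the model

Set the number of hidden units to 6 and the maximum number of iterations to 10,000, and print the training cost every 1000 iterations: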

parameters = nn_model(X,Y , n_h = 6, num_iterations=10000, print_cost=True)

Step 9: Plot the decision boundary

def predict(parameters, X):
    # Not defined in the original listing: label a point 1 when the
    # sigmoid output A2 exceeds 0.5 (the threshold is an assumption).
    A2, cache = forward_propagation(X, parameters)
    return (A2 > 0.5).astype(int)

def plot_decision_boundary(model, X, y):
    # Set min and max values and give it some padding
    x_min, x_max = X[0, :].min() - 0.25, X[0, :].max() + 0.25
    y_min, y_max = X[1, :].min() - 0.25, X[1, :].max() + 0.25
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = model(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and training examples
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
    plt.ylabel('x2')
    plt.xlabel('x1')
    plt.scatter(X[0, :], X[1, :], c=y, cmap=plt.cm.Spectral)

plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y[0,:])
plt.title("Decision Boundary for hidden layer size " + str(6))
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.show()

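The plot shows that only four points are misclassified. We could tune the model further to improve training accuracy, but doing so would likely lead to overfitting.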
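
Rather than counting misclassified points by eye, the count can be computed from the predicted and true labels. The sketch below uses stand-in arrays, not actual model output: 100 labels (50 per class, as in the two-species subset) with four arbitrarily chosen predictions flipped. `training_accuracy` is a hypothetical helper name.

```python
import numpy as np

def training_accuracy(predictions, Y):
    """Return (accuracy, number of misclassified points) for 0/1 label arrays."""
    predictions = np.asarray(predictions).ravel()
    labels = np.asarray(Y).ravel()
    errors = int(np.sum(predictions != labels))
    return (labels.size - errors) / labels.size, errors

# Stand-in labels: 50 points per class, as in the two-species subset.
Y_demo = np.array([[0] * 50 + [1] * 50])

# Stand-in predictions: copy the labels and flip four arbitrary positions.
pred_demo = Y_demo.copy()
flipped = [3, 47, 52, 98]
pred_demo[0, flipped] = 1 - pred_demo[0, flipped]

acc, errors = training_accuracy(pred_demo, Y_demo)
print(errors, acc)  # 4 0.96
```

Four errors out of 100 points corresponds to a training accuracy of 96%. With a trained model, the same helper could be applied to the output of predict(parameters, X) and Y; scoring a held-out subset instead of the training data would give a more honest view of overfitting.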

Resources

https://www.coursera.org/specializations/deep-learning

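About the author

Rohan Joseph, data scientist
Personal page: https://www.linkedin.com/in/rohan-joseph-b39a86aa/
This article was translated by the Alibaba Cloud Yunqi Community.
Original title: "Neural network on iris data"; translator: 海棠; reviewer: Uncle_LLD.
This is an abridged translation; for more details, please see the original article.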
