13 Neural Networks
13.1 History
| Year | Name | Contribution | Inventor |
|------|------|--------------|----------|
| 1943 | M-P model (McCulloch-Pitts neuron, MCP) | Abstract model of the brain neuron | American neuroanatomist Warren McCulloch and mathematician Walter Pitts |
| 1958 | Perceptron | The first perceptron learning rule | Computer scientist Frank Rosenblatt |
| 1969 | *Perceptrons* (book) | Exposed the weaknesses of the perceptron | Computer-science pioneer Marvin Minsky (co-authored with Seymour Papert) |
| 1986 | Back propagation (BP); Multilayer Perceptron (MLP) | The backpropagation algorithm for training multilayer perceptrons | Geoffrey Hinton, the "father of neural networks" |
13.2 The Relationship Between Artificial Intelligence, Machine Learning, and Deep Learning
13.3 Neural Network Concepts
The M-P model (McCulloch-Pitts neuron, MCP) is an abstract model of the brain neuron, proposed in 1943 by the American neuroanatomist Warren McCulloch and the mathematician Walter Pitts. A neuron is in one of two states:
- Excited: 1
- Inhibited: 0
Brain neural activity proceeds in 4 steps:
- External stimuli are converted into electrical signals by the nerve endings and transmitted to the neuron cells.
- Countless neurons together form the nerve center.
- The nerve center integrates the various signals and makes a judgment.
- The nerve center's instructions are sent to the parts of the body, which respond to the external stimulus.
The general formula of the linear model, for output node k as a weighted sum of the inputs plus a bias:
yk = wk1⋅x1 + wk2⋅x2 + ⋯ + wkn⋅xn + bk
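The M-P neuron combines this weighted sum with a threshold: it fires (1) if the sum reaches the threshold and stays inhibited (0) otherwise. A minimal sketch (the function name and example weights are illustrative, not from the text):

```python
import numpy as np

def mcp_neuron(x, w, threshold):
    """McCulloch-Pitts neuron: output 1 (excited) if the weighted sum
    of the inputs reaches the threshold, otherwise 0 (inhibited)."""
    return 1 if np.dot(w, x) >= threshold else 0

# Two inputs with unit weights and threshold 2 behave like a logical AND.
print(mcp_neuron([1, 1], [1, 1], 2))  # 1 (fires)
print(mcp_neuron([1, 0], [1, 1], 2))  # 0 (inhibited)
```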
The number of hidden layers is called the depth of the network. As a simplified example (all weights set to 1 and biases omitted; in a real network each term carries its own learned weight), a network with 4 inputs, two hidden layers of 3 nodes each, and one output computes:
Hidden layer 1:
h11 = x1 + x2 + x3 + x4
h12 = x1 + x2 + x3 + x4
h13 = x1 + x2 + x3 + x4
Hidden layer 2:
h21 = h11 + h12 + h13
h22 = h11 + h12 + h13
h23 = h11 + h12 + h13
Output:
y = h21 + h22 + h23
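The layer-by-layer computation above can be sketched as a forward pass, assuming (as in the simplified equations) unit weights and zero biases:

```python
import numpy as np

# Toy forward pass: 4 inputs, two hidden layers of 3 nodes, one output.
# All weights are 1 and biases 0, matching the simplified equations;
# a trained network would use learned weights instead.
x = np.array([1.0, 2.0, 3.0, 4.0])

h1 = np.full(3, x.sum())   # h11 = h12 = h13 = x1 + x2 + x3 + x4 = 10
h2 = np.full(3, h1.sum())  # h21 = h22 = h23 = 30
y = h2.sum()               # y = h21 + h22 + h23 = 90
print(y)  # 90.0
```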
13.4 Activation Functions (Rectification)
- identity: leaves the input unchanged, f(x) = x
- logistic: the logistic sigmoid, f(x) = 1/[1 + exp(−x)]
- tanh: the hyperbolic tangent, f(x) = tanh(x)
- relu: the rectified linear unit (the default), f(x) = max(0, x)
tanh squashes values into the range [-1, 1].
relu discards any x below 0, replacing it with 0.
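The four activation options can be implemented directly to see their effect (a sketch; scikit-learn applies them internally):

```python
import numpy as np

# The four activation functions offered by sklearn's MLP.
def identity(x): return x
def logistic(x): return 1.0 / (1.0 + np.exp(-x))
def tanh(x):     return np.tanh(x)
def relu(x):     return np.maximum(0, x)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))      # negative values replaced by 0: [0. 0. 2.]
print(tanh(x))      # squashed into (-1, 1)
print(logistic(x))  # squashed into (0, 1)
```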
13.5 Neural Network Classification
13.5.1 Class, Parameters, Attributes, and Methods
Class
class sklearn.neural_network.MLPClassifier(hidden_layer_sizes=100, activation='relu', *, solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08, n_iter_no_change=10, max_fun=15000)
Parameters

| Parameter | Description |
|-----------|-------------|
| hidden_layer_sizes | tuple, length = n_layers - 2, default=(100,). The ith element represents the number of neurons in the ith hidden layer. |
| solver | {'lbfgs', 'sgd', 'adam'}, default='adam'. The solver for weight optimization. 'lbfgs' is an optimizer in the family of quasi-Newton methods; 'sgd' refers to stochastic gradient descent; 'adam' refers to the stochastic gradient-based optimizer proposed by Diederik Kingma and Jimmy Ba. Note: in terms of both training time and validation score, the default solver 'adam' works quite well on relatively large datasets (thousands of training samples or more). For small datasets, however, 'lbfgs' can converge faster and perform better. |
| alpha | float, default=0.0001. L2 penalty (regularization term) parameter. |
| activation | {'identity', 'logistic', 'tanh', 'relu'}, default='relu'. Activation function for the hidden layers. 'identity', a no-op activation useful for implementing a linear bottleneck, returns f(x) = x; 'logistic', the logistic sigmoid function, returns f(x) = 1 / (1 + exp(-x)); 'tanh', the hyperbolic tangent function, returns f(x) = tanh(x); 'relu', the rectified linear unit function, returns f(x) = max(0, x). |
Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| classes_ | ndarray or list of ndarray of shape (n_classes,) | Class labels for each output. |
| loss_ | float | The current loss computed with the loss function. |
| best_loss_ | float | The minimum loss reached by the solver throughout fitting. |
| loss_curve_ | list of shape (n_iter_,) | The ith element in the list represents the loss at the ith iteration. |
| t_ | int | The number of training samples seen by the solver during fitting. |
| coefs_ | list of shape (n_layers - 1,) | The ith element in the list represents the weight matrix corresponding to layer i. |
| intercepts_ | list of shape (n_layers - 1,) | The ith element in the list represents the bias vector corresponding to layer i + 1. |
| n_iter_ | int | The number of iterations the solver has run. |
| n_layers_ | int | Number of layers. |
| n_outputs_ | int | Number of outputs. |
| out_activation_ | str | Name of the output activation function. |
Methods

| Method | Description |
|--------|-------------|
| fit(X, y) | Fit the model to data matrix X and target(s) y. |
| get_params([deep]) | Get the parameters of this estimator. |
| predict(X) | Predict using the multilayer perceptron model. |
| score(X, y[, sample_weight]) | Return the mean accuracy of the predictions (for the classifier this is accuracy, not the R² coefficient of determination used by the regressor). |
| set_params(**params) | Set the parameters of this estimator. |
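As a minimal usage sketch of the class and methods above (not one of the book's own examples; hyperparameters here are illustrative), on the iris data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer of 10 nodes; lbfgs converges well on small datasets.
mlp = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(10,),
                    activation='relu', max_iter=10000, random_state=0)
mlp.fit(X_train, y_train)

print("layers:", mlp.n_layers_)  # input + 1 hidden + output = 3
print("test accuracy: {:.2%}".format(mlp.score(X_test, y_test)))
```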
13.5.2 The Neural Network Classification Algorithm
import warnings
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
# util is the author's helper class for scoring and plotting,
# defined elsewhere in the book (not shown here).

def My_MLPClassifier(solver, hidden_layer_sizes, activation, level, alpha, mydata, title):
    warnings.filterwarnings("ignore")
    myutil = util()
    X, y = mydata.data, mydata.target
    X1 = X[:, :2]  # first two features only, for the scatter plot
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    mlp = MLPClassifier(solver=solver, hidden_layer_sizes=hidden_layer_sizes,
                        activation=activation, alpha=alpha, max_iter=10000)
    mlp.fit(X_train, y_train)
    mytitle = ("MLPClassifier(" + title + "):solver:" + solver +
               ",node:" + str(hidden_layer_sizes) + ",activation:" + activation +
               ",level=" + str(level) + ",alpha=" + str(alpha))
    myutil.print_scores(mlp, X_train, y_train, X_test, y_test, mytitle)
    myutil.plot_learning_curve(
        MLPClassifier(solver=solver, hidden_layer_sizes=hidden_layer_sizes,
                      activation=activation, alpha=alpha, max_iter=10000),
        X, y, mytitle)
    myutil.show_pic(mytitle)
    # Refit on the two-feature data to draw the decision boundary.
    mlp = MLPClassifier(solver=solver, hidden_layer_sizes=hidden_layer_sizes,
                        activation=activation, alpha=alpha, max_iter=10000).fit(X1, y)
    myutil.draw_scatter_for_clf(X1, y, mlp, mytitle)

def MLPClassifier_base():
    mydatas = [datasets.load_iris(), datasets.load_wine(), datasets.load_breast_cancer()]
    titles = ["iris data", "wine data", "breast cancer data"]
    for (mydata, title) in zip(mydatas, titles):
        ten = [10]
        hundred = [100]
        two_ten = [10, 10]
        Parameters = [['lbfgs', hundred, 'relu', 1, 0.0001],
                      ['lbfgs', ten, 'relu', 1, 0.0001],
                      ['lbfgs', two_ten, 'relu', 2, 0.0001],
                      ['lbfgs', two_ten, 'tanh', 2, 0.0001],
                      ['lbfgs', two_ten, 'tanh', 2, 1]]
        for Parameter in Parameters:
            My_MLPClassifier(Parameter[0], Parameter[1], Parameter[2],
                             Parameter[3], Parameter[4], mydata, title)
Output (for each configuration, the training score is printed first, then the test score):
MLPClassifier(iris data):solver:lbfgs,node:[100],activation:relu,level=1,alpha=0.0001: 100.00%
MLPClassifier(iris data):solver:lbfgs,node:[100],activation:relu,level=1,alpha=0.0001: 97.37%
MLPClassifier(iris data):solver:lbfgs,node:[10],activation:relu,level=1,alpha=0.0001: 100.00%
MLPClassifier(iris data):solver:lbfgs,node:[10],activation:relu,level=1,alpha=0.0001: 94.74%
MLPClassifier(iris data):solver:lbfgs,node:[10, 10],activation:relu,level=2,alpha=0.0001: 100.00%
MLPClassifier(iris data):solver:lbfgs,node:[10, 10],activation:relu,level=2,alpha=0.0001: 97.37%
MLPClassifier(iris data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=0.0001: 100.00%
MLPClassifier(iris data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=0.0001: 97.37%
MLPClassifier(iris data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=1: 99.11%
MLPClassifier(iris data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=1: 97.37%
MLPClassifier(wine data):solver:lbfgs,node:[100],activation:relu,level=1,alpha=0.0001: 99.25%
MLPClassifier(wine data):solver:lbfgs,node:[100],activation:relu,level=1,alpha=0.0001: 93.33%
MLPClassifier(wine data):solver:lbfgs,node:[10],activation:relu,level=1,alpha=0.0001: 100.00%
MLPClassifier(wine data):solver:lbfgs,node:[10],activation:relu,level=1,alpha=0.0001: 91.11%
MLPClassifier(wine data):solver:lbfgs,node:[10, 10],activation:relu,level=2,alpha=0.0001: 30.08%
MLPClassifier(wine data):solver:lbfgs,node:[10, 10],activation:relu,level=2,alpha=0.0001: 24.44%
MLPClassifier(wine data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=0.0001: 75.19%
MLPClassifier(wine data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=0.0001: 71.11%
MLPClassifier(wine data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=1: 100.00%
MLPClassifier(wine data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=1: 93.33%
MLPClassifier(breast cancer data):solver:lbfgs,node:[100],activation:relu,level=1,alpha=0.0001: 96.48%
MLPClassifier(breast cancer data):solver:lbfgs,node:[100],activation:relu,level=1,alpha=0.0001: 95.80%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10],activation:relu,level=1,alpha=0.0001: 62.68%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10],activation:relu,level=1,alpha=0.0001: 62.94%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10, 10],activation:relu,level=2,alpha=0.0001: 62.68%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10, 10],activation:relu,level=2,alpha=0.0001: 62.94%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=0.0001: 96.24%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=0.0001: 92.31%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=1: 99.53%
MLPClassifier(breast cancer data):solver:lbfgs,node:[10, 10],activation:tanh,level=2,alpha=1: 95.80%
| Dataset | solver | node | activation | level | alpha | Training score | Test score |
|---------|--------|------|------------|-------|-------|----------------|------------|
| Iris | lbfgs | [100] | relu | 1 | 0.0001 | 100.00% | 97.37% |
| | lbfgs | [10] | relu | 1 | 0.0001 | 100.00% | 94.74% |
| | lbfgs | [10,10] | relu | 2 | 0.0001 | 100.00% | 97.37% |
| | lbfgs | [10,10] | tanh | 2 | 0.0001 | 100.00% | 97.37% |
| | lbfgs | [10,10] | tanh | 2 | 1 | 99.11% | 97.37% |
| Wine | lbfgs | [100] | relu | 1 | 0.0001 | 99.25% | 93.33% |
| | lbfgs | [10] | relu | 1 | 0.0001 | 100.00% | 91.11% |
| | lbfgs | [10,10] | relu | 2 | 0.0001 | 30.08% | 24.44% |
| | lbfgs | [10,10] | tanh | 2 | 0.0001 | 75.19% | 71.11% |
| | lbfgs | [10,10] | tanh | 2 | 1 | 100.00% | 93.33% |
| Breast cancer | lbfgs | [100] | relu | 1 | 0.0001 | 96.48% | 95.80% |
| | lbfgs | [10] | relu | 1 | 0.0001 | 62.68% | 62.94% |
| | lbfgs | [10,10] | relu | 2 | 0.0001 | 62.68% | 62.94% |
| | lbfgs | [10,10] | tanh | 2 | 0.0001 | 96.24% | 92.31% |
| | lbfgs | [10,10] | tanh | 2 | 1 | 99.53% | 95.80% |
13.5.3 Case Study: Handwritten Digit Recognition
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def writing():
    mnist = datasets.fetch_openml('mnist_784')
    print("Number of samples: {}, number of features: {}".format(
        mnist.data.shape[0], mnist.data.shape[1]))
    X = mnist.data / 255  # scale pixel values to [0, 1]
    y = mnist.target
    # 10000 training samples, 5000 test samples
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=10000, test_size=5000, random_state=62)
    # Two hidden layers of 500 nodes each.
    mlp = MLPClassifier(solver='lbfgs', hidden_layer_sizes=[500, 500],
                        activation='tanh', alpha=1e-5, random_state=62)
    mlp.fit(X_train, y_train)
    print("Test set score: {:.2%}".format(mlp.score(X_test, y_test)))
    # Load a hand-drawn digit and convert it to a 28x28 grayscale array.
    image = Image.open('4.png').convert('F')
    image = image.resize((28, 28))  # resize() returns a new image
    arr = []
    for i in range(28):
        for j in range(28):
            pixel = 1.0 - float(image.getpixel((j, i))) / 255.
            arr.append(pixel)
    arr1 = np.array(arr).reshape(1, -1)
    # Recognize the image.
    plt.imshow(image)
    plt.show()
    print('The digit in the picture is: {}'.format(mlp.predict(arr1)[0]))
Output
Number of samples: 70000, number of features: 784
Test set score: 95.16%
The digit in the picture is: 4