DL/DNN: Training a custom MultiLayerNet (5*100 + ReLU) on MNIST with four optimizers (SGD / Momentum / AdaGrad / Adam) and comparing their performance

Introduction: This post trains a custom MultiLayerNet (5*100 + ReLU) on the MNIST dataset with four optimization methods (SGD, Momentum, AdaGrad, Adam) and compares how quickly each method drives down the training loss.

Output Results

[Figure: training-loss comparison of SGD / Momentum / AdaGrad / Adam]

===========iteration:0===========

SGD:2.289282108880558

Momentum:2.2858501933777964

AdaGrad:2.135969407893337

Adam:2.2214629551644443

===========iteration:100===========

SGD:1.549948593098733

Momentum:0.2630614409487161

AdaGrad:0.1280980906681204

Adam:0.21268580798960957

===========iteration:200===========

SGD:0.7668413651485669

Momentum:0.19974263379725932

AdaGrad:0.0688320187945635

Adam:0.12737004371824456

===========iteration:300===========

SGD:0.46630711328743457

Momentum:0.17680542175883507

AdaGrad:0.0580940990397764

Adam:0.12930303058268838

===========iteration:400===========

SGD:0.34526365067568743

Momentum:0.08914404106297127

AdaGrad:0.038093353912494965

Adam:0.06415424083978832

===========iteration:500===========

SGD:0.3588584559967853

Momentum:0.1299949652623088

AdaGrad:0.040978421988412894

Adam:0.058780880102566074

===========iteration:600===========

SGD:0.38273120367667224

Momentum:0.14074766142608885

AdaGrad:0.08641723451090685

Adam:0.11339321858037713

===========iteration:700===========

SGD:0.381094901742027

Momentum:0.1566582072807326

AdaGrad:0.08844650332208387

Adam:0.10485802139218811

===========iteration:800===========

SGD:0.25722603754213674

Momentum:0.07897119725740888

AdaGrad:0.04960128385990466

Adam:0.0835996553542796

===========iteration:900===========

SGD:0.33273148769731326

Momentum:0.19612162874621766

AdaGrad:0.03441995281224886

Adam:0.12248261979926914

===========iteration:1000===========

SGD:0.26394416793465253

Momentum:0.10157776537129978

AdaGrad:0.04761303979039287

Adam:0.046994040537976525

===========iteration:1100===========

SGD:0.23894569840123672

Momentum:0.09093030644899333

AdaGrad:0.07018006635107976

Adam:0.07879622117292093

===========iteration:1200===========

SGD:0.24382935069334477

Momentum:0.08324889705863456

AdaGrad:0.04484659272127939

Adam:0.0719509559060747

===========iteration:1300===========

SGD:0.21307958354960485

Momentum:0.07030166296163001

AdaGrad:0.022552468995955182

Adam:0.049860815437560935

===========iteration:1400===========

SGD:0.3110486414209358

Momentum:0.13117004626934742

AdaGrad:0.07351569965620054

Adam:0.09723751626189574

===========iteration:1500===========

SGD:0.2087589466947655

Momentum:0.09088929766254576

AdaGrad:0.027825434320282873

Adam:0.06352715244823183

===========iteration:1600===========

SGD:0.12783635178644553

Momentum:0.053366262737818

AdaGrad:0.012093087503155344

Adam:0.021385013278486315

===========iteration:1700===========

SGD:0.21476134194349975

Momentum:0.08453161462373757

AdaGrad:0.054955557126319256

Adam:0.035257261368372185

===========iteration:1800===========

SGD:0.3415964018415049

Momentum:0.13866704706781385

AdaGrad:0.04585298765046911

Adam:0.06437669858445684

===========iteration:1900===========

SGD:0.13530674587479818

Momentum:0.03958142222010819

AdaGrad:0.019096102635470277

Adam:0.02185864115092371


 

Design Approach

[Figure: design flow diagram]

 

Core Code

# T1. SGD: vanilla stochastic gradient descent, params <- params - lr * grads
import numpy as np

class SGD:
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        # params and grads are dicts of NumPy arrays keyed by parameter name
        for key in params.keys():
            params[key] -= self.lr * grads[key]
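All four optimizer classes share the same interface: update(params, grads) modifies a dict of NumPy arrays in place. A minimal usage sketch with made-up toy values, shown only to illustrate that interface:

# toy example (hypothetical values): one weight matrix and one bias vector
params = {'W1': np.array([[0.5, -0.2], [0.1, 0.3]]), 'b1': np.array([0.0, 0.0])}
grads  = {'W1': np.array([[0.1,  0.0], [0.0, 0.2]]), 'b1': np.array([0.05, -0.05])}

opt = SGD(lr=0.1)
opt.update(params, grads)
print(params['W1'])   # W1 is now [[0.49, -0.2], [0.1, 0.28]]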

# T2. Momentum: keep a velocity v; v <- momentum * v - lr * grads, then params <- params + v
class Momentum:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.v = None

    def update(self, params, grads):
        if self.v is None:
            # lazily initialize one velocity array per parameter
            self.v = {}
            for key, val in params.items():
                self.v[key] = np.zeros_like(val)

        for key in params.keys():
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]

# T3. AdaGrad: accumulate squared gradients in h and shrink the step by 1/sqrt(h)
class AdaGrad:
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None

    def update(self, params, grads):
        if self.h is None:
            self.h = {}
            for key, val in params.items():
                self.h[key] = np.zeros_like(val)

        for key in params.keys():
            self.h[key] += grads[key] * grads[key]
            # 1e-7 guards against division by zero
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)

# T4. Adam: Momentum-style first-moment estimate m plus AdaGrad/RMSProp-style second-moment estimate v
class Adam:
    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999):
        self.lr = lr
        self.beta1 = beta1
        self.beta2 = beta2
        self.iter = 0
        self.m = None
        self.v = None

    def update(self, params, grads):
        if self.m is None:
            self.m, self.v = {}, {}
            for key, val in params.items():
                self.m[key] = np.zeros_like(val)
                self.v[key] = np.zeros_like(val)

        self.iter += 1
        # bias-corrected learning rate for the current step
        lr_t = self.lr * np.sqrt(1.0 - self.beta2**self.iter) / (1.0 - self.beta1**self.iter)

        for key in params.keys():
            # exponential moving averages of the gradient and of its square
            self.m[key] += (1 - self.beta1) * (grads[key] - self.m[key])
            self.v[key] += (1 - self.beta2) * (grads[key]**2 - self.v[key])
            params[key] -= lr_t * self.m[key] / (np.sqrt(self.v[key]) + 1e-7)
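The comparison script below assumes the MNIST data, the mini-batch settings, and one optimizer instance per method have already been prepared. A minimal setup sketch, assuming a load_mnist loader and the custom MultiLayerNet class from the accompanying project; the module paths and batch_size value are assumptions, and max_iterations = 2000 is inferred from the log above (printed every 100 iterations from 0 to 1900):

from dataset.mnist import load_mnist              # assumed module path for the MNIST loader
from common.multi_layer_net import MultiLayerNet  # assumed module path for the custom network

(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True)
train_size = x_train.shape[0]
batch_size = 128         # assumed mini-batch size
max_iterations = 2000    # consistent with the printed iterations 0..1900

# one optimizer instance per method, keyed by the name used in the printout
optimizers = {'SGD': SGD(), 'Momentum': Momentum(), 'AdaGrad': AdaGrad(), 'Adam': Adam()}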

# one network and one loss history per optimizer; all are trained on the same mini-batches
networks = {}
train_loss = {}
for key in optimizers.keys():
    networks[key] = MultiLayerNet(input_size=784, hidden_size_list=[10, 10, 10, 10], output_size=10)
    train_loss[key] = []

for i in range(max_iterations):
    # draw a random mini-batch from the training set
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]

    # one gradient/update step per optimizer on the same mini-batch
    for key in optimizers.keys():
        grads = networks[key].gradient(x_batch, t_batch)
        optimizers[key].update(networks[key].params, grads)

        loss = networks[key].loss(x_batch, t_batch)
        train_loss[key].append(loss)

    # print the current mini-batch loss of each method every 100 iterations
    if i % 100 == 0:
        print("===========" + "iteration:" + str(i) + "===========")
        for key in optimizers.keys():
            loss = networks[key].loss(x_batch, t_batch)
            print(key + ":" + str(loss))
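To reproduce a comparison figure like the one referenced under Output Results, the recorded losses can be plotted per optimizer. A plotting sketch; the moving-average smoothing and the marker choices are assumptions made only for readability:

import matplotlib.pyplot as plt

def smooth_curve(x, window=10):
    # simple moving average, purely to make the curves easier to read
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode='valid')

markers = {'SGD': 'o', 'Momentum': 'x', 'AdaGrad': 's', 'Adam': 'D'}
for key in optimizers.keys():
    plt.plot(smooth_curve(train_loss[key]), marker=markers[key], markevery=100, label=key)
plt.xlabel("iterations")
plt.ylabel("loss")
plt.ylim(0, 1)
plt.legend()
plt.show()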

