TensorFlow 2.0 tf.keras Sequence Problems

Introduction: This section covers how to handle sequence problems. The dataset is the IMDB movie review dataset, which we load directly via keras.datasets.imdb. We then build a model to classify the reviews and deal with the resulting overfitting.

1. Data Preprocessing

from google.colab import drive
drive.mount('/content/gdrive')
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
data = keras.datasets.imdb
# Load the data
(x_train,y_train),(x_test,y_test)= data.load_data(num_words=10000)
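
The loaded reviews are lists of integer word indices rather than raw text. As a quick sanity check, a review can be decoded back to words (a minimal sketch; it assumes the usual IMDB convention that indices 0-2 are reserved for the padding, start-of-sequence and unknown tokens, so real words are offset by 3):

# Hedged sketch: decode the first training review back to words.
# The index offset of 3 follows the standard IMDB reserved-token convention.
word_index = data.get_word_index()
reverse_index = {i + 3: w for w, i in word_index.items()}
decoded = " ".join(reverse_index.get(i, "?") for i in x_train[0])
print(decoded[:200])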

We notice that the review sequences vary in length. To make them easier to feed into the network, we truncate the longer sequences and pad the shorter ones. This can be done with keras.preprocessing.sequence.pad_sequences.

print("截断前前5个样本序列长度:{}".format([len(x) for x in x_train[:5]]))
x_train = keras.preprocessing.sequence.pad_sequences(x_train,300)
x_test = keras.preprocessing.sequence.pad_sequences(x_test,300)
print("截断后前5个样本序列长度:{}".format([len(x) for x in x_test[:5]]))
截断前前5个样本序列长度:[218, 189, 141, 550, 147]
截断后前5个样本序列长度:[300, 300, 300, 300, 300]
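
A note on the call above: the positional 300 is the maxlen argument, and pad_sequences pads and truncates at the front of each sequence by default. Written out with explicit keyword arguments (an equivalent sketch of the same call; switching both arguments to "post" would keep the beginning of each review instead):

# Same call as above with the defaults spelled out explicitly.
x_train = keras.preprocessing.sequence.pad_sequences(
    x_train, maxlen=300, padding="pre", truncating="pre")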

2. Building the Model

model = keras.Sequential()
# Output shape is (None, 300, 50): each of the 300 token indices is mapped to a 50-dimensional embedding vector
model.add(layers.Embedding(10000,50,input_length=300))
model.add(layers.Flatten())
model.add(layers.Dense(128,activation= "relu"))
model.add(layers.Dense(1,activation="sigmoid"))
model.summary()
model.compile(optimizer = keras.optimizers.Adam(learning_rate=0.001),
              loss = "binary_crossentropy",
              metrics = ["acc"]
              )
history = model.fit(x_train,y_train,epochs=10,batch_size=256,validation_data=(x_test,y_test))
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 50)           500000    
_________________________________________________________________
flatten (Flatten)            (None, 15000)             0         
_________________________________________________________________
dense (Dense)                (None, 128)               1920128   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129       
=================================================================
Total params: 2,420,257
Trainable params: 2,420,257
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
98/98 [==============================] - 3s 33ms/step - loss: 0.5086 - acc: 0.7221 - val_loss: 0.2980 - val_acc: 0.8746
Epoch 2/10
98/98 [==============================] - 3s 32ms/step - loss: 0.1643 - acc: 0.9394 - val_loss: 0.3016 - val_acc: 0.8769
Epoch 3/10
98/98 [==============================] - 3s 32ms/step - loss: 0.0391 - acc: 0.9924 - val_loss: 0.3552 - val_acc: 0.8732
Epoch 4/10
98/98 [==============================] - 3s 33ms/step - loss: 0.0079 - acc: 0.9995 - val_loss: 0.3952 - val_acc: 0.8736
Epoch 5/10
98/98 [==============================] - 3s 34ms/step - loss: 0.0027 - acc: 0.9999 - val_loss: 0.4250 - val_acc: 0.8734
Epoch 6/10
98/98 [==============================] - 3s 34ms/step - loss: 0.0014 - acc: 1.0000 - val_loss: 0.4462 - val_acc: 0.8742
Epoch 7/10
98/98 [==============================] - 3s 32ms/step - loss: 8.4250e-04 - acc: 1.0000 - val_loss: 0.4639 - val_acc: 0.8747
Epoch 8/10
98/98 [==============================] - 3s 31ms/step - loss: 5.9368e-04 - acc: 1.0000 - val_loss: 0.4794 - val_acc: 0.8742
Epoch 9/10
98/98 [==============================] - 3s 33ms/step - loss: 4.3011e-04 - acc: 1.0000 - val_loss: 0.4925 - val_acc: 0.8744
Epoch 10/10
98/98 [==============================] - 3s 34ms/step - loss: 3.2738e-04 - acc: 1.0000 - val_loss: 0.5045 - val_acc: 0.8744

3. Model Analysis

import matplotlib.pyplot as plt
plt.plot(history.epoch, history.history.get("acc"),label="acc")
plt.plot(history.epoch, history.history.get("val_acc"),label="val_acc")
plt.legend()
plt.show()

import matplotlib.pyplot as plt
plt.plot(history.epoch, history.history.get("loss"),label="loss")
plt.plot(history.epoch, history.history.get("val_loss"),label="val_loss")
plt.legend()
plt.show()

The training accuracy is much higher than the validation accuracy, which is a clear sign of overfitting. We can mitigate it with dropout, L2 regularization, and similar techniques.
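
Besides dropout and L2 regularization, which the next section uses, another common remedy is early stopping: halt training once the validation loss stops improving. A minimal sketch with the standard Keras callback (not part of the original run):

# Hedged sketch: stop training when val_loss stops improving and
# roll back to the best weights observed so far.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=2,
                                           restore_best_weights=True)
history = model.fit(x_train, y_train, epochs=10, batch_size=256,
                    validation_data=(x_test, y_test), callbacks=[early_stop])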

4. Model Optimization

from tensorflow.keras import regularizers
model = keras.Sequential()
# Output shape is (None, 300, 50): each of the 300 token indices is mapped to a 50-dimensional embedding vector
model.add(layers.Embedding(10000,50,input_length=300,embeddings_regularizer=regularizers.l2(0.001)))
model.add(layers.GlobalAvgPool1D())
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(256,activation= "relu",kernel_regularizer=regularizers.l2(0.001)))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1,activation="sigmoid",kernel_regularizer=regularizers.l2(0.001)))
model.summary()
model.compile(optimizer = keras.optimizers.Adam(learning_rate=0.001),
              loss = "binary_crossentropy",
              metrics = ["acc"]
              )
history = model.fit(x_train,y_train,epochs=10,batch_size=256,validation_data=(x_test,y_test))
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 300, 50)           500000    
_________________________________________________________________
global_average_pooling1d (Gl (None, 50)                0         
_________________________________________________________________
batch_normalization (BatchNo (None, 50)                200       
_________________________________________________________________
dropout (Dropout)            (None, 50)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 256)               13056     
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 257       
=================================================================
Total params: 514,537
Trainable params: 513,925
Non-trainable params: 612
_________________________________________________________________
Epoch 1/10
98/98 [==============================] - 3s 26ms/step - loss: 0.9417 - acc: 0.6739 - val_loss: 0.8925 - val_acc: 0.5110
Epoch 2/10
98/98 [==============================] - 2s 24ms/step - loss: 0.5110 - acc: 0.8672 - val_loss: 0.8516 - val_acc: 0.5066
Epoch 3/10
98/98 [==============================] - 2s 25ms/step - loss: 0.4200 - acc: 0.8984 - val_loss: 0.8033 - val_acc: 0.6794
Epoch 4/10
98/98 [==============================] - 3s 26ms/step - loss: 0.3850 - acc: 0.9087 - val_loss: 0.7196 - val_acc: 0.8363
Epoch 5/10
98/98 [==============================] - 2s 24ms/step - loss: 0.3500 - acc: 0.9196 - val_loss: 0.6267 - val_acc: 0.7583
Epoch 6/10
98/98 [==============================] - 2s 25ms/step - loss: 0.3360 - acc: 0.9216 - val_loss: 0.5075 - val_acc: 0.8535
Epoch 7/10
98/98 [==============================] - 2s 25ms/step - loss: 0.3158 - acc: 0.9280 - val_loss: 0.4609 - val_acc: 0.8490
Epoch 8/10
98/98 [==============================] - 2s 24ms/step - loss: 0.3039 - acc: 0.9316 - val_loss: 0.4430 - val_acc: 0.8666
Epoch 9/10
98/98 [==============================] - 2s 24ms/step - loss: 0.2975 - acc: 0.9316 - val_loss: 0.4554 - val_acc: 0.8658
Epoch 10/10
98/98 [==============================] - 2s 24ms/step - loss: 0.2858 - acc: 0.9362 - val_loss: 0.4749 - val_acc: 0.8608

import matplotlib.pyplot as plt
plt.plot(history.epoch, history.history.get("acc"),label="acc")
plt.plot(history.epoch, history.history.get("val_acc"),label="val_acc")
plt.legend()
plt.show()

import matplotlib.pyplot as plt
plt.plot(history.epoch, history.history.get("loss"),label="loss")
plt.plot(history.epoch, history.history.get("val_loss"),label="val_loss")
plt.legend()
plt.show()

From the plots we can see that adding dropout, L2 regularization, and batch normalization helps suppress overfitting.
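
To put a single number on the regularized model, it can also be evaluated on the held-out data (a minimal sketch; the exact scores will differ slightly from run to run):

# Hedged sketch: evaluate the regularized model on the test set.
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print("test loss: {:.4f}, test acc: {:.4f}".format(test_loss, test_acc))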
