TensorFlow 2.0 tf.keras Sequence Problems

Introduction: This section focuses on handling sequence problems, using the IMDB movie-review dataset loaded directly via keras.datasets.imdb. We build a model to classify the reviews, and then address the resulting overfitting.

1. Data Preprocessing

from google.colab import drive
drive.mount('/content/gdrive')

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

data = keras.datasets.imdb
# Load the data, keeping only the 10,000 most frequent words
(x_train, y_train), (x_test, y_test) = data.load_data(num_words=10000)
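
To sanity-check what load_data returns, we can map the integer indices back to words. A minimal sketch (not part of the original post): keras.datasets.imdb.get_word_index() returns a {word: index} dict, and the dataset's indices are offset by 3 (0 = padding, 1 = start, 2 = unknown).

# Decode the first review back into (approximate) English text.
word_index = data.get_word_index()
reverse_index = {i + 3: w for w, i in word_index.items()}
reverse_index.update({0: "<pad>", 1: "<start>", 2: "<unk>"})
print(" ".join(reverse_index.get(i, "<unk>") for i in x_train[0][:20]))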

We find that the reviews vary in length. To make them easier for the network below to handle, we truncate the longer sequences and pad the shorter ones to a fixed length using keras.preprocessing.sequence.pad_sequences.

print("截断前前5个样本序列长度:{}".format([len(x) for x in x_train[:5]]))
x_train = keras.preprocessing.sequence.pad_sequences(x_train,300)
x_test = keras.preprocessing.sequence.pad_sequences(x_test,300)
print("截断后前5个样本序列长度:{}".format([len(x) for x in x_test[:5]]))
截断前前5个样本序列长度:[218, 189, 141, 550, 147]
截断后前5个样本序列长度:[300, 300, 300, 300, 300]
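One detail worth knowing: pad_sequences both pads and truncates at the beginning of each sequence by default (padding='pre', truncating='pre'). To keep the start of each review instead, the direction can be set explicitly. A hypothetical variant of the call above, where raw_train stands for the unpadded lists returned by data.load_data():

# Hypothetical variant: pad/truncate at the end, keeping each review's start.
x_train_post = keras.preprocessing.sequence.pad_sequences(
    raw_train, maxlen=300, padding="post", truncating="post")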

2. Building the Model

model = keras.Sequential()
# Output shape is (None, 300, 50): each of the 300 tokens is mapped
# to a 50-dimensional embedding vector
model.add(layers.Embedding(10000, 50, input_length=300))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation="relu"))
model.add(layers.Dense(1, activation="sigmoid"))
model.summary()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy",
              metrics=["acc"])
history = model.fit(x_train, y_train, epochs=10, batch_size=256,
                    validation_data=(x_test, y_test))
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 50)           500000    
_________________________________________________________________
flatten (Flatten)            (None, 15000)             0         
_________________________________________________________________
dense (Dense)                (None, 128)               1920128   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129       
=================================================================
Total params: 2,420,257
Trainable params: 2,420,257
Non-trainable params: 0
_________________________________________________________________
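
The parameter counts follow directly from the layer shapes: the Embedding layer holds 10000 × 50 = 500,000 weights, Flatten produces a 300 × 50 = 15,000-dimensional vector, the first Dense layer needs 15000 × 128 + 128 = 1,920,128 weights plus biases, and the output layer needs 128 + 1 = 129, for 2,420,257 in total.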
Epoch 1/10
98/98 [==============================] - 3s 33ms/step - loss: 0.5086 - acc: 0.7221 - val_loss: 0.2980 - val_acc: 0.8746
Epoch 2/10
98/98 [==============================] - 3s 32ms/step - loss: 0.1643 - acc: 0.9394 - val_loss: 0.3016 - val_acc: 0.8769
Epoch 3/10
98/98 [==============================] - 3s 32ms/step - loss: 0.0391 - acc: 0.9924 - val_loss: 0.3552 - val_acc: 0.8732
Epoch 4/10
98/98 [==============================] - 3s 33ms/step - loss: 0.0079 - acc: 0.9995 - val_loss: 0.3952 - val_acc: 0.8736
Epoch 5/10
98/98 [==============================] - 3s 34ms/step - loss: 0.0027 - acc: 0.9999 - val_loss: 0.4250 - val_acc: 0.8734
Epoch 6/10
98/98 [==============================] - 3s 34ms/step - loss: 0.0014 - acc: 1.0000 - val_loss: 0.4462 - val_acc: 0.8742
Epoch 7/10
98/98 [==============================] - 3s 32ms/step - loss: 8.4250e-04 - acc: 1.0000 - val_loss: 0.4639 - val_acc: 0.8747
Epoch 8/10
98/98 [==============================] - 3s 31ms/step - loss: 5.9368e-04 - acc: 1.0000 - val_loss: 0.4794 - val_acc: 0.8742
Epoch 9/10
98/98 [==============================] - 3s 33ms/step - loss: 4.3011e-04 - acc: 1.0000 - val_loss: 0.4925 - val_acc: 0.8744
Epoch 10/10
98/98 [==============================] - 3s 34ms/step - loss: 3.2738e-04 - acc: 1.0000 - val_loss: 0.5045 - val_acc: 0.8744

3. Model Analysis

import matplotlib.pyplot as plt
# Training vs. validation accuracy per epoch
plt.plot(history.epoch, history.history.get("acc"), label="acc")
plt.plot(history.epoch, history.history.get("val_acc"), label="val_acc")
plt.legend()
plt.show()

import matplotlib.pyplot as plt
# Training vs. validation loss per epoch
plt.plot(history.epoch, history.history.get("loss"), label="loss")
plt.plot(history.epoch, history.history.get("val_loss"), label="val_loss")
plt.legend()
plt.show()

The training accuracy is far higher than the validation accuracy: the model is overfitting. We can mitigate this with dropout, L2 regularization, and similar techniques, as sketched below and in the next section.
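
Besides the architectural changes in the next section, a simple complementary measure is to stop training once the validation loss stops improving. A minimal sketch (not in the original post), assuming a freshly built model like the one above, using the standard keras.callbacks.EarlyStopping class:

# Stop once val_loss has not improved for 2 epochs; restore the best weights.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss",
                                           patience=2,
                                           restore_best_weights=True)
history = model.fit(x_train, y_train, epochs=10, batch_size=256,
                    validation_data=(x_test, y_test),
                    callbacks=[early_stop])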

4. Model Optimization

from tensorflow.keras import regularizers

model = keras.Sequential()
# Output shape is (None, 300, 50): each of the 300 tokens is mapped
# to a 50-dimensional embedding vector
model.add(layers.Embedding(10000, 50, input_length=300,
                           embeddings_regularizer=regularizers.l2(0.001)))
# Average over the time dimension: (None, 300, 50) -> (None, 50),
# giving far fewer downstream parameters than Flatten
model.add(layers.GlobalAvgPool1D())
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(256, activation="relu",
                       kernel_regularizer=regularizers.l2(0.001)))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation="sigmoid",
                       kernel_regularizer=regularizers.l2(0.001)))
model.summary()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy",
              metrics=["acc"])
history = model.fit(x_train, y_train, epochs=10, batch_size=256,
                    validation_data=(x_test, y_test))
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 300, 50)           500000    
_________________________________________________________________
global_average_pooling1d (Gl (None, 50)                0         
_________________________________________________________________
batch_normalization (BatchNo (None, 50)                200       
_________________________________________________________________
dropout (Dropout)            (None, 50)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 256)               13056     
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 257       
=================================================================
Total params: 514,537
Trainable params: 513,925
Non-trainable params: 612
_________________________________________________________________
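
The BatchNormalization counts come from 4 parameters per feature (trainable scale γ and offset β, plus non-trainable moving mean and variance): 4 × 50 = 200 and 4 × 256 = 1,024. The 612 non-trainable parameters are exactly the moving statistics, 2 × 50 + 2 × 256.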
Epoch 1/10
98/98 [==============================] - 3s 26ms/step - loss: 0.9417 - acc: 0.6739 - val_loss: 0.8925 - val_acc: 0.5110
Epoch 2/10
98/98 [==============================] - 2s 24ms/step - loss: 0.5110 - acc: 0.8672 - val_loss: 0.8516 - val_acc: 0.5066
Epoch 3/10
98/98 [==============================] - 2s 25ms/step - loss: 0.4200 - acc: 0.8984 - val_loss: 0.8033 - val_acc: 0.6794
Epoch 4/10
98/98 [==============================] - 3s 26ms/step - loss: 0.3850 - acc: 0.9087 - val_loss: 0.7196 - val_acc: 0.8363
Epoch 5/10
98/98 [==============================] - 2s 24ms/step - loss: 0.3500 - acc: 0.9196 - val_loss: 0.6267 - val_acc: 0.7583
Epoch 6/10
98/98 [==============================] - 2s 25ms/step - loss: 0.3360 - acc: 0.9216 - val_loss: 0.5075 - val_acc: 0.8535
Epoch 7/10
98/98 [==============================] - 2s 25ms/step - loss: 0.3158 - acc: 0.9280 - val_loss: 0.4609 - val_acc: 0.8490
Epoch 8/10
98/98 [==============================] - 2s 24ms/step - loss: 0.3039 - acc: 0.9316 - val_loss: 0.4430 - val_acc: 0.8666
Epoch 9/10
98/98 [==============================] - 2s 24ms/step - loss: 0.2975 - acc: 0.9316 - val_loss: 0.4554 - val_acc: 0.8658
Epoch 10/10
98/98 [==============================] - 2s 24ms/step - loss: 0.2858 - acc: 0.9362 - val_loss: 0.4749 - val_acc: 0.8608
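
To reduce the per-epoch log to a single held-out figure, we could also evaluate the trained model directly (a small addition, not in the original post):

# Evaluate the regularized model on the test set.
test_loss, test_acc = model.evaluate(x_test, y_test, batch_size=256, verbose=0)
print("test loss: {:.4f}, test acc: {:.4f}".format(test_loss, test_acc))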
import matplotlib.pyplot as plt
# Training vs. validation accuracy per epoch
plt.plot(history.epoch, history.history.get("acc"), label="acc")
plt.plot(history.epoch, history.history.get("val_acc"), label="val_acc")
plt.legend()
plt.show()

import matplotlib.pyplot as plt
# Training vs. validation loss per epoch
plt.plot(history.epoch, history.history.get("loss"), label="loss")
plt.plot(history.epoch, history.history.get("val_loss"), label="val_loss")
plt.legend()
plt.show()

From the plots we can see that adding dropout, L2 regularization, and batch normalization all help suppress overfitting.
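
Finally, any new input must go through the same preprocessing as the training data (integer encoding with the num_words=10000 vocabulary, then pad_sequences to length 300) before prediction. A minimal sketch using a few already-padded test samples:

# Predict sentiment probabilities for three padded reviews.
sample = x_test[:3]
probs = model.predict(sample)           # sigmoid outputs in [0, 1]
labels = (probs > 0.5).astype("int32")  # 1 = positive, 0 = negative
print(probs.ravel(), labels.ravel())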
