RNN Heart Disease Prediction - Alibaba Cloud Developer Community

RNN Heart Disease Prediction

2023-01-09 248 Posted from Shandong

Summary: RNN heart disease prediction

I. Preparation

1. Set Up the GPU

import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    gpu0 = gpus[0]                                        # If there are multiple GPUs, use only GPU 0
    tf.config.experimental.set_memory_growth(gpu0, True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0], "GPU")
gpus

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

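2. Import the Data

🥂 About the data:

age: 1) age

sex: 2) sex

cp: 3) chest pain type (4 values)

trestbps: 4) resting blood pressure

chol: 5) serum cholesterol (mg/dl)

fbs: 6) fasting blood sugar > 120 mg/dl

restecg: 7) resting electrocardiogram result (values 0, 1, 2)

thalach: 8) maximum heart rate achieved

exang: 9) exercise-induced angina

oldpeak: 10) ST depression induced by exercise relative to rest

slope: 11) slope of the peak exercise ST segment

ca: 12) number of major vessels colored by fluoroscopy (0-3)

thal: 13) 0 = normal; 1 = fixed defect; 2 = reversible defect

target: 14) 0 = lower chance of a heart attack; 1 = higher chance of a heart attack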

import pandas as pd
import numpy as np
df = pd.read_csv("heart.csv")
df

3. Check the Data

# Check for missing values
df.isnull().sum()

age         0
sex         0
cp          0
trestbps    0
chol        0
fbs         0
restecg     0
thalach     0
exang       0
oldpeak     0
slope       0
ca          0
thal        0
target      0
dtype: int64

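II. Data Preprocessing

1. Split into Training and Test Sets

🍺 How the test set relates to the validation set:

The validation set takes no part in gradient descent during training. In the narrow sense, it does not contribute to updating the model's parameters.

In a broader sense, however, the validation set does take part in a kind of "manual tuning": after each epoch we look at how the model performs on the validation data to decide whether to stop training early, or we use the trend in that performance to adjust hyperparameters such as the learning rate and batch_size.

So one can say that the validation set also participates in training, but without making the model overfit the validation set.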
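The early-stopping decision described above can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the tutorial: the function name `early_stop_epoch`, the `patience` parameter, and the loss values are all hypothetical.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: improvement stalls after epoch 3.
val_losses = [0.70, 0.55, 0.48, 0.49, 0.50, 0.51, 0.47]
stop_epoch = early_stop_epoch(val_losses, patience=3)
print(stop_epoch)
```

In Keras the same idea is available as the `tf.keras.callbacks.EarlyStopping` callback, which can be passed to `model.fit` through its `callbacks` argument.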

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
X = df.iloc[:,:-1]
y = df.iloc[:,-1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.1, random_state = 1)

X_train.shape, y_train.shape

((272, 13), (272,))
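The shape above is consistent with a dataset of 303 rows. That row count is an inference and is not stated in the text. With a fractional `test_size`, `train_test_split` rounds the test-set size up and gives the remaining rows to the training set. The sketch below reproduces that arithmetic; `split_sizes` is a hypothetical helper, not a scikit-learn function.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size."""
    n_test = math.ceil(test_size * n_samples)  # test size is rounded up
    n_train = n_samples - n_test               # the rest goes to training
    return n_train, n_test


# 303 rows is an assumption inferred from the (272, 13) shape above.
n_train, n_test = split_sizes(303, 0.1)
print(n_train, n_test)
```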

2. Standardization

# Standardize each feature column to zero mean and unit variance; note that standardization is applied column by column
sc      = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test  = sc.transform(X_test)
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test  = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
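To make the two steps above concrete, here is a plain-Python sketch of what they compute: column statistics are fitted on the training rows only, the same statistics are reused for the test rows, and each row is then given a trailing axis of length 1. The helper names and the sample values are hypothetical, and the sketch assumes no column is constant (which would give a standard deviation of zero).

```python
import statistics


def fit_column_stats(rows):
    """Compute the mean and population standard deviation of each column."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply (value - mean) / std column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def add_feature_axis(rows):
    """Turn each row [a, b, ...] into [[a], [b], ...]: shape (n, features, 1)."""
    return [[[v] for v in row] for row in rows]


# Made-up values for two features.
train_rows = [[50.0, 120.0], [60.0, 140.0], [70.0, 160.0]]
test_rows = [[60.0, 150.0]]

means, stds = fit_column_stats(train_rows)            # fit on training data only
train_scaled = standardize(train_rows, means, stds)
test_scaled = standardize(test_rows, means, stds)     # reuse training statistics
train_3d = add_feature_axis(train_scaled)
print(means, test_scaled)
```

Fitting on the training rows and only transforming the test rows keeps information about the test set out of the preprocessing step. The population standard deviation is used here because `StandardScaler` divides by the standard deviation computed with zero degrees of freedom.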

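III. Build the RNN Model

⭐ Function prototype

tf.keras.layers.SimpleRNN(
    units,
    activation='tanh',
    use_bias=True,
    kernel_initializer='glorot_uniform',
    recurrent_initializer='orthogonal',
    bias_initializer='zeros',
    kernel_regularizer=None,
    recurrent_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    recurrent_constraint=None,
    bias_constraint=None,
    dropout=0.0,
    recurrent_dropout=0.0,
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    stateful=False,
    unroll=False,
    **kwargs
)

Key parameters

● units: positive integer, the dimensionality of the output space.

● activation: the activation function to use. Default: hyperbolic tangent (tanh). If None is passed, no activation is applied (that is, linear activation: a(x) = x).

● use_bias: Boolean, whether the layer uses a bias vector.

● kernel_initializer: initializer for the kernel weight matrix, used for the linear transformation of the inputs (see initializers).

● recurrent_initializer: initializer for the recurrent_kernel weight matrix, used for the linear transformation of the recurrent state (see initializers).

● bias_initializer: initializer for the bias vector (see initializers).

● dropout: float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.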
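The roles of `kernel`, `recurrent_kernel`, and the bias can be illustrated with a plain-Python sketch of the recurrence that a simple RNN layer evaluates, h_t = activation(x_t · kernel + h_(t-1) · recurrent_kernel + bias). This is not the Keras implementation: the function name is hypothetical, and the weights are drawn uniformly at random for brevity instead of using the glorot_uniform and orthogonal initializers.

```python
import math
import random


def simple_rnn_forward(sequence, kernel, recurrent_kernel, bias, activation=math.tanh):
    """Return the final hidden state, as a layer with return_sequences=False would.

    sequence:         list of time steps, each a list of input_dim values
    kernel:           input_dim x units weights applied to the input
    recurrent_kernel: units x units weights applied to the previous state
    bias:             list of units values
    """
    units = len(bias)
    state = [0.0] * units  # the initial state is all zeros
    for x_t in sequence:
        new_state = []
        for j in range(units):
            total = bias[j]
            total += sum(x_t[i] * kernel[i][j] for i in range(len(x_t)))
            total += sum(state[k] * recurrent_kernel[k][j] for k in range(units))
            new_state.append(activation(total))
        state = new_state
    return state


random.seed(0)
input_dim, units = 1, 4
kernel = [[random.uniform(-0.5, 0.5) for _ in range(units)] for _ in range(input_dim)]
recurrent_kernel = [[random.uniform(-0.5, 0.5) for _ in range(units)] for _ in range(units)]
bias = [0.0] * units

# 13 time steps with one value each, mirroring an input of shape (13, 1).
sequence = [[0.1 * t] for t in range(13)]
final_state = simple_rnn_forward(sequence, kernel, recurrent_kernel, bias)
print(len(final_state))
```

The output has one value per unit, which is why `units` is described as the dimensionality of the output space.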

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN
model = Sequential()
# Each sample is treated as a sequence of 13 time steps (one per feature) with 1 value per step
model.add(SimpleRNN(200, input_shape=(13, 1), activation='relu'))
model.add(Dense(100, activation='relu'))
# A single sigmoid unit outputs the probability of the positive class
model.add(Dense(1, activation='sigmoid'))
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn (SimpleRNN)       (None, 200)               40400     
_________________________________________________________________
dense (Dense)                (None, 100)               20100     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 101       
=================================================================
Total params: 60,601
Trainable params: 60,601
Non-trainable params: 0
________________________________________________________________
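The parameter counts in the summary can be checked by hand. A simple RNN layer has an input kernel, a recurrent kernel, and a bias; a dense layer has a weight matrix and a bias. The helper names below are hypothetical and exist only for this check.

```python
def simple_rnn_params(input_dim, units):
    # kernel (input_dim x units) + recurrent kernel (units x units) + bias (units)
    return input_dim * units + units * units + units


def dense_params(inputs, units):
    # weight matrix (inputs x units) + bias (units)
    return inputs * units + units


rnn = simple_rnn_params(input_dim=1, units=200)
hidden = dense_params(200, 100)
output = dense_params(100, 1)
total = rnn + hidden + output
print(rnn, hidden, output, total)  # 40400 20100 101 60601
```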

IV. Compile the Model

opt = tf.keras.optimizers.Adam(learning_rate=1e-4)
model.compile(loss='binary_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
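For intuition about the `binary_crossentropy` loss, the following plain-Python sketch computes the mean binary cross-entropy over a batch. It is an illustration rather than the Keras implementation; the function name, the sample values, and the clipping constant `eps` (added here for numerical safety) are assumptions.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Confident and correct predictions give a small loss...
loss_good = binary_crossentropy([1, 0], [0.9, 0.1])
# ...while confident and wrong predictions are penalized heavily.
loss_bad = binary_crossentropy([1, 0], [0.1, 0.9])
print(loss_good, loss_bad)
```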

V. Train the Model

epochs = 100
history = model.fit(X_train, y_train, 
                    epochs=epochs, 
                    batch_size=128, 
                    validation_data=(X_test, y_test),
                    verbose=1)

Epoch 97/100
3/3 [==============================] - 0s 25ms/step - loss: 0.2783 - accuracy: 0.8929 - val_loss: 0.3175 - val_accuracy: 0.8710
Epoch 98/100
3/3 [==============================] - 0s 25ms/step - loss: 0.2559 - accuracy: 0.9036 - val_loss: 0.3163 - val_accuracy: 0.8710
Epoch 99/100
3/3 [==============================] - 0s 24ms/step - loss: 0.2658 - accuracy: 0.8863 - val_loss: 0.3143 - val_accuracy: 0.9032
Epoch 100/100
3/3 [==============================] - 0s 24ms/step - loss: 0.2728 - accuracy: 0.8842 - val_loss: 0.3081 - val_accuracy: 0.8710
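Two numbers in this log can be checked with simple arithmetic: the "3/3" step counter and the coarse jumps in val_accuracy. The sketch below assumes 272 training rows (the shape printed earlier) and 31 held-out rows; the 31 is an inference from a 10% split, not a figure stated in the text. The helper names are hypothetical.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to cover all samples once."""
    return math.ceil(n_samples / batch_size)


def correct_count(reported_accuracy, n_samples):
    """Find how many correct predictions print as the reported accuracy."""
    for k in range(n_samples + 1):
        if f"{k / n_samples:.4f}" == reported_accuracy:
            return k
    return None


steps = steps_per_epoch(272, 128)
n_val = 31  # assumed size of the held-out set
hits_8710 = correct_count("0.8710", n_val)
hits_9032 = correct_count("0.9032", n_val)
print(steps, hits_8710, hits_9032)
```

Under these assumptions, the difference between 0.8710 and 0.9032 is a single sample, so small swings in val_accuracy from epoch to epoch should be read with caution.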

VI. Evaluate the Model

import matplotlib.pyplot as plt
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(14, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

scores = model.evaluate(X_test, y_test, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

accuracy: 87.10%
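Accuracy alone does not show which kind of error the model makes. As a complement, the sketch below derives a confusion matrix, precision, and recall from predicted probabilities using a 0.5 threshold. The labels and probabilities are made up for illustration and are not the model's outputs; the function name is hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p > threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Made-up labels and sigmoid outputs.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.3]

tp, fp, tn, fn = confusion_counts(y_true, y_prob)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, tn, fn, accuracy, precision, recall)
```

With real data, `y_prob` would come from `model.predict(X_test)`. In a medical setting, recall on the positive class is often the more important number, since a false negative means a missed case.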
