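Contents
Binary classification on the diabetes dataset (8 features → 1 target) with seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network)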
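Related articles
ML classification: understanding machine learning classification through a model-evaluation case study that applies seven algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) to binary classification on the diabetes dataset (8→1)
ML classification: understanding machine learning classification through a model-evaluation case study that applies seven algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network) to binary classification on the diabetes dataset (8→1), implementation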
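Binary classification on the diabetes dataset (8 features → 1 target) with seven machine learning algorithms (kNN, logistic regression, SVM, decision tree, random forest, boosted trees, neural network)
Understanding the dataset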
data.shape: (768, 9)
data.columns:
Index(['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
       'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome'],
      dtype='object')
data.head:
   Pregnancies  Glucose  BloodPressure  ...  DiabetesPedigreeFunction  Age  Outcome
0            6      148             72  ...                     0.627   50        1
1            1       85             66  ...                     0.351   31        0
2            8      183             64  ...                     0.672   32        1
3            1       89             66  ...                     0.167   21        0
4            0      137             40  ...                     2.288   33        1

[5 rows x 9 columns]
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Pregnancies               768 non-null    int64
 1   Glucose                   768 non-null    int64
 2   BloodPressure             768 non-null    int64
 3   SkinThickness             768 non-null    int64
 4   Insulin                   768 non-null    int64
 5   BMI                       768 non-null    float64
 6   DiabetesPedigreeFunction  768 non-null    float64
 7   Age                       768 non-null    int64
 8   Outcome                   768 non-null    int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB
data.info:
None
8
data_column_X: ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']
['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age']
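The article does not show the code behind the inspection output above. The following is a minimal sketch of how such a summary could be produced with pandas. The file path in the comment is hypothetical. The stand-in frame contains only the five rows and the six columns that are visible in the truncated data.head printout, so its shape differs from the real (768, 9) dataset.

```python
import pandas as pd

# With the real file one would load it like this (hypothetical path):
# data = pd.read_csv("diabetes.csv")

# Stand-in frame (assumption): only the rows and columns visible in the
# truncated head printout; the elided columns are deliberately left out.
data = pd.DataFrame({
    "Pregnancies": [6, 1, 8, 1, 0],
    "Glucose": [148, 85, 183, 89, 137],
    "BloodPressure": [72, 66, 64, 66, 40],
    "DiabetesPedigreeFunction": [0.627, 0.351, 0.672, 0.167, 2.288],
    "Age": [50, 31, 32, 21, 33],
    "Outcome": [1, 0, 1, 0, 1],
})

print("data.shape:", data.shape)
print("data.columns:\n", data.columns)
print("data.head:\n", data.head())
data.info()  # prints the column summary itself and returns None

# Every column except the label is treated as an input feature.
target_column = "Outcome"
feature_columns = [c for c in data.columns if c != target_column]
print(len(feature_columns))
print("data_column_X:", feature_columns)

X = data[feature_columns]  # feature matrix
y = data[target_column]    # binary target (1 = diabetic, 0 = not)
```

Because DataFrame.info() writes its summary directly and returns None, wrapping it in print() shows the summary first and then a literal None, which would explain the "data.info: None" lines in the output.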
1、kNN
kNNC(n_neighbors=9):Training set accuracy: 0.792
kNNC(n_neighbors=9):Test set accuracy: 0.776
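As a rough illustration of how accuracies like these can be obtained, here is a sketch that assumes scikit-learn (the article does not name its library). It uses synthetic stand-in data with the same shape as the diabetes set (768 samples, 8 features), so the printed numbers will not match those above. The split settings and the range of neighbour counts are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): 768 samples, 8 features, binary target.
# Replace X, y with the real feature matrix and the Outcome column.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Fit one classifier per neighbour count and record both accuracies.
knn_scores = {}
for k in range(1, 11):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    knn_scores[k] = (model.score(X_train, y_train),
                     model.score(X_test, y_test))

train_acc, test_acc = knn_scores[9]
print(f"kNNC(n_neighbors=9):Training set accuracy: {train_acc:.3f}")
print(f"kNNC(n_neighbors=9):Test set accuracy: {test_acc:.3f}")
```

Comparing the entries of knn_scores shows how the gap between training and test accuracy changes with the neighbour count.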
2、Logistic regression
LoR(c_regular=1):Training set accuracy: 0.785
LoR(c_regular=1):Test set accuracy: 0.771
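The label c_regular=1 presumably refers to the inverse regularization strength C. A sketch under that assumption, using scikit-learn and synthetic stand-in data (the numbers will differ from those above); the extra C values are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): 768 samples, 8 features, binary target.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Smaller C means stronger regularization; C=1 is the library default.
lr_scores = {}
for C in (0.01, 1, 100):
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    lr_scores[C] = (model.score(X_train, y_train),
                    model.score(X_test, y_test))
    print(f"LoR(C={C}):Training set accuracy: {lr_scores[C][0]:.3f}")
    print(f"LoR(C={C}):Test set accuracy: {lr_scores[C][1]:.3f}")
```

max_iter is raised here only so that the solver has room to converge on unscaled features; it is not a setting taken from the article.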
3、SVM
SVMC_Init:Training set accuracy: 0.769
SVMC_Init:Test set accuracy: 0.755
SVMC_Best(max_dept=1,learning_rate=0.1):Training set accuracy: 0.788
SVMC_Best(max_dept=1,learning_rate=0.1):Test set accuracy: 0.781
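The settings behind SVMC_Best are not recoverable from the output: the printed label mentions max_dept and learning_rate, which are not parameters of scikit-learn's SVC. The sketch below therefore only illustrates one common way to improve on a default SVM, namely scaling the features and adjusting C. The scaler, the value C=10, and the synthetic stand-in data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 768 samples, 8 features, binary target.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Baseline: default RBF-kernel SVM on the raw features.
svm_init = SVC().fit(X_train, y_train)

# Variant: rescale every feature to [0, 1] before the SVM.
# The scaler is fitted on the training split only, inside the pipeline.
svm_scaled = make_pipeline(MinMaxScaler(), SVC(C=10)).fit(X_train, y_train)

svm_scores = {
    "SVMC_Init": (svm_init.score(X_train, y_train),
                  svm_init.score(X_test, y_test)),
    "SVMC_Scaled": (svm_scaled.score(X_train, y_train),
                    svm_scaled.score(X_test, y_test)),
}
for name, (train_acc, test_acc) in svm_scores.items():
    print(f"{name}:Training set accuracy: {train_acc:.3f}")
    print(f"{name}:Test set accuracy: {test_acc:.3f}")
```

Kernel SVMs are sensitive to feature scale, which matters for the real dataset because columns such as Insulin and DiabetesPedigreeFunction span very different ranges.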
4、Decision tree
DTC(max_dept=3):Training set accuracy: 0.773
DTC(max_dept=3):Test set accuracy: 0.740
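A depth-limited tree like the one reported above could be fitted as follows. This is a sketch assuming scikit-learn's DecisionTreeClassifier and synthetic stand-in data, so the accuracies will differ; printing the feature importances is an addition for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): 768 samples, 8 features, binary target.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Limiting the depth to 3 keeps the tree small and curbs overfitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

dt_train_acc = tree.score(X_train, y_train)
dt_test_acc = tree.score(X_test, y_test)
print(f"DTC(max_depth=3):Training set accuracy: {dt_train_acc:.3f}")
print(f"DTC(max_depth=3):Test set accuracy: {dt_test_acc:.3f}")

# Relative contribution of each feature to the splits (sums to 1).
importances = tree.feature_importances_
for index, importance in enumerate(importances):
    print(f"feature {index}: {importance:.3f}")
```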
5、Random forest
RFC_Best:Training set accuracy: 0.764
RFC_Best:Test set accuracy: 0.750
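The output does not say which hyperparameters RFC_Best used. The sketch below assumes scikit-learn's RandomForestClassifier with 100 trees and a hypothetical depth limit of 3, fitted on synthetic stand-in data, so the numbers will not match.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): 768 samples, 8 features, binary target.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# An ensemble of shallow trees, each trained on a bootstrap sample.
# n_estimators and max_depth are illustrative, not the article's values.
forest = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)
forest.fit(X_train, y_train)

rf_train_acc = forest.score(X_train, y_train)
rf_test_acc = forest.score(X_test, y_test)
print(f"RFC:Training set accuracy: {rf_train_acc:.3f}")
print(f"RFC:Test set accuracy: {rf_test_acc:.3f}")
```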
6、Boosted trees
GBC(max_dept=1,learning_rate=0.1):Training set accuracy: 0.804
GBC(max_dept=1,learning_rate=0.1):Test set accuracy: 0.781
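The label suggests gradient boosting with depth-1 trees (decision stumps) and a learning rate of 0.1. A sketch assuming scikit-learn's GradientBoostingClassifier with those two settings and the library default of 100 boosting stages, on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): 768 samples, 8 features, binary target.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Each stage adds one stump; learning_rate shrinks its contribution.
gbc = GradientBoostingClassifier(max_depth=1, learning_rate=0.1,
                                 random_state=0)
gbc.fit(X_train, y_train)

gbc_train_acc = gbc.score(X_train, y_train)
gbc_test_acc = gbc.score(X_test, y_test)
print(f"GBC(max_depth=1,learning_rate=0.1):Training set accuracy: {gbc_train_acc:.3f}")
print(f"GBC(max_depth=1,learning_rate=0.1):Test set accuracy: {gbc_test_acc:.3f}")
```

Shallow trees combined with a small learning rate act as regularization, which is one plausible reason for choosing these values.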
7、Neural network
MLPC_Init:Training set accuracy: 0.743
MLPC_Init:Test set accuracy: 0.672
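Only an untuned network is reported above. Multilayer perceptrons usually benefit from standardized inputs, so the sketch below fits a default network and a standardized variant side by side. It assumes scikit-learn's MLPClassifier; the scaler, max_iter, and the synthetic stand-in data are assumptions, and no particular improvement is implied for the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 768 samples, 8 features, binary target.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Baseline network on the raw features.
mlp_init = MLPClassifier(random_state=0, max_iter=1000)
mlp_init.fit(X_train, y_train)

# Same network, with zero-mean / unit-variance inputs.
# The scaler is fitted on the training split only, inside the pipeline.
mlp_scaled = make_pipeline(
    StandardScaler(), MLPClassifier(random_state=0, max_iter=1000))
mlp_scaled.fit(X_train, y_train)

mlp_scores = {
    "MLPC_Init": (mlp_init.score(X_train, y_train),
                  mlp_init.score(X_test, y_test)),
    "MLPC_Scaled": (mlp_scaled.score(X_train, y_train),
                    mlp_scaled.score(X_test, y_test)),
}
for name, (train_acc, test_acc) in mlp_scores.items():
    print(f"{name}:Training set accuracy: {train_acc:.3f}")
    print(f"{name}:Test set accuracy: {test_acc:.3f}")
```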