ML DT: Binary Classification on the Titanic Dataset with a Decision Tree (DT), Comparing Results With and Without Feature Selection (FS)

2021-10-30



Output

X_train after preliminary processing: (984, 474)

(0, 0) 31.19418104265403

(0, 78) 1.0

(0, 82) 1.0

(0, 366) 1.0

(0, 391) 1.0

(0, 435) 1.0

(0, 437) 1.0

(0, 473) 1.0

(1, 0) 31.19418104265403

(1, 73) 1.0

(1, 79) 1.0

(1, 296) 1.0

(1, 389) 1.0

(1, 397) 1.0

(1, 436) 1.0

(1, 446) 1.0

(2, 0) 31.19418104265403

(2, 78) 1.0

(2, 82) 1.0

(2, 366) 1.0

(2, 391) 1.0

(2, 435) 1.0

(2, 437) 1.0

(2, 473) 1.0

(3, 0) 32.0

: :

(980, 473) 1.0

(981, 0) 12.0

(981, 73) 1.0

(981, 81) 1.0

(981, 84) 1.0

(981, 390) 1.0

(981, 435) 1.0

(981, 436) 1.0

(981, 473) 1.0

(982, 0) 18.0

(982, 78) 1.0

(982, 81) 1.0

(982, 277) 1.0

(982, 390) 1.0

(982, 435) 1.0

(982, 437) 1.0

(982, 473) 1.0

(983, 0) 31.19418104265403

(983, 78) 1.0

(983, 82) 1.0

(983, 366) 1.0

(983, 391) 1.0

(983, 435) 1.0

(983, 436) 1.0

(983, 473) 1.0

X_train_fs after FS processing: (984, 94)

(0, 93) 1.0

(0, 85) 1.0

(0, 83) 1.0

(0, 76) 1.0

(0, 71) 1.0

(0, 27) 1.0

(0, 24) 1.0

(0, 0) 31.19418104265403

(1, 84) 1.0

(1, 74) 1.0

(1, 63) 1.0

(1, 25) 1.0

(1, 19) 1.0

(1, 0) 31.19418104265403

(2, 93) 1.0

(2, 85) 1.0

(2, 83) 1.0

(2, 76) 1.0

(2, 71) 1.0

(2, 27) 1.0

(2, 24) 1.0

(2, 0) 31.19418104265403

(3, 93) 1.0

(3, 85) 1.0

(3, 83) 1.0

: :

(980, 24) 1.0

(980, 0) 31.19418104265403

(981, 93) 1.0

(981, 84) 1.0

(981, 83) 1.0

(981, 75) 1.0

(981, 28) 1.0

(981, 26) 1.0

(981, 19) 1.0

(981, 0) 12.0

(982, 93) 1.0

(982, 85) 1.0

(982, 83) 1.0

(982, 75) 1.0

(982, 26) 1.0

(982, 24) 1.0

(982, 0) 18.0

(983, 93) 1.0

(983, 84) 1.0

(983, 83) 1.0

(983, 76) 1.0

(983, 71) 1.0

(983, 27) 1.0

(983, 24) 1.0

(983, 0) 31.19418104265403
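
The listings above are sparse-matrix printouts: each line is a (row, column) coordinate followed by the non-zero value stored there. As a rough illustration of how such a matrix can arise, the sketch below one-hot encodes a few made-up passenger records with `DictVectorizer`. The column names and values are hypothetical stand-ins, not the real Titanic data, and the original preprocessing code is not shown in this article. The repeated value 31.19418104265403 in column 0 would be consistent with missing ages being filled with the column mean, which is what the sketch assumes.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction import DictVectorizer

# Toy stand-in for a few Titanic-style rows (hypothetical values, not the real dataset).
df = pd.DataFrame({
    "age": [29.0, np.nan, 2.0, np.nan],
    "sex": ["female", "male", "female", "male"],
    "pclass": ["1st", "3rd", "1st", "2nd"],
})

# Assumption: missing ages are filled with the column mean,
# so the same float repeats in every row that had no age.
df["age"] = df["age"].fillna(df["age"].mean())

# DictVectorizer passes numeric fields through unchanged and
# one-hot encodes string fields as "name=value" columns.
vec = DictVectorizer()  # returns a SciPy sparse matrix by default
X = vec.fit_transform(df.to_dict(orient="records"))

print(X.shape)             # (n_samples, n_encoded_features)
print(vec.feature_names_)  # column index -> feature name
print(X)                   # coordinates and values of the non-zero entries
```

With many categorical columns, each distinct category becomes its own column, which is how a handful of raw fields can expand into hundreds of mostly-zero features.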

Design approach

Core code

class SelectPercentile Found at: sklearn.feature_selection.univariate_selection

class SelectPercentile(_BaseFilter):
    """Select features according to a percentile of the highest scores.

    Read more in the :ref:`User Guide <univariate_feature_selection>`.

    Parameters
    ----------
    score_func : callable
        Function taking two arrays X and y, and returning a pair of arrays
        (scores, pvalues) or a single array with scores.
        Default is f_classif (see below "See also"). The default function only
        works with classification tasks.

    percentile : int, optional, default=10
        Percent of features to keep.

    Attributes
    ----------
    scores_ : array-like, shape=(n_features,)
        Scores of features.

    pvalues_ : array-like, shape=(n_features,)
        p-values of feature scores, None if `score_func` returned only scores.

    Notes
    -----
    Ties between features with equal scores will be broken in an unspecified
    way.
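
To make the comparison named in the title concrete, here is a minimal sketch that trains one decision tree on all features and another on the features kept by `SelectPercentile`. It runs on synthetic binary data rather than the Titanic set. The `chi2` score function, `percentile=20`, the entropy criterion, and the random seeds are all assumptions chosen for illustration, since the article's own training code is not shown. The drop from 474 to 94 columns in the output above would be consistent with keeping roughly 20% of the features, but the percentile actually used is not stated.

```python
import numpy as np
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic non-negative data (chi2 requires non-negative features):
# 50 binary columns, of which only the first 3 determine the label.
n_samples, n_features = 400, 50
X = rng.integers(0, 2, size=(n_samples, n_features)).astype(float)
y = ((X[:, 0] + X[:, 1] + X[:, 2]) >= 2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=33
)

# Baseline: decision tree trained on every feature.
dt = DecisionTreeClassifier(criterion="entropy", random_state=0)
dt.fit(X_train, y_train)
acc_all = dt.score(X_test, y_test)

# Feature selection: keep the top 20% of features ranked by chi2 score.
# The selector is fit on the training split only, then applied to the test split.
fs = SelectPercentile(chi2, percentile=20)
X_train_fs = fs.fit_transform(X_train, y_train)
X_test_fs = fs.transform(X_test)

# Same model, trained on the reduced feature set.
dt_fs = DecisionTreeClassifier(criterion="entropy", random_state=0)
dt_fs.fit(X_train_fs, y_train)
acc_fs = dt_fs.score(X_test_fs, y_test)

print("all features:", X_train.shape, "accuracy:", acc_all)
print("after FS:    ", X_train_fs.shape, "accuracy:", acc_fs)
```

Fitting the selector on the training split alone keeps test labels out of the feature ranking. In practice the percentile is usually tuned, for example with cross-validation over a range of values, rather than fixed in advance.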
