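ML with RF: Using a Pipeline (client age/job/marital status/education/default/balance/housing, etc.) for Binary Classification of Whether a Client Buys the Bank's Product (Prediction, Inference) - Alibaba Cloud Developer Community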


2021-11-06


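Data description

The dataset comes from the marketing campaigns of a Portuguese banking institution. The campaigns were conducted mainly by phone: the bank's agents contacted each client at least once to find out whether the client was willing to buy the bank's product (a term deposit). The goal is to predict whether a client will buy the product.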

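| No. | Field | Type | Description |
|---|---|---|---|
| 1 | ID | Int | Unique client identifier |
| 2 | age | Int | Client age |
| 3 | job | String | Client occupation |
| 4 | marital | String | Marital status |
| 5 | education | String | Education level |
| 6 | default | String | Whether the client has a default record |
| 7 | balance | Int | Average yearly account balance |
| 8 | housing | String | Whether the client has a housing loan |
| 9 | loan | String | Whether the client has a personal loan |
| 10 | contact | String | Communication channel used to contact the client |
| 11 | day | Int | Day of the month of the last contact |
| 12 | month | String | Month of the last contact |
| 13 | duration | Int | Duration of the last contact |
| 14 | campaign | Int | Number of contacts with this client during the current campaign |
| 15 | pdays | Int | Time elapsed since the client was last contacted in the previous campaign (999 means never contacted) |
| 16 | previous | Int | Number of contacts with this client before the current campaign |
| 17 | poutcome | String | Outcome of the previous campaign |
| 18 | y | Int | Target: whether the client subscribes to a term deposit |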

Data reference: [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014.

Output results

Viewing the data distribution

Analyzing the data

     #   Column     Non-Null Count  Dtype
    ---  ------     --------------  -----
     0   ID         25317 non-null  int64
     1   age        25317 non-null  int64
     2   job        25317 non-null  object
     3   marital    25317 non-null  object
     4   education  25317 non-null  object
     5   default    25317 non-null  object
     6   balance    25317 non-null  int64
     7   housing    25317 non-null  object
     8   loan       25317 non-null  object
     9   contact    25317 non-null  object
     10  day        25317 non-null  int64
     11  month      25317 non-null  object
     12  duration   25317 non-null  int64
     13  campaign   25317 non-null  int64
     14  pdays      25317 non-null  int64
     15  previous   25317 non-null  int64
     16  poutcome   25317 non-null  object
     17  y          25317 non-null  int64
    dtypes: int64(9), object(9)
    memory usage: 3.5+ MB
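The column summary above has the shape of what pandas prints from `DataFrame.info()`. The following is a minimal sketch of producing such a summary, assuming the data is held in a pandas DataFrame. The three-row `sample` frame and its values are made up for illustration, since the original file path and loading code are not shown.

```python
import io

import pandas as pd

# Hypothetical three-row sample using a subset of the documented columns;
# the real training set has 25317 rows and 18 columns.
sample = pd.DataFrame({
    "ID": [1, 2, 3],
    "age": [43, 42, 47],
    "job": ["management", "technician", "admin."],
    "balance": [291, 5076, 104],
    "y": [0, 0, 1],
})

# info() writes to stdout by default; passing buf captures the text instead.
buf = io.StringIO()
sample.info(buf=buf)
summary = buf.getvalue()
print(summary)
```

Because every column reports the same non-null count as the row count, the summary doubles as a quick check that no field has missing values.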

Correlation coefficients with y, computed on the training set:

    y           1.000000
    ID          0.556627
    duration    0.394746
    pdays       0.107565
    previous    0.088337
    campaign    0.075173
    balance     0.057564
    day         0.031886
    age         0.029916

Proportion of y = 1 labels in the training set: 0.11695698542481336
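A listing like the one above can be obtained from the numeric columns of a DataFrame. The sketch below is one way to do it, not necessarily the author's code: the `train` frame and its values are hypothetical, and whether the original took absolute values before sorting is not stated (the listing contains only non-negative numbers), so plain Pearson correlations are used here.

```python
import pandas as pd

# Hypothetical toy frame standing in for the training set.
train = pd.DataFrame({
    "duration": [100, 250, 900, 60, 700, 120],
    "campaign": [1, 2, 1, 5, 1, 3],
    "job": ["admin.", "student", "retired", "services", "student", "admin."],
    "y": [0, 0, 1, 0, 1, 0],
})

# Correlation is only defined for numeric columns, so drop the string ones.
numeric = train.select_dtypes(include="number")

# Pearson correlation of every numeric column with the target, largest first.
corr_with_y = numeric.corr()["y"].sort_values(ascending=False)

# With a 0/1 target, the mean is the share of positive labels.
positive_rate = train["y"].mean()

print(corr_with_y)
print("share of y == 1:", positive_rate)
```

A positive share near 0.117, as reported for the real data, means the classes are imbalanced, which is worth keeping in mind when choosing an evaluation metric.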

Category levels of each categorical field, checked in turn for the training and test sets:

    job ['admin.', 'blue-collar', 'entrepreneur', 'housemaid', 'management', 'retired', 'self-employed', 'services', 'student', 'technician', 'unemployed', 'unknown']
    marital ['divorced', 'married', 'single']
    education ['primary', 'secondary', 'tertiary', 'unknown']
    default ['no', 'yes']
    housing ['no', 'yes']
    loan ['no', 'yes']
    contact ['cellular', 'telephone', 'unknown']
    month ['apr', 'aug', 'dec', 'feb', 'jan', 'jul', 'jun', 'mar', 'may', 'nov', 'oct', 'sep']
    poutcome ['failure', 'other', 'success', 'unknown']
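Listing the levels of each categorical field in both splits is a useful check before encoding, because a level that appears only in the test set would have no column after one-hot encoding fitted on the training set. A minimal sketch under assumed names follows: `train`, `test`, and `categorical_cols` are hypothetical stand-ins with invented rows.

```python
import pandas as pd

# Hypothetical miniature train/test frames with two categorical fields.
train = pd.DataFrame({
    "marital": ["married", "single", "divorced", "married"],
    "housing": ["yes", "no", "yes", "yes"],
    "age": [43, 29, 51, 38],
})
test = pd.DataFrame({
    "marital": ["single", "married"],
    "housing": ["no", "no"],
    "age": [33, 60],
})

categorical_cols = ["marital", "housing"]  # chosen by hand for this sketch

# Collect the sorted distinct values of each categorical column per split.
levels = {}
for name, frame in (("train", train), ("test", test)):
    levels[name] = {col: sorted(frame[col].unique()) for col in categorical_cols}
    for col, values in levels[name].items():
        print(name, col, values)

# Levels present in the test split but never seen in training.
unseen = {
    col: sorted(set(levels["test"][col]) - set(levels["train"][col]))
    for col in categorical_cols
}
print("unseen in train:", unseen)
```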

Training output

    Fitting 7 folds for each of 32 candidates, totalling 224 fits
    [Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50 ..........
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 31.1s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50 ..........
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 31.0s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50 ..........
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 31.7s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50 ..........
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 32.2s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=100 .........
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 27.1s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=100 .........
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 27.1s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=50, total= 26.6s
    [CV] forst_reg__max_features=45, forst_reg__n_estimators=100 .........
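The parameter names in the log (`forst_reg__max_features`, `forst_reg__n_estimators`) follow scikit-learn's `<step name>__<parameter>` convention, which suggests a grid search over a `Pipeline` whose final step is named `forst_reg`. The sketch below shows one way such a search could be set up. It is an illustration under assumptions, not the author's code: the synthetic data, the one-hot preprocessing step, the choice of `RandomForestClassifier`, and the small four-candidate grid are all invented here, since the original grid of 32 candidates and the pipeline definition are not shown.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (assumption): two numeric and two categorical fields.
rng = np.random.default_rng(0)
n = 140
X = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "duration": rng.integers(0, 2000, n),
    "marital": rng.choice(["divorced", "married", "single"], n),
    "housing": rng.choice(["no", "yes"], n),
})
# Toy label tied to call duration so the search has a signal to find.
y = (X["duration"] > 1200).astype(int)

# One-hot encode the categorical columns, pass numeric columns through.
preprocess = ColumnTransformer(
    transformers=[
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["marital", "housing"]),
    ],
    remainder="passthrough",
)

# The final step is named "forst_reg" to match the prefix seen in the log.
pipe = Pipeline([
    ("preprocess", preprocess),
    ("forst_reg", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid: 2 x 2 = 4 candidates (the original searched 32).
param_grid = {
    "forst_reg__max_features": [2, 4],
    "forst_reg__n_estimators": [10, 20],
}

# cv=7 matches the "7 folds" in the log; verbose=2 prints per-fit [CV] lines.
# n_jobs=1 keeps the sketch simple; the log indicates n_jobs=-1 was used.
search = GridSearchCV(pipe, param_grid, cv=7, verbose=2, n_jobs=1)
search.fit(X, y)

print(search.best_params_)
predictions = search.predict(X)
```

The number of fits is the number of candidates times the number of folds, which is how 32 candidates and 7 folds give the 224 fits reported in the log. Wrapping preprocessing and the model in a single pipeline means the encoder is refitted inside each fold, so validation folds do not leak into the encoding.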
