Dropping a set of features from df throws "The truth value of a Series is ambiguous"

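I have a dataset with a large number of features. I filtered the features and stored the names of the selected ones in 4 arrays. I want to drop the features that were not selected.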

df = pd.read_excel("Anonymizeddataset.xlsx")
df = df.fillna(0)

# 4 arrays
features_selected_with_nan_value
KBest_select_feature
features_selected_with_mean_value
laso_selected_features

def drop_features(features):
    for index, row in df.iterrows():
        for i in range(len(features)):
            if row != features[i]:
                df_with_selected_features = df.drop([row], axis = 1, inplace = True)
    return df_with_selected_features

But it throws this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Dataset

    Target  Predictor 1 Predictor 2 Predictor 3 Predictor 4 Predictor 5 Predictor 6 Predictor 7 Predictor 8 Predictor 9 ... Predictor 1065  Predictor 1066  Predictor 1067  Predictor 1068  Predictor 1069  Predictor 1070  Predictor 1071  Predictor 1072  Predictor 1073  Predictor 1074
0   5704.7  98.013498   98.380881   66.012913   21.447560   0.0 0.0 0.0 57.549196   12  ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1   3200.0  51.224883   98.380881   70.885204   21.447560   0.0 0.0 0.0 57.549196   13  ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2   6487.9  44.563802   98.380881   85.757141   21.447560   0.0 0.0 0.0 57.549196   13  ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3   1278.3  65.039616   98.380881   18.380713   87.745614   0.0 0.0 0.0 57.549196   13  ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4   1368.5  1.905928    98.380881   96.797313   87.745614   0.0 0.0 0.0 57.549196   13  ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 rows × 1075 columns

The features_selected_with_nan_value array

['Predictor 387', 'Predictor 381', 'Predictor 383', 'Predictor 376', 'Predictor 28', 'Predictor 35', 'Predictor 4', 'Predictor 37', 'Predictor 34', 'Predictor 19', 'Predictor 16', 'Predictor 17', 'Predictor 25', 'Predictor 880', 'Predictor 856', 'Predictor 849', 'Predictor 851', 'Predictor 852', 'Predictor 857', 'Predictor 853', 'Predictor 855', 'Predictor 850', 'Predictor 854', 'Predictor 40', 'Predictor 881', 'Predictor 882', 'Predictor 883', 'Predictor 884', 'Predictor 1015', 'Predictor 487', 'Predictor 738', 'Predictor 476', 'Predictor 473', 'Predictor 749', 'Predictor 604', 'Predictor 607', 'Predictor 618', 'Predictor 848', 'Predictor 1014', 'Predictor 1007', 'Predictor 1012', 'Predictor 979', 'Predictor 344', 'Predictor 345', 'Predictor 356', 'Predictor 392', 'Predictor 858', 'Predictor 859', 'Predictor 860', 'Predictor 861', 'Predictor 879', 'Predictor 862', 'Predictor 863', 'Predictor 980', 'Predictor 864', 'Predictor 878', 'Predictor 865', 'Predictor 877', 'Predictor 866', 'Predictor 867', 'Predictor 869', 'Predictor 870', 'Predictor 871', 'Predictor 872', 'Predictor 873', 'Predictor 874', 'Predictor 876', 'Predictor 735', 'Predictor 981', 'Predictor 982', 'Predictor 983', 'Predictor 1011', 'Predictor 1010', 'Predictor 1009', 'Predictor 1008', 'Predictor 875', 'Predictor 1006', 'Predictor 1005', 'Predictor 1004', 'Predictor 1003', 'Predictor 1002', 'Predictor 1001', 'Predictor 1000', 'Predictor 342', 'Predictor 998', 'Predictor 997', 'Predictor 996', 'Predictor 995', 'Predictor 994', 'Predictor 992', 'Predictor 991', 'Predictor 990', 'Predictor 989', 'Predictor 988', 'Predictor 987', 'Predictor 986', 'Predictor 985', 'Predictor 984', 'Predictor 1013', 'Predictor 993']

What am I doing wrong? Source: StackOverflow, /questions/59380060/drop-set-of-features-from-df-throw-an-error-of-the-truth-value-of-a-series-is-am

kun坤 2019-12-29 21:31:26 377 0
1 answer
  • If I understand your question correctly, you can do this:

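    import pandas as pd

    df = pd.read_excel("Anonymizeddataset.xlsx")
    df = df.fillna(0)

    # Combine the four lists of selected feature names (parentheses allow the line break)
    list_columns = (features_selected_with_nan_value + KBest_select_feature +
                    features_selected_with_mean_value + laso_selected_features)

    # Keep only the selected columns
    df = df[list_columns]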
    
    2019-12-29 21:31:34
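
As a follow-up to the answer above, here is a minimal sketch of why the error appears and of one way to keep only the selected columns. In the question's loop, `row` is an entire row (a Series), so `row != features[i]` produces a Series of booleans, and `if` cannot reduce that to a single True/False. The helper name `keep_selected_features`, its `always_keep` parameter, and the small sample DataFrame are assumptions made for illustration; they are not part of the original question or answer. Unlike plain list concatenation, this version also removes duplicate names, ignores names that are not in the DataFrame, and retains the `Target` column.

```python
import pandas as pd


def keep_selected_features(df, feature_lists, always_keep=("Target",)):
    """Return a copy of df restricted to the selected feature columns.

    feature_lists: iterable of lists of column names.
    always_keep: columns retained regardless of selection (an assumption).
    """
    wanted = set(always_keep)
    for names in feature_lists:
        wanted.update(names)
    # Preserve the original column order and skip names missing from df
    columns = [col for col in df.columns if col in wanted]
    return df[columns].copy()


# Tiny stand-in for the real dataset (made-up values)
df = pd.DataFrame({
    "Target": [5704.7, 3200.0],
    "Predictor 1": [98.0, 51.2],
    "Predictor 2": [98.4, 98.4],
    "Predictor 3": [66.0, 70.9],
})
selected_a = ["Predictor 3", "Predictor 1"]
selected_b = ["Predictor 1", "Predictor 999"]  # overlaps, and one name is absent

# Comparing a whole row (a Series) with a string is element-wise and
# yields a Series of booleans; using it in `if` raises the ValueError.
first_row = df.iloc[0]
comparison = first_row != "Predictor 1"

result = keep_selected_features(df, [selected_a, selected_b])
print(list(result.columns))  # ['Target', 'Predictor 1', 'Predictor 3']
```

With the question's variables, the call would presumably look like `keep_selected_features(df, [features_selected_with_nan_value, KBest_select_feature, features_selected_with_mean_value, laso_selected_features])`. Selecting columns by name avoids iterating over rows entirely, which is what the original function was doing by mistake.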