sklearn: An Introduction to and Detailed Usage Guide for SelectFromModel in sklearn.feature_selection (Part 2) - Alibaba Cloud Developer Community


2021-11-06 507


Summary: an introduction to the SelectFromModel class in sklearn.feature_selection and a detailed guide to using it.

2. L1-based feature selection

>>> from sklearn.svm import LinearSVC
>>> from sklearn.datasets import load_iris
>>> from sklearn.feature_selection import SelectFromModel
>>> X, y = load_iris(return_X_y=True)
>>> X.shape
(150, 4)
>>> lsvc = LinearSVC(C=0.01, penalty="l1", dual=False).fit(X, y)
>>> model = SelectFromModel(lsvc, prefit=True)
>>> X_new = model.transform(X)
>>> X_new.shape
(150, 3)
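The session above passes an already fitted estimator with prefit=True. As a minimal sketch of the other route, the snippet below hands SelectFromModel an unfitted LinearSVC and lets fit() clone and train it, then uses get_support() to see which columns survive. The variable names (selector, mask, kept) are illustrative only, and which columns are kept may vary with the scikit-learn version, so no particular output is claimed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# Pass an unfitted estimator; SelectFromModel clones it and fits the clone in fit().
selector = SelectFromModel(LinearSVC(C=0.01, penalty="l1", dual=False))
selector.fit(X, y)

mask = selector.get_support()               # boolean mask, one entry per input feature
kept = selector.get_support(indices=True)   # column indices of the kept features
X_new = selector.transform(X)               # same rows, only the kept columns

print(mask)
print(kept)
print(X_new.shape)
```

Because the selector is fitted through fit() rather than built around a prefit model, it can also be placed inside a Pipeline and cloned by cross-validation utilities.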

3. Tree-based feature selection

>>> from sklearn.ensemble import ExtraTreesClassifier
>>> from sklearn.datasets import load_iris
>>> from sklearn.feature_selection import SelectFromModel
>>> X, y = load_iris(return_X_y=True)
>>> X.shape
(150, 4)
>>> clf = ExtraTreesClassifier(n_estimators=50)
>>> clf = clf.fit(X, y)
>>> clf.feature_importances_
array([ 0.04..., 0.05..., 0.4..., 0.4...])
>>> model = SelectFromModel(clf, prefit=True)
>>> X_new = model.transform(X)
>>> X_new.shape
(150, 2)
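Why two features remain can be checked by hand. The sketch below assumes the default threshold (None) and an estimator without an L1 penalty, in which case the cut-off is the mean feature importance. The random_state value is an addition made only so the run is repeatable; it is not part of the original session.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_iris(return_X_y=True)

# random_state is fixed only to make this sketch repeatable.
selector = SelectFromModel(ExtraTreesClassifier(n_estimators=50, random_state=0))
selector.fit(X, y)

importances = selector.estimator_.feature_importances_

# With no explicit threshold and no L1 penalty, the mean importance is the cut-off.
print(selector.threshold_, importances.mean())

# Features whose importance is greater than or equal to the cut-off are kept.
manual_mask = importances >= importances.mean()
print(manual_mask)
print(selector.get_support())
```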

Using the SelectFromModel class

1. SelectFromModel source code

class SelectFromModel Found at: sklearn.feature_selection.from_model

class SelectFromModel(BaseEstimator, SelectorMixin, MetaEstimatorMixin):
    """Meta-transformer for selecting features based on importance weights.

    .. versionadded:: 0.17

    Parameters
    ----------
    estimator : object
        The base estimator from which the transformer is built.
        This can be both a fitted (if ``prefit`` is set to True)
        or a non-fitted estimator. The estimator must have either a
        ``feature_importances_`` or ``coef_`` attribute after fitting.

    threshold : string, float, optional default None
        The threshold value to use for feature selection. Features whose
        importance is greater or equal are kept while the others are
        discarded. If "median" (resp. "mean"), then the ``threshold`` value is
        the median (resp. the mean) of the feature importances. A scaling
        factor (e.g., "1.25*mean") may also be used. If None and if the
        estimator has a parameter penalty set to l1, either explicitly
        or implicitly (e.g, Lasso), the threshold used is 1e-5.
        Otherwise, "mean" is used by default.

    prefit : bool, default False
        Whether a prefit model is expected to be passed into the constructor
        directly or not. If True, ``transform`` must be called directly
        and SelectFromModel cannot be used with ``cross_val_score``,
        ``GridSearchCV`` and similar utilities that clone the estimator.
        Otherwise train the model using ``fit`` and then ``transform`` to do
        feature selection.

    norm_order : non-zero int, inf, -inf, default 1
        Order of the norm used to filter the vectors of coefficients below
        ``threshold`` in the case where the ``coef_`` attribute of the
        estimator is of dimension 2.

    Attributes
    ----------
    estimator_ : an estimator
        The base estimator from which the transformer is built.
        This is stored only when a non-fitted estimator is passed to the
        ``SelectFromModel``, i.e when prefit is False.

    threshold_ : float
        The threshold value used for feature selection.
    """

    def __init__(self, estimator, threshold=None, prefit=False,
                 norm_order=1):
        self.estimator = estimator
        self.threshold = threshold
        self.prefit = prefit
        self.norm_order = norm_order

    def _get_support_mask(self):
        # SelectFromModel can directly call on transform.
        if self.prefit:
            estimator = self.estimator
        elif hasattr(self, 'estimator_'):
            estimator = self.estimator_
        else:
            raise ValueError(
                'Either fit SelectFromModel before transform or set "prefit='
                'True" and pass a fitted estimator to the constructor.')
        scores = _get_feature_importances(estimator, self.norm_order)
        threshold = _calculate_threshold(estimator, scores, self.threshold)
        return scores >= threshold

    def fit(self, X, y=None, **fit_params):
        """Fit the SelectFromModel meta-transformer.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The training input samples.

        y : array-like, shape (n_samples,)
            The target values (integers that correspond to classes in
            classification, real numbers in regression).

        **fit_params : Other estimator specific parameters

        Returns
        -------
        self : object
            Returns self.
        """
        if self.prefit:
            raise NotFittedError(
                "Since 'prefit=True', call transform directly")
        self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, y, **fit_params)
        return self

    @property
    def threshold_(self):
        scores = _get_feature_importances(self.estimator_, self.norm_order)
        return _calculate_threshold(self.estimator, scores, self.threshold)

    @if_delegate_has_method('estimator')
    def partial_fit(self, X, y=None, **fit_params):
        """Fit the SelectFromModel meta-transformer only once.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The training input samples.

        y : array-like, shape (n_samples,)
            The target values (integers that correspond to classes in
            classification, real numbers in regression).

        **fit_params : Other estimator specific parameters

        Returns
        -------
        self : object
            Returns self.
        """
        if self.prefit:
            raise NotFittedError(
                "Since 'prefit=True', call transform directly")
        if not hasattr(self, "estimator_"):
            self.estimator_ = clone(self.estimator)
        self.estimator_.partial_fit(X, y, **fit_params)
        return self
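The threshold and norm_order parameters documented in the listing above can be tried out directly. The sketch below is illustrative rather than authoritative. Part 1 uses a scaled string threshold. Part 2 reproduces the selection mask for a two-dimensional coef_ by hand; it assumes that the helper _get_feature_importances, whose body is not shown in the listing, takes the norm of each coefficient column (axis 0) with order norm_order, and it uses the documented 1e-5 default for an L1-penalized estimator. The random_state value and all variable names are additions for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)

# Part 1: a scaled string threshold is resolved against the fitted importances.
tree_selector = SelectFromModel(
    ExtraTreesClassifier(n_estimators=50, random_state=0),  # seed only for repeatability
    threshold="1.25*mean",
)
tree_selector.fit(X, y)
importances = tree_selector.estimator_.feature_importances_
print(tree_selector.threshold_, 1.25 * importances.mean())

# Part 2: a multiclass linear model has a 2-D coef_, so each feature's score is
# a norm over its column of per-class coefficients.
svc_selector = SelectFromModel(
    LinearSVC(C=0.01, penalty="l1", dual=False),
    norm_order=1,
)
svc_selector.fit(X, y)
coef = svc_selector.estimator_.coef_           # shape (n_classes, n_features)
scores = np.linalg.norm(coef, axis=0, ord=1)   # assumed behaviour of the unshown helper
manual_mask = scores >= 1e-5                   # documented default for an L1 penalty
print(manual_mask)
print(svc_selector.get_support())
```

Passing "median" or a plain float as threshold works the same way; only the cut-off value that the scores are compared against changes.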
