Contents
Introduction to Standardization & Scaling and Normalization
1. Standardization, or mean removal and variance scaling
1.1 Scaling features to a range
1.2 Scaling sparse data
1.3 Scaling data with outliers
1.4 Scaling vs Whitening
1.5 Centering kernel matrices
2. Normalization
Introduction to Standardization & Scaling and Normalization
Reference: https://scikit-learn.org/stable/modules/preprocessing.html
1. Standardization, or mean removal and variance scaling
from sklearn import preprocessing
import numpy as np

X_train = np.array([[ 1., -1.,  2.],
                    [ 2.,  0.,  0.],
                    [ 0.,  1., -1.]])
X_scaled = preprocessing.scale(X_train)
print(X_scaled)

# Scaled data has zero mean and unit variance:
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
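# Fit the scaler on the training data so the same transformation can be reapplied to new data
scaler = preprocessing.StandardScaler().fit(X_train)
print(scaler)
print(scaler.mean_)
print(scaler.scale_)
print(scaler.transform(X_train))

X_test = [[-1., 1., 0.]]
print(scaler.transform(X_test))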
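To make the arithmetic behind the scaler explicit, here is a minimal plain-Python sketch of the same idea: learn per-column means and population standard deviations from the training rows, then reuse them on new rows. The helper names `fit_standardizer` and `standardize` are hypothetical and are not part of scikit-learn; replacing a zero standard deviation with 1.0 is an assumption made here to avoid division by zero.

```python
import math

def fit_standardizer(rows):
    """Return per-column means and population standard deviations."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)]
    # Assumption of this sketch: constant columns are divided by 1.0
    stds = [s if s > 0 else 1.0 for s in stds]
    return means, stds

def standardize(rows, means, stds):
    """Apply (value - mean) / std column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]

X_train = [[1., -1., 2.],
           [2., 0., 0.],
           [0., 1., -1.]]

means, stds = fit_standardizer(X_train)
X_scaled = standardize(X_train, means, stds)

# New data is transformed with the statistics learned from X_train
X_test_scaled = standardize([[-1., 1., 0.]], means, stds)

print(means)
print(stds)
print(X_test_scaled)
```

The point of keeping `means` and `stds` around is the same as with a fitted scaler: test data must be scaled with the training statistics, not with its own.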
1.1 Scaling features to a range
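As an illustration of scaling features to a range, the following plain-Python sketch rescales each column linearly so that the training minimum maps to 0 and the training maximum maps to 1. The helper names `fit_min_max` and `min_max_scale` and the `feature_range` argument are hypothetical names chosen for this sketch, not scikit-learn's API; treating a zero-width column as having span 1.0 is likewise an assumption.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def min_max_scale(rows, mins, maxs, feature_range=(0.0, 1.0)):
    """Map each column linearly from [min, max] onto feature_range."""
    low, high = feature_range
    scaled = []
    for row in rows:
        out = []
        for v, mn, mx in zip(row, mins, maxs):
            span = (mx - mn) or 1.0  # assumption: constant columns use span 1.0
            out.append((v - mn) / span * (high - low) + low)
        scaled.append(out)
    return scaled

X_train = [[1., -1., 2.],
           [2., 0., 0.],
           [0., 1., -1.]]

mins, maxs = fit_min_max(X_train)
X_scaled = min_max_scale(X_train, mins, maxs)

# Values outside the training range land outside [0, 1]
X_test_scaled = min_max_scale([[-3., -1., 4.]], mins, maxs)

print(X_scaled)
print(X_test_scaled)
```

Note that test values outside the range seen during fitting are not clipped in this sketch, so they can fall below 0 or above 1.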
1.2 Scaling sparse data
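Centering sparse data would turn most zeros into non-zeros and destroy the sparsity, so a common approach is to scale without centering. The sketch below divides each column by its maximum absolute value. It is only an illustration: the dictionary-of-keys layout `(row, col) -> value` and the helper name `max_abs_scale_sparse` are assumptions for this example, not a scikit-learn or SciPy data structure.

```python
def max_abs_scale_sparse(entries):
    """Scale stored values so each column's largest absolute value is 1.

    entries: dict mapping (row, col) -> stored (normally non-zero) value.
    Entries that are not stored are implicit zeros and stay zero.
    """
    max_abs = {}
    for (_, col), value in entries.items():
        max_abs[col] = max(max_abs.get(col, 0.0), abs(value))
    # Guard against a column whose stored values are all 0.0
    return {(r, c): v / (max_abs[c] or 1.0) for (r, c), v in entries.items()}

# Sparse form of [[1, -1, 2], [2, 0, 0], [0, 1, -1]]
entries = {
    (0, 0): 1.0, (0, 1): -1.0, (0, 2): 2.0,
    (1, 0): 2.0,
    (2, 1): 1.0, (2, 2): -1.0,
}

scaled = max_abs_scale_sparse(entries)
print(scaled)
```

Because nothing is subtracted, the set of stored entries is unchanged and only their magnitudes are rescaled.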
1.3 Scaling data with outliers
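When a feature contains outliers, the mean and variance are pulled toward them, so statistics that are robust to extreme values are a better basis for scaling. A minimal sketch for a single feature, assuming centering on the median and dividing by the interquartile range (25th to 75th percentile, linearly interpolated); the helper names `percentile` and `robust_scale` are hypothetical:

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile of an already sorted list (0 <= q <= 100)."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac

def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    ordered = sorted(values)
    median = percentile(ordered, 50)
    iqr = percentile(ordered, 75) - percentile(ordered, 25)
    iqr = iqr or 1.0  # assumption: a zero IQR is replaced by 1.0
    return [(v - median) / iqr for v in values]

# One feature with a single large outlier
feature = [1., 2., 3., 4., 100.]
scaled = robust_scale(feature)
print(scaled)
```

The four ordinary values keep a usable spread around zero, while the outlier stays far away instead of compressing everything else.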
1.4 Scaling vs Whitening
1.5 Centering kernel matrices
2. Normalization
X = [[ 1., -1.,  2.],
     [ 2.,  0.,  0.],
     [ 0.,  1., -1.]]
X_normalized = preprocessing.normalize(X, norm='l2')
print(X_normalized)

normalizer = preprocessing.Normalizer().fit(X)  # fit does nothing
print(normalizer)
print(normalizer.transform(X))
print(normalizer.transform([[-1., 1., 0.]]))
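Unlike standardization, normalization works on each sample (row) independently, which is why fitting has nothing to learn. A plain-Python sketch of the row-wise computation, with the hypothetical helper name `normalize_rows`; leaving all-zero rows unchanged is an assumption of this sketch:

```python
import math

def normalize_rows(rows, norm="l2"):
    """Scale each row to unit norm; supports 'l1', 'l2' and 'max'."""
    result = []
    for row in rows:
        if norm == "l1":
            size = sum(abs(v) for v in row)
        elif norm == "l2":
            size = math.sqrt(sum(v * v for v in row))
        elif norm == "max":
            size = max(abs(v) for v in row)
        else:
            raise ValueError("unsupported norm: %r" % norm)
        # Assumption: rows of all zeros are returned as they are
        result.append([v / size for v in row] if size > 0 else list(row))
    return result

X = [[1., -1., 2.],
     [2., 0., 0.],
     [0., 1., -1.]]

X_normalized = normalize_rows(X, norm="l2")
X_new_normalized = normalize_rows([[-1., 1., 0.]])

print(X_normalized)
print(X_new_normalized)
```

Each output row has length 1 under the chosen norm, and no statistics from other rows are involved.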