8.4 SVR
8.4.1 SVR Class Parameters, Attributes, and Methods

Class

```python
class sklearn.svm.SVR(*, kernel='rbf', degree=3, gamma='scale', coef0=0.0, tol=0.001, C=1.0, epsilon=0.1, shrinking=True, cache_size=200, verbose=False, max_iter=-1)
```

Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| C | float, default=1.0 | Regularization parameter. The strength of the regularization is inversely proportional to C; must be strictly positive. The penalty is a squared l2 penalty. |
| epsilon | float, default=0.1 | Epsilon in the epsilon-SVR model. It specifies the epsilon-tube within which no penalty is associated in the training loss function with points predicted within a distance epsilon of the actual value. |
| kernel | {'linear', 'poly', 'rbf', 'sigmoid', 'precomputed'}, default='rbf' | Specifies the kernel type to be used in the algorithm. It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed', or a callable. If none is given, 'rbf' is used. If a callable is given, it is used to precompute the kernel matrix. |
| gamma | {'scale', 'auto'} or float, default='scale' | Kernel coefficient for 'rbf', 'poly', and 'sigmoid'. If gamma='scale' (the default) is passed, it uses 1 / (n_features * X.var()) as the value of gamma; if 'auto', it uses 1 / n_features. (Both are reproduced by hand in the sketch below.) |
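The two automatic gamma settings from the table can be verified by hand; a minimal sketch, with arbitrary sample data:

```python
import numpy as np

# Reproduce gamma='scale' and gamma='auto' by hand (the array values are arbitrary).
X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 7.0]])
n_features = X.shape[1]

gamma_scale = 1 / (n_features * X.var())  # used when gamma='scale' (the default)
gamma_auto = 1 / n_features               # used when gamma='auto'
print(gamma_scale, gamma_auto)
```

Note that 'scale' adapts to the spread of the data (X.var() is the variance over all entries), while 'auto' depends only on the number of features.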
Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| class_weight_ | ndarray of shape (n_classes,) | Multipliers of parameter C for each class. |
| coef_ | ndarray of shape (1, n_features) | Weights assigned to the features; only available when kernel='linear'. |
| dual_coef_ | ndarray of shape (1, n_SV) | Coefficients of the support vectors in the decision function. |
| fit_status_ | int | 0 if correctly fitted, 1 otherwise. |
| intercept_ | ndarray of shape (1,) | Constants in the decision function. |
| n_support_ | ndarray of shape (n_classes,), dtype=int32 | Number of support vectors for each class. |
| shape_fit_ | tuple of int of shape (n_dimensions_of_X,) | Array dimensions of the training vector X. |
| support_ | ndarray of shape (n_SV,) | Indices of the support vectors. |
| support_vectors_ | ndarray of shape (n_SV, n_features) | The support vectors. |
Methods

| Method | Description |
| --- | --- |
| fit(X, y[, sample_weight]) | Fit the SVM model according to the given training data. |
| get_params([deep]) | Get parameters for this estimator. |
| predict(X) | Perform regression on samples in X. |
| score(X, y[, sample_weight]) | Return the coefficient of determination R² of the prediction. |
| set_params(**params) | Set the parameters of this estimator. |
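A minimal end-to-end sketch exercising the methods above, on synthetic data (a noisy sine curve) chosen purely for illustration:

```python
import numpy as np
from sklearn.svm import SVR

# Build a small noisy sine-curve dataset.
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(40)

svr = SVR(kernel='rbf', C=1.0, epsilon=0.1)
svr.fit(X, y)                           # fit the model to the training data
y_pred = svr.predict(X)                 # perform regression on samples in X
print("R^2:", svr.score(X, y))          # coefficient of determination
print("kernel:", svr.get_params()["kernel"])  # inspect estimator parameters
```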
8.4.2 Analyzing make_regression Without Noise

```python
def SVR_for_make_regression():
    myutil = util()
    X, y = make_regression(n_samples=100, n_features=1, n_informative=2, random_state=8)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8, test_size=0.3)
    clf = SVR().fit(X_train, y_train)  # fit on the training set only
    title = "make_regression SVR() regression line (no noise)"
    myutil.print_scores(clf, X_train, y_train, X_test, y_test, title)
    myutil.draw_line(X[:, 0], y, clf, title)
    myutil.plot_learning_curve(SVR(), X, y, title)
    myutil.show_pic(title)
```
Output

```
make_regression SVR() regression line (no noise): 33.56%
make_regression SVR() regression line (no noise): 41.08%
```

The results are very poor.
8.4.3 Analyzing make_regression With Noise

```python
def SVR_for_make_regression_add_noise():
    myutil = util()
    X, y = make_regression(n_samples=100, n_features=1, n_informative=2, noise=50, random_state=8)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8, test_size=0.3)
    clf = SVR().fit(X_train, y_train)  # fit on the training set only
    title = "make_regression SVR() regression line (with noise)"
    myutil.print_scores(clf, X_train, y_train, X_test, y_test, title)
    myutil.draw_line(X[:, 0], y, clf, title)
    myutil.plot_learning_curve(SVR(), X, y, title)
    myutil.show_pic(title)
```
Output

```
make_regression SVR() regression line (with noise): 18.74%
make_regression SVR() regression line (with noise): 18.98%
```

The results are even worse.
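A likely culprit, though the original experiments do not investigate it: SVMs are not scale invariant, and SVR's defaults (C=1.0, epsilon=0.1) implicitly assume targets on roughly unit scale, while make_regression produces targets spanning hundreds of units. A minimal sketch that standardizes the target with TransformedTargetRegressor, so you can check the effect yourself:

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = make_regression(n_samples=100, n_features=1, noise=50, random_state=8)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8, test_size=0.3)

# Standardize y before fitting; predictions are mapped back to the original scale automatically.
model = TransformedTargetRegressor(regressor=SVR(), transformer=StandardScaler())
model.fit(X_train, y_train)
print("train R^2:", model.score(X_train, y_train))
print("test  R^2:", model.score(X_test, y_test))
```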
8.4.4 SVR Analysis of the Boston Housing Data

```python
def SVR_for_boston():
    myutil = util()
    boston = datasets.load_boston()  # note: removed in scikit-learn 1.2
    X, y = boston.data, boston.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)
    # Scores before preprocessing
    for kernel in ['linear', 'rbf', 'sigmoid', 'poly']:
        svr = SVR(kernel=kernel)
        svr.fit(X_train, y_train)
        title = "SVR kernel=" + kernel + " (before preprocessing)"
        myutil.print_scores(svr, X_train, y_train, X_test, y_test, title)
    # Standardize the features, then score again
    scaler = StandardScaler()
    scaler.fit(X_train)
    X_train_scaler = scaler.transform(X_train)
    X_test_scaler = scaler.transform(X_test)
    for kernel in ['linear', 'rbf', 'sigmoid', 'poly']:
        svr = SVR(kernel=kernel)
        svr.fit(X_train_scaler, y_train)
        title = "SVR kernel=" + kernel + " (after preprocessing)"
        myutil.print_scores(svr, X_train_scaler, y_train, X_test_scaler, y_test, title)
```
Output

```
SVR kernel=linear (before preprocessing): 70.88%
SVR kernel=linear (before preprocessing): 69.64%
SVR kernel=rbf (before preprocessing): 19.20%
SVR kernel=rbf (before preprocessing): 22.23%
SVR kernel=sigmoid (before preprocessing): 5.94%
SVR kernel=sigmoid (before preprocessing): 7.53%
SVR kernel=poly (before preprocessing): 19.50%
SVR kernel=poly (before preprocessing): 20.70%
SVR kernel=linear (after preprocessing): 70.56%
SVR kernel=linear (after preprocessing): 69.84%
SVR kernel=rbf (after preprocessing): 66.50%
SVR kernel=rbf (after preprocessing): 69.46%
SVR kernel=sigmoid (after preprocessing): 56.44%
SVR kernel=sigmoid (after preprocessing): 63.41%
SVR kernel=poly (after preprocessing): 68.60%
SVR kernel=poly (after preprocessing): 62.33%
```
Summarized as training score / test score:

| kernel | linear | rbf | sigmoid | poly |
| --- | --- | --- | --- | --- |
| Before preprocessing | 70.88% / 69.64% | 19.20% / 22.23% | 5.94% / 7.53% | 19.50% / 20.70% |
| After preprocessing | 70.56% / 69.84% | 66.50% / 69.46% | 56.44% / 63.41% | 68.60% / 62.33% |
As the table shows, every kernel except linear scores far higher after preprocessing than before. StandardScaler was introduced in an earlier chapter; a Pipeline variant of the same preprocessing is sketched below.
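The scaler and the regressor can also be chained with make_pipeline, which guarantees the scaler is fitted on the training data only and is applied automatically at predict time. A minimal sketch; since load_boston was removed in scikit-learn 1.2, the diabetes data stands in here:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = datasets.load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)

# Scaling and fitting as one estimator: no risk of leaking test statistics into the scaler.
pipe = make_pipeline(StandardScaler(), SVR(kernel='rbf'))
pipe.fit(X_train, y_train)
print("test R^2:", pipe.score(X_test, y_test))
```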
```python
def SVR_for_boston_for_gamma():
    myutil = util()
    boston = datasets.load_boston()  # note: removed in scikit-learn 1.2
    X, y = boston.data, boston.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)
    scaler = StandardScaler()
    scaler.fit(X_train)
    X_train_scaler = scaler.transform(X_train)
    X_test_scaler = scaler.transform(X_test)
    for kernel in ['linear', 'rbf', 'sigmoid', 'poly']:
        for gamma in ['scale', 'auto', 0.1, 0.01, 0.001]:
            svr = SVR(kernel=kernel, gamma=gamma)  # pass gamma to the model
            svr.fit(X_train_scaler, y_train)
            title = "SVR kernel=" + kernel + ",gamma=" + str(gamma)
            myutil.print_scores(svr, X_train_scaler, y_train, X_test_scaler, y_test, title)
            # myutil.plot_learning_curve(SVR(kernel=kernel), X, y, title)
            # myutil.show_pic(title)
```

Since plotting is time-consuming, the plot calls are commented out here.
Output

```
SVR kernel=linear,gamma=scale: 70.56%
SVR kernel=linear,gamma=scale: 69.84%
SVR kernel=linear,gamma=auto: 70.56%
SVR kernel=linear,gamma=auto: 69.84%
SVR kernel=linear,gamma=0.1: 70.56%
SVR kernel=linear,gamma=0.1: 69.84%
SVR kernel=linear,gamma=0.01: 70.56%
SVR kernel=linear,gamma=0.01: 69.84%
SVR kernel=linear,gamma=0.001: 70.56%
SVR kernel=linear,gamma=0.001: 69.84%
SVR kernel=rbf,gamma=scale: 66.50%
SVR kernel=rbf,gamma=scale: 69.46%
SVR kernel=rbf,gamma=auto: 66.50%
SVR kernel=rbf,gamma=auto: 69.46%
SVR kernel=rbf,gamma=0.1: 64.18%
SVR kernel=rbf,gamma=0.1: 66.96%
SVR kernel=rbf,gamma=0.01: 56.81%
SVR kernel=rbf,gamma=0.01: 60.37%
SVR kernel=rbf,gamma=0.001: 22.75%
SVR kernel=rbf,gamma=0.001: 24.48%
SVR kernel=sigmoid,gamma=scale: 56.44%
SVR kernel=sigmoid,gamma=scale: 63.41%
SVR kernel=sigmoid,gamma=auto: 56.44%
SVR kernel=sigmoid,gamma=auto: 63.41%
SVR kernel=sigmoid,gamma=0.1: 35.92%
SVR kernel=sigmoid,gamma=0.1: 39.85%
SVR kernel=sigmoid,gamma=0.01: 49.05%
SVR kernel=sigmoid,gamma=0.01: 52.26%
SVR kernel=sigmoid,gamma=0.001: 13.84%
SVR kernel=sigmoid,gamma=0.001: 14.82%
SVR kernel=poly,gamma=scale: 68.60%
SVR kernel=poly,gamma=scale: 62.33%
SVR kernel=poly,gamma=auto: 68.60%
SVR kernel=poly,gamma=auto: 62.33%
SVR kernel=poly,gamma=0.1: 76.02%
SVR kernel=poly,gamma=0.1: 63.72%
SVR kernel=poly,gamma=0.01: 1.59%
SVR kernel=poly,gamma=0.01: 1.03%
SVR kernel=poly,gamma=0.001: -2.36%
SVR kernel=poly,gamma=0.001: -2.66%
```
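Two things stand out: the linear kernel ignores gamma, so its scores never change across the sweep, and poly with gamma=0.1 shows a noticeable train/test gap (76.02% vs. 63.72%). Instead of hand-written loops, the same sweep can be expressed with GridSearchCV; a minimal sketch, again on the diabetes data since load_boston is gone from recent scikit-learn:

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = datasets.load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)

# Pipeline step is named 'svr' by make_pipeline, hence the 'svr__' prefixes.
pipe = make_pipeline(StandardScaler(), SVR())
param_grid = {
    "svr__kernel": ["linear", "rbf", "sigmoid", "poly"],
    "svr__gamma": ["scale", "auto", 0.1, 0.01, 0.001],
}
search = GridSearchCV(pipe, param_grid, cv=5)  # 5-fold cross-validation
search.fit(X_train, y_train)
print("best params:", search.best_params_)
print("test R^2:", search.score(X_test, y_test))
```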
8.4.5 SVR Analysis of the Diabetes Data

```python
# Analyze the diabetes data
def SVR_for_diabetes_for_gamma():
    myutil = util()
    diabetes = datasets.load_diabetes()
    X, y = diabetes.data, diabetes.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)
    for kernel in ['linear', 'rbf', 'sigmoid', 'poly']:
        for gamma in ['scale', 'auto', 0.1, 0.01, 0.001]:
            svr = SVR(kernel=kernel, gamma=gamma)
            svr.fit(X_train, y_train)
            title = "SVR kernel=" + kernel + ",gamma=" + str(gamma)
            myutil.print_scores(svr, X_train, y_train, X_test, y_test, title)
```
Output

```
SVR kernel=linear,gamma=scale: -0.89%
SVR kernel=linear,gamma=scale: 0.35%
SVR kernel=linear,gamma=auto: -0.89%
SVR kernel=linear,gamma=auto: 0.35%
SVR kernel=linear,gamma=0.1: -0.89%
SVR kernel=linear,gamma=0.1: 0.35%
SVR kernel=linear,gamma=0.01: -0.89%
SVR kernel=linear,gamma=0.01: 0.35%
SVR kernel=linear,gamma=0.001: -0.89%
SVR kernel=linear,gamma=0.001: 0.35%
SVR kernel=rbf,gamma=scale: 18.30%
SVR kernel=rbf,gamma=scale: 15.14%
SVR kernel=rbf,gamma=auto: -2.94%
SVR kernel=rbf,gamma=auto: -1.77%
SVR kernel=rbf,gamma=0.1: -2.94%
SVR kernel=rbf,gamma=0.1: -1.77%
SVR kernel=rbf,gamma=0.01: -3.18%
SVR kernel=rbf,gamma=0.01: -2.07%
SVR kernel=rbf,gamma=0.001: -3.20%
SVR kernel=rbf,gamma=0.001: -2.10%
SVR kernel=sigmoid,gamma=scale: 37.86%
SVR kernel=sigmoid,gamma=scale: 38.36%
SVR kernel=sigmoid,gamma=auto: -3.07%
SVR kernel=sigmoid,gamma=auto: -1.94%
SVR kernel=sigmoid,gamma=0.1: -3.07%
SVR kernel=sigmoid,gamma=0.1: -1.94%
SVR kernel=sigmoid,gamma=0.01: -3.19%
SVR kernel=sigmoid,gamma=0.01: -2.09%
SVR kernel=sigmoid,gamma=0.001: -3.20%
SVR kernel=sigmoid,gamma=0.001: -2.10%
SVR kernel=poly,gamma=scale: 23.87%
SVR kernel=poly,gamma=scale: 31.86%
SVR kernel=poly,gamma=auto: -3.20%
SVR kernel=poly,gamma=auto: -2.10%
SVR kernel=poly,gamma=0.1: -3.20%
SVR kernel=poly,gamma=0.1: -2.10%
SVR kernel=poly,gamma=0.01: -3.20%
SVR kernel=poly,gamma=0.01: -2.10%
SVR kernel=poly,gamma=0.001: -3.20%
SVR kernel=poly,gamma=0.001: -2.10%
```
The scores are very low, and since the diabetes features are already on a small, uniform scale, even preprocessing them does not help. One remaining lever is sketched below.
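One knob this section never touches is the regularization parameter C, left at its default 1.0 throughout; per the parameter table, a larger C penalizes training errors more heavily, which often matters when the targets span a wide range (the diabetes targets run from roughly 25 to 346). A minimal sketch of a C sweep; the values chosen are arbitrary and the original text reports no numbers for this experiment:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

X, y = datasets.load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)

# Sweep the regularization parameter C with the default rbf kernel.
for C in [1, 10, 100, 1000]:
    svr = SVR(kernel='rbf', C=C).fit(X_train, y_train)
    print(f"C={C}: train={svr.score(X_train, y_train):.2%} "
          f"test={svr.score(X_test, y_test):.2%}")
```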