8.5 LinearSVR
8.5.1 LinearSVR class parameters, attributes, and methods
Class
class sklearn.svm.LinearSVR(*, epsilon=0.0, tol=0.0001, C=1.0, loss='epsilon_insensitive', fit_intercept=True, intercept_scaling=1.0, dual=True, verbose=0, random_state=None, max_iter=1000)
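Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| C | float, default=1.0 | Regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. |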
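To make the inverse relationship between C and the regularization strength concrete, here is a minimal sketch. The data set, the three C values, and `max_iter=10000` are illustrative assumptions, not settings taken from this chapter.

```python
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

# Illustrative data: one feature, noisy target (assumed settings).
X, y = make_regression(n_samples=100, n_features=1, noise=50, random_state=8)

coef_by_C = {}
for C in (0.001, 0.1, 10.0):
    # Smaller C means stronger regularization, which pulls the weight toward zero.
    model = LinearSVR(C=C, random_state=8, max_iter=10000).fit(X, y)
    coef_by_C[C] = float(model.coef_[0])
    print(f"C={C:<6} coef={coef_by_C[C]:.3f}")
```

With a very small C the learned weight should stay close to zero, while a large C lets it grow toward the slope that best fits the data.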
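Attributes
| Attribute | Type | Description |
| --- | --- | --- |
| coef_ | ndarray of shape (n_features) if n_classes == 2 else (n_classes, n_features) | Weights assigned to the features (coefficients in the primal problem). This is only available in the case of a linear kernel. coef_ is a read-only property derived from raw_coef_, which follows the internal memory layout of liblinear. |
| intercept_ | ndarray of shape (1) if n_classes == 2 else (n_classes) | Constants in the decision function. |
| n_iter_ | int | Maximum number of iterations run across all classes. |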
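The shapes above are worded in terms of classes, although LinearSVR is a regressor. The following sketch inspects the fitted attributes directly and checks that a prediction is just the linear function they define. The three-feature data set and `max_iter=10000` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

# Illustrative data with three features (assumed settings).
X, y = make_regression(n_samples=100, n_features=3, random_state=8)
model = LinearSVR(random_state=8, max_iter=10000).fit(X, y)

print(model.coef_.shape)       # one weight per feature
print(model.intercept_.shape)  # a single constant term
print(model.n_iter_)           # iterations used by the solver

# A prediction is the weighted sum of the features plus the intercept.
manual = X @ model.coef_ + model.intercept_
print(np.allclose(manual, model.predict(X)))
```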
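Methods
| Method | Description |
| --- | --- |
| fit(X, y[, sample_weight]) | Fit the model according to the given training data. |
| get_params([deep]) | Get the parameters of this estimator. |
| predict(X) | Predict using the linear model. |
| score(X, y[, sample_weight]) | Return the coefficient of determination R² of the prediction. |
| set_params(**params) | Set the parameters of this estimator. |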
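The methods listed above can be exercised together in a few lines. This is a sketch under assumed settings (the data set and the values passed to `set_params` are illustrative); it also shows that `score` returns the same R² that `sklearn.metrics.r2_score` computes from the predictions.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import r2_score
from sklearn.svm import LinearSVR

# Illustrative data (assumed settings).
X, y = make_regression(n_samples=100, n_features=1, noise=50, random_state=8)

model = LinearSVR(random_state=8)
model.set_params(C=10.0, max_iter=10000)  # change hyperparameters after construction
params = model.get_params()               # dict of all constructor parameters
print(params["C"], params["max_iter"])

model.fit(X, y)
y_pred = model.predict(X)
score = model.score(X, y)                 # coefficient of determination R²
print(f"{score:.4f} {r2_score(y, y_pred):.4f}")
```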
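8.5.2 LinearSVR on noise-free make_regression data

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVR

# util is the author's scoring/plotting helper class, not part of scikit-learn.
def LinearSVR_for_make_regression():
    myutil = util()
    X, y = make_regression(n_samples=100, n_features=1, n_informative=2, random_state=8)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8, test_size=0.3)
    # Note: the model is fit on the full data set, so the test score is not a held-out estimate.
    clf = LinearSVR().fit(X, y)
    title = "make_regression LinearSVR() regression line (no noise)"
    myutil.print_scores(clf, X_train, y_train, X_test, y_test, title)
    myutil.draw_line(X[:, 0], y, clf, title)
    myutil.plot_learning_curve(LinearSVR(), X, y, title)
    myutil.show_pic(title)

Output (training score, then test score):
make_regression LinearSVR() regression line (no noise): 100.00%
make_regression LinearSVR() regression line (no noise): 100.00%
The fit is very good: without noise, both scores reach 100%.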
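8.5.3 LinearSVR on noisy make_regression data

# Same experiment with noise added to the target (imports as in 8.5.2).
def LinearSVR_for_make_regression_add_noise():
    myutil = util()
    X, y = make_regression(n_samples=100, n_features=1, n_informative=2, noise=50, random_state=8)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8, test_size=0.3)
    # Note: the model is fit on the full data set, so the test score is not a held-out estimate.
    clf = LinearSVR().fit(X, y)
    title = "make_regression LinearSVR() regression line (with noise)"
    myutil.print_scores(clf, X_train, y_train, X_test, y_test, title)
    myutil.draw_line(X[:, 0], y, clf, title)
    myutil.plot_learning_curve(LinearSVR(), X, y, title)
    myutil.show_pic(title)

Output (training score, then test score):
make_regression LinearSVR() regression line (with noise): 59.53%
make_regression LinearSVR() regression line (with noise): 51.15%
Both scores are low: the model underfits.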
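One way to judge how much of the low score comes from the noise rather than from LinearSVR is to compare it with ordinary least squares on the same split. The sketch below is an added check, not part of the original experiment; `LinearRegression` as the baseline, fitting on the training split only, and `max_iter=10000` are all assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVR

X, y = make_regression(n_samples=100, n_features=1, noise=50, random_state=8)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8, test_size=0.3)

# Fit both models on the training split only.
svr = LinearSVR(random_state=8, max_iter=10000).fit(X_train, y_train)
ols = LinearRegression().fit(X_train, y_train)

svr_train, svr_test = svr.score(X_train, y_train), svr.score(X_test, y_test)
ols_train, ols_test = ols.score(X_train, y_train), ols.score(X_test, y_test)
print(f"LinearSVR        train={svr_train:.2%} test={svr_test:.2%}")
print(f"LinearRegression train={ols_train:.2%} test={ols_test:.2%}")
```

Least squares gives the highest training R² any straight line can reach on this split, so if LinearSVR scores close to it, the remaining gap to 100% is more plausibly explained by the noise than by the choice of linear model.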
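8.5.4 LinearSVR on the Boston housing data

import warnings
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

def LinearSVR_for_boston():
    warnings.filterwarnings("ignore")
    myutil = util()
    # load_boston was removed in scikit-learn 1.2; this code needs an older release.
    boston = datasets.load_boston()
    X, y = boston.data, boston.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)
    # Fit the scaler on the training split only, then apply it everywhere.
    scaler = StandardScaler()
    scaler.fit(X_train)
    X_scaler = scaler.transform(X)
    X_train_scaler = scaler.transform(X_train)
    X_test_scaler = scaler.transform(X_test)
    svr = LinearSVR()
    svr.fit(X_train_scaler, y_train)
    title = "LinearSVR for Boston"
    myutil.print_scores(svr, X_train_scaler, y_train, X_test_scaler, y_test, title)
    myutil.plot_learning_curve(LinearSVR(), X_scaler, y, title)
    myutil.show_pic(title)

Output (training score, then test score):
LinearSVR for Boston: 70.37%
LinearSVR for Boston: 70.11%
The model underfits.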
8.5.5 LinearSVR on the diabetes data

# Analyze the diabetes data (imports as in 8.5.4).
def LinearSVR_for_diabetes():
    warnings.filterwarnings("ignore")
    myutil = util()
    diabetes = datasets.load_diabetes()
    X, y = diabetes.data, diabetes.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)
    # Fit the scaler on the training split only, then apply it everywhere.
    scaler = StandardScaler()
    scaler.fit(X_train)
    X_scaler = scaler.transform(X)
    X_train_scaler = scaler.transform(X_train)
    X_test_scaler = scaler.transform(X_test)
    svr = LinearSVR()
    svr.fit(X_train_scaler, y_train)
    title = "LinearSVR for Diabetes"
    myutil.print_scores(svr, X_train_scaler, y_train, X_test_scaler, y_test, title)
    myutil.plot_learning_curve(LinearSVR(), X_scaler, y, title)
    myutil.show_pic(title)

Output (training score, then test score):
LinearSVR for Diabetes: 36.11%
LinearSVR for Diabetes: 33.77%
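The scale-then-fit pattern used for the Boston and diabetes data can also be written as a single pipeline, which is convenient when cross-validating. The sketch below is an alternative formulation, not the chapter's own code; `make_pipeline`, the 5-fold `cross_val_score`, and `max_iter=10000` are assumptions chosen for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

X, y = load_diabetes(return_X_y=True)

# The scaler and the regressor are fitted together, fold by fold.
pipe = make_pipeline(StandardScaler(), LinearSVR(random_state=8, max_iter=10000))
scores = cross_val_score(pipe, X, y, cv=5)  # one R² value per fold

print(scores)
print(f"mean R²: {scores.mean():.2%}")
```

Because the scaler is refitted inside each fold, the held-out part of the fold never influences the scaling statistics.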
8.6 Summary
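| Dataset | Model | kernel | gamma | Training score | Test score |
| --- | --- | --- | --- | --- | --- |
| Iris | SVC | linear | scale | 99.17% | 100.00% |
| Iris | SVC | linear | auto | 99.17% | 100.00% |
| Iris | SVC | linear | 0.1 | 99.17% | 100.00% |
| Iris | SVC | linear | 0.01 | 99.17% | 100.00% |
| Iris | SVC | linear | 0.001 | 99.17% | 100.00% |
| Iris | SVC | rbf | scale | 96.67% | 96.67% |
| Iris | SVC | rbf | auto | 97.50% | 96.67% |
| Iris | SVC | rbf | 0.1 | 97.50% | 96.67% |
| Iris | SVC | rbf | 0.01 | 95.00% | 86.67% |
| Iris | SVC | rbf | 0.001 | 70.00% | 53.33% |
| Iris | SVC | sigmoid | scale | 6.67% | 10.00% |
| Iris | SVC | sigmoid | auto | 4.17% | 3.33% |
| Iris | SVC | sigmoid | 0.1 | 5.83% | 6.67% |
| Iris | SVC | sigmoid | 0.01 | 70.00% | 53.33% |
| Iris | SVC | sigmoid | 0.001 | 70.00% | 53.33% |
| Iris | SVC | poly | scale | 98.33% | 93.33% |
| Iris | SVC | poly | auto | 99.17% | 93.33% |
| Iris | SVC | poly | 0.1 | 97.50% | 93.33% |
| Iris | SVC | poly | 0.01 | 89.17% | 86.67% |
| Iris | SVC | poly | 0.001 | 57.50% | 50.00% |
| Iris | LinearSVC | - | - | 95.83% | 96.67% |
| Wine | SVC | linear | scale | 100.00% | 91.67% |
| Wine | SVC | linear | auto | 100.00% | 91.67% |
| Wine | SVC | linear | 0.1 | 100.00% | 91.67% |
| Wine | SVC | linear | 0.01 | 100.00% | 91.67% |
| Wine | SVC | linear | 0.001 | 100.00% | 91.67% |
| Wine | SVC | rbf | scale | 71.13% | 66.67% |
| Wine | SVC | rbf | auto | 100.00% | 50.00% (overfit) |
| Wine | SVC | rbf | 0.1 | 100.00% | 50.00% (overfit) |
| Wine | SVC | rbf | 0.01 | 99.30% | 69.44% (overfit) |
| Wine | SVC | rbf | 0.001 | 83.80% | 72.22% |
| Wine | SVC | sigmoid | scale | 18.31% | 22.22% |
| Wine | SVC | sigmoid | auto | 39.44% | 41.67% |
| Wine | SVC | sigmoid | 0.1 | 39.44% | 41.67% |
| Wine | SVC | sigmoid | 0.01 | 39.44% | 41.67% |
| Wine | SVC | sigmoid | 0.001 | 39.44% | 41.67% |
| Wine | SVC | poly | scale | 65.49% | 80.56% |
| Wine | SVC | poly | auto | 100.00% | 91.67% |
| Wine | SVC | poly | 0.1 | 100.00% | 91.67% |
| Wine | SVC | poly | 0.01 | 100.00% | 91.67% |
| Wine | SVC | poly | 0.001 | 100.00% | 91.67% |
| Wine | LinearSVC | - | - | 90.14% | 86.11% |
| Breast cancer | SVC | linear | scale | 94.51% | 91.23% |
| Breast cancer | SVC | linear | auto | 94.51% | 91.23% |
| Breast cancer | SVC | linear | 0.1 | 94.51% | 91.23% |
| Breast cancer | SVC | linear | 0.01 | 94.51% | 91.23% |
| Breast cancer | SVC | linear | 0.001 | 94.51% | 91.23% |
| Breast cancer | SVC | rbf | scale | 91.87% | 91.23% |
| Breast cancer | SVC | rbf | auto | 100.00% | 56.14% (overfit) |
| Breast cancer | SVC | rbf | 0.1 | 100.00% | 56.14% (overfit) |
| Breast cancer | SVC | rbf | 0.01 | 100.00% | 55.26% (overfit) |
| Breast cancer | SVC | rbf | 0.001 | 98.02% | 89.47% |
| Breast cancer | SVC | sigmoid | scale | 48.79% | 38.60% |
| Breast cancer | SVC | sigmoid | auto | 64.40% | 56.14% |
| Breast cancer | SVC | sigmoid | 0.1 | 64.40% | 56.14% |
| Breast cancer | SVC | sigmoid | 0.01 | 64.40% | 56.14% |
| Breast cancer | SVC | sigmoid | 0.001 | 64.40% | 56.14% |
| Breast cancer | SVC | poly | scale | 90.77% | 91.23% |
| Breast cancer | SVC | poly | auto | 48.35% | 51.75% |
| Breast cancer | SVC | poly | 0.1 | 35.38% | 34.21% |
| Breast cancer | SVC | poly | 0.01 | 76.70% | 79.82% |
| Breast cancer | SVC | poly | 0.001 | 40.44% | 42.11% |
| Breast cancer | LinearSVC | - | - | 93.19% | 92.11% |
| Boston housing | SVR | linear | scale | 70.56% | 69.84% |
| Boston housing | SVR | linear | auto | 70.56% | 69.84% |
| Boston housing | SVR | linear | 0.1 | 70.56% | 69.84% |
| Boston housing | SVR | linear | 0.01 | 70.56% | 69.84% |
| Boston housing | SVR | linear | 0.001 | 70.56% | 69.84% |
| Boston housing | SVR | rbf | scale | 66.50% | 69.46% |
| Boston housing | SVR | rbf | auto | 66.50% | 69.46% |
| Boston housing | SVR | rbf | 0.1 | 64.18% | 66.96% |
| Boston housing | SVR | rbf | 0.01 | 56.81% | 60.37% |
| Boston housing | SVR | rbf | 0.001 | 22.75% | 24.48% |
| Boston housing | SVR | sigmoid | scale | 56.44% | 63.41% |
| Boston housing | SVR | sigmoid | auto | 56.44% | 63.41% |
| Boston housing | SVR | sigmoid | 0.1 | 35.92% | 39.85% |
| Boston housing | SVR | sigmoid | 0.01 | 49.05% | 52.26% |
| Boston housing | SVR | sigmoid | 0.001 | 13.84% | 14.82% |
| Boston housing | SVR | poly | scale | 68.60% | 62.33% |
| Boston housing | SVR | poly | auto | 68.60% | 62.33% |
| Boston housing | SVR | poly | 0.1 | 76.02% | 63.72% |
| Boston housing | SVR | poly | 0.01 | 1.59% | 1.03% |
| Boston housing | SVR | poly | 0.001 | -2.36% | -2.66% |
| Boston housing | LinearSVR | - | - | 70.37% | 70.11% |
| Diabetes | SVR | linear | scale | -0.89% | 0.35% |
| Diabetes | SVR | linear | auto | -0.89% | 0.35% |
| Diabetes | SVR | linear | 0.1 | -0.89% | 0.35% |
| Diabetes | SVR | linear | 0.01 | -0.89% | 0.35% |
| Diabetes | SVR | linear | 0.001 | -0.89% | 0.35% |
| Diabetes | SVR | rbf | scale | 18.30% | 15.14% |
| Diabetes | SVR | rbf | auto | -2.94% | -1.77% |
| Diabetes | SVR | rbf | 0.1 | -2.94% | -1.77% |
| Diabetes | SVR | rbf | 0.01 | -3.18% | -2.07% |
| Diabetes | SVR | rbf | 0.001 | -3.20% | -2.10% |
| Diabetes | SVR | sigmoid | scale | 37.86% | 38.36% |
| Diabetes | SVR | sigmoid | auto | -3.07% | -1.94% |
| Diabetes | SVR | sigmoid | 0.1 | -3.07% | -1.94% |
| Diabetes | SVR | sigmoid | 0.01 | -3.19% | -2.09% |
| Diabetes | SVR | sigmoid | 0.001 | -3.20% | -2.10% |
| Diabetes | SVR | poly | scale | 23.87% | 31.86% |
| Diabetes | SVR | poly | auto | -3.20% | -2.10% |
| Diabetes | SVR | poly | 0.1 | -3.20% | -2.10% |
| Diabetes | SVR | poly | 0.01 | -3.20% | -2.10% |
| Diabetes | SVR | poly | 0.001 | -3.20% | -2.10% |
| Diabetes | LinearSVR | - | - | 36.11% | 33.77% |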
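Conclusions:
- SVR is not well suited to the diabetes data: most kernel and gamma combinations score near or below zero.
- SVC overfits the wine and breast cancer data in some configurations (the rbf kernel with gamma set to auto, 0.1, or 0.01).
- SVC performs comparatively well on the iris, wine, and breast cancer data.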
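A sweep over kernel and gamma like the one summarized above can be automated with a grid search. The following sketch runs it for SVC on the iris data. `GridSearchCV` with 5-fold cross-validation is an assumed substitute for the manual loop, so its scores will not match the table, which reports a single train/test split.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=8)

# Same kernels and gamma values as in the summary table.
param_grid = {
    "kernel": ["linear", "rbf", "sigmoid", "poly"],
    "gamma": ["scale", "auto", 0.1, 0.01, 0.001],
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

test_score = search.score(X_test, y_test)  # accuracy of the refitted best model
print(search.best_params_)
print(f"test accuracy: {test_score:.2%}")
```

Selecting the configuration by cross-validation on the training split keeps the test split untouched until the final evaluation.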