Tuning hyperparameters with grid search
Building the model
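'''
'kernel': the kernel function
'C': the regularization parameter of the SVR (a larger C means weaker regularization)
'gamma': the kernel coefficient for the 'rbf', 'poly' and 'sigmoid' kernels; it has no effect on the 'linear' kernel
'''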
parameters = {
    'kernel': ['linear', 'rbf'],
    'C': [0.1, 0.5, 0.9, 1, 5],
    'gamma': [0.001, 0.01, 0.1, 1]
}
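To make the size of this search concrete, the following sketch enumerates the grid by hand using only the standard library. It is an illustration of what a grid search iterates over, not scikit-learn's own implementation; the names `candidates`, `n_folds` and `distinct` are hypothetical, and `n_folds = 3` is assumed to match the `cv=3` setting used in this post.

```python
from itertools import product

parameters = {
    'kernel': ['linear', 'rbf'],
    'C': [0.1, 0.5, 0.9, 1, 5],
    'gamma': [0.001, 0.01, 0.1, 1],
}

# Cartesian product of all value lists: one dict per candidate setting.
names = sorted(parameters)
candidates = [
    dict(zip(names, values))
    for values in product(*(parameters[name] for name in names))
]

n_folds = 3  # assumed to match cv=3
print(len(candidates))            # 40 candidate settings (2 * 5 * 4)
print(len(candidates) * n_folds)  # 120 cross-validation fits

# The linear kernel does not use gamma, so its gamma variants are duplicates.
distinct = {
    (c['kernel'], c['C'], None if c['kernel'] == 'linear' else c['gamma'])
    for c in candidates
}
print(len(distinct))              # 25 genuinely different models
```

Because 15 of the 40 candidates repeat a linear model that was already tried, a grid that lists `gamma` only for the `rbf` kernel would cover the same models with fewer fits.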
Run the grid search with cross-validation
model = GridSearchCV(SVR(), param_grid=parameters, cv=3)
model.fit(x_train, y_train)
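With `cv=3`, every candidate is trained three times, each time on two thirds of the training data and scored on the remaining third. The sketch below mimics that split in plain Python, assuming contiguous, unshuffled folds (which is how scikit-learn's `KFold` behaves by default); `kfold_indices` is a hypothetical helper written for illustration, not a library function.

```python
def kfold_indices(n_samples, n_splits=3):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(kfold_indices(10, n_splits=3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

Each sample is used for validation exactly once, so the averaged score reflects the whole training set without ever scoring a model on data it was fitted on.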
Output:
GridSearchCV(cv=3, error_score='raise',
estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto',
kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False),
fit_params={}, iid=True, n_jobs=1,
param_grid={'kernel': ['linear', 'rbf'], 'C': [0.1, 0.5, 0.9, 1, 5], 'gamma': [0.001, 0.01, 0.1, 1]},
pre_dispatch='2*n_jobs', refit=True, return_train_score=True,
scoring=None, verbose=0)
Retrieving the best parameters
print("Best parameters:", model.best_params_)
print("Best model:", model.best_estimator_)
print("Best R2 score:", model.best_score_)
Output:
Best parameters: {'C': 5, 'gamma': 0.1, 'kernel': 'rbf'}
Best model: SVR(C=5, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma=0.1,
kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)
Best R2 score: 0.797481706635164
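Note that the best score reported by the search is the mean cross-validated score of the winning candidate on the training data, not a score on the test set. The sketch below shows the difference on synthetic data; the data, the reduced grid and the names `search`, `X_tr`, `X_te`, `cv_mean` and `test_r2` are assumptions made for illustration and are not taken from this post's housing example.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR

# Synthetic stand-in data (an assumption; not the housing data).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

search = GridSearchCV(
    SVR(),
    param_grid={'kernel': ['rbf'], 'C': [0.1, 1, 5], 'gamma': [0.01, 0.1, 1]},
    cv=3,
)
search.fit(X_tr, y_tr)

# best_score_ is the mean validation score of the best candidate.
cv_mean = search.cv_results_['mean_test_score'][search.best_index_]
# score() evaluates the refitted best model on data the search never saw.
test_r2 = search.score(X_te, y_te)

print("best params:", search.best_params_)
print("mean CV R2:", cv_mean)
print("held-out R2:", test_r2)
```

The two numbers are usually close but not identical, so it is worth reporting the held-out value when describing how well the model predicts unseen data.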
Visualization
ln_x_test = range(len(x_test))
y_predict = model.predict(x_test)
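Set up the figure
plt.figure(figsize=(16,8), facecolor='w')
Plot the actual values as a solid red line
plt.plot(ln_x_test, y_test, 'r-', lw=2, label='Actual values')
Plot the predictions as a solid green line
plt.plot(ln_x_test, y_predict, 'g-', lw=3, label='SVR prediction, cross-validated $R^2$=%.3f' % model.best_score_)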
Display the figure
plt.legend(loc='upper left')
plt.grid(True)
plt.title("Boston housing price prediction (SVM)")
plt.xlim(0, 101)
plt.show()
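The R² value shown in the legend is the coefficient of determination, 1 minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of that formula in plain Python follows; `r2` is a hypothetical helper and the four sample values are made up for illustration, not taken from the housing data.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
actual = [3.0, -0.5, 2.0, 7.0]
predicted = [2.5, 0.0, 2.0, 8.0]

print(round(r2(actual, predicted), 4))  # 0.9486
print(r2(actual, actual))               # 1.0 for a perfect prediction
```

A model that always predicts the mean of the actual values scores 0, and a model that does worse than that scores below 0, which is why R² is a convenient scale for comparing regressors.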