ℓ2-Constrained Least Squares

In simple least squares, noisy samples may lead to an overfitted learning output, so it is reasonable to constrain the parameter space. This note focuses on the simplest case, ℓ2-constrained least squares:

$$\min_{\theta} J_{LS}(\theta) \quad \text{s.t.} \quad \|\theta\|^{2} \le R$$

To solve this constrained optimization problem, we can pass to the Lagrangian dual problem:

$$\max_{\lambda} \min_{\theta} \left[ J_{LS}(\theta) + \frac{\lambda}{2}\left(\|\theta\|^{2} - R\right) \right] \quad \text{s.t.} \quad \lambda \ge 0$$

A brief review of the Lagrangian dual problem can be found at the end of this note.
In practice, however, it is not necessary to specify $R$ explicitly: we can instead treat $\lambda$ as a regularization parameter and solve

$$\hat{\theta} = \arg\min_{\theta} \left[ J_{LS}(\theta) + \frac{\lambda}{2}\|\theta\|^{2} \right]$$

where the first term $J_{LS}(\theta)$ measures the fit to the data, and the second term $\frac{\lambda}{2}\|\theta\|^{2}$ penalizes large parameters to prevent overfitting to some degree.
Taking the partial derivative of $J_{LS}(\theta) + \frac{\lambda}{2}\|\theta\|^{2}$ with respect to $\theta$ and setting it to zero, i.e. $\Phi^{T}(\Phi\theta - \mathbf{y}) + \lambda\theta = 0$, we get the solution

$$\hat{\theta} = \left(\Phi^{T}\Phi + \lambda I\right)^{-1}\Phi^{T}\mathbf{y}$$
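The closed-form solution above can be checked numerically. The following is a minimal NumPy sketch (not part of the original MATLAB demo; the random design matrix and λ value are illustrative assumptions) that computes $\hat{\theta}$ and verifies that the gradient of the regularized objective vanishes there:

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.normal(size=(20, 5))   # design matrix: 20 samples, 5 basis functions
y = rng.normal(size=20)          # noisy targets
lam = 0.1                        # regularization parameter lambda

# closed-form solution: (Phi^T Phi + lambda I)^{-1} Phi^T y
theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(5), Phi.T @ y)

# optimality check: gradient of J_LS(theta) + (lam/2) ||theta||^2 is zero
grad = Phi.T @ (Phi @ theta - y) + lam * theta
print(np.allclose(grad, 0))      # True
```

Using `np.linalg.solve` rather than explicitly forming the inverse is both faster and numerically more stable.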

A more general formulation uses a regularization matrix $G$:

$$\min_{\theta} J_{LS}(\theta) \quad \text{s.t.} \quad \theta^{T} G \theta \le R$$

whose solution is

$$\hat{\theta} = \left(\Phi^{T}\Phi + \lambda G\right)^{-1}\Phi^{T}\mathbf{y}$$
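The generalized formula differs from the plain ℓ2 case only in replacing $I$ by $G$. A short sketch, again with an illustrative random design and an assumed diagonal $G$ that weights each parameter differently ($G = I$ recovers ordinary ℓ2 regularization):

```python
import numpy as np

rng = np.random.default_rng(1)
Phi = rng.normal(size=(30, 4))   # design matrix
y = rng.normal(size=30)
lam = 0.5

# diagonal G: each parameter gets its own penalty weight
G = np.diag([1.0, 2.0, 4.0, 8.0])
theta = np.linalg.solve(Phi.T @ Phi + lam * G, Phi.T @ y)

# optimality check for J_LS(theta) + (lam/2) theta^T G theta
grad = Phi.T @ (Phi @ theta - y) + lam * G @ theta
print(np.allclose(grad, 0))      # True
```

A non-identity $G$ is useful when some basis functions should be shrunk more aggressively than others.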

Example: The Gaussian Kernel Model

$$f_{\theta}(x) = \sum_{j=1}^{n} \theta_{j} K(x, x_{j}), \qquad K(x, c) = \exp\left(-\frac{\|x - c\|^{2}}{2h^{2}}\right)$$
The MATLAB code goes as follows (note that the test grid must use N points, and that with the kernel design matrix Φ = k, the regularized solution (Φ'Φ + λI)⁻¹Φ'y becomes (k² + λI)⁻¹ky since k is symmetric):

n=50; N=1000;                                          % n training samples, N test points
x=linspace(-3,3,n)'; X=linspace(-3,3,N)';
pix=pi*x; y=sin(pix)./(pix)+0.1*x+0.2*randn(n,1);      % noisy sinc-like targets

x2=x.^2; X2=X.^2; hh=2*0.3^2; l=0.1;                   % bandwidth h=0.3, lambda=0.1
k=exp(-(repmat(x2,1,n)+repmat(x2',n,1)-2*x*x')/hh);    % n-by-n training kernel matrix
K=exp(-(repmat(X2,1,n)+repmat(x2',N,1)-2*X*x')/hh);    % N-by-n test kernel matrix
t1=k\y; F1=K*t1;                                       % ordinary least squares
t2=(k^2+l*eye(n))\(k*y); F2=K*t2;                      % l2-regularized least squares

figure(1); clf; hold on; axis([-2.8,2.8,-1,1.5]);
plot(X,F1,'g-'); plot(X,F2,'r--'); plot(x,y,'bo');
legend('LS','L2-Constrained LS');

Appendix: The Lagrangian Dual Problem
Given differentiable convex functions $f: \mathbb{R}^{d} \to \mathbb{R}$ and $g: \mathbb{R}^{d} \to \mathbb{R}^{p}$, an optimization problem can be formulated as

$$\min_{t} f(t) \quad \text{s.t.} \quad g(t) \le 0$$

Let $\lambda = (\lambda_{1}, \ldots, \lambda_{p})^{T}$ be the Lagrange multipliers, and let

$$L(t, \lambda) = f(t) + \lambda^{T} g(t)$$

be the Lagrangian function. Then the problem above can be reformulated as

$$\max_{\lambda} \inf_{t} L(t, \lambda) \quad \text{s.t.} \quad \lambda \ge 0$$

This is the Lagrangian dual problem. For convex problems satisfying a constraint qualification (e.g., Slater's condition), solving the dual recovers the same optimal $t$ as the primal problem.
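A tiny worked example makes the dual concrete. Take the (illustrative, not from the original note) problem of minimizing $f(t) = t^{2}$ subject to $g(t) = 1 - t \le 0$, whose primal solution is clearly $t = 1$. The inner minimization of $L(t, \lambda) = t^{2} + \lambda(1 - t)$ is analytic ($t = \lambda/2$), and the outer maximization over $\lambda \ge 0$ can be done on a grid:

```python
import numpy as np

# toy problem: minimize f(t) = t^2  s.t.  g(t) = 1 - t <= 0  (i.e. t >= 1)
# Lagrangian: L(t, lam) = t^2 + lam * (1 - t)

def dual(lam):
    # inner minimization over t: dL/dt = 2t - lam = 0  ->  t = lam / 2
    t = lam / 2.0
    return t**2 + lam * (1.0 - t)    # dual function: lam - lam^2 / 4

# outer maximization over lam >= 0 on a grid
lams = np.linspace(0.0, 4.0, 4001)
lam_star = lams[np.argmax(dual(lams))]
t_star = lam_star / 2.0
print(lam_star, t_star)              # -> 2.0 1.0
```

The dual optimum $\lambda^{*} = 2$ yields $t^{*} = 1$, matching the primal solution, as the appendix claims for convex problems.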