$\ell_{1}$-Constrained Least Squares Learning - Alibaba Cloud Developer Community

2017-03-31



$\ell_{1}$ Constrained Least Squares
In sparse learning, $\ell_{1}$-constrained least squares (LS), also known as Lasso regression, is a common learning method:

$\min_{\theta} J_{LS}(\theta)\quad \text{s.t.} \quad \|\theta \|_{1}\leq R$

where

$\|\theta\|_{1}=\sum_{j=1}^{b}|\theta_{j}|$
Generally speaking, the solution of an $\ell_{1}$-constrained LS problem tends to lie on a coordinate axis, that is, several of the parameters $\theta_{j}$ are exactly zero (the solution is sparse).
How can it be solved? Because the absolute value is not differentiable at the origin, solving an $\ell_{1}$-constrained LS problem is not as easy as solving the $\ell_{2}$-constrained one. However, we can still apply a Lagrange multiplier and minimize the penalized objective:

$\min_{\theta}J(\theta),\quad J(\theta)=J_{LS}(\theta)+\lambda\|\theta\|_{1}$
Note that, for any $c_{j}>0$,

$|\theta_{j}|\leq\frac{\theta_{j}^{2}}{2c_{j}}+\frac{c_{j}}{2}$

which follows from $(|\theta_{j}|-c_{j})^{2}\geq 0$, with equality at $c_{j}=|\theta_{j}|$. We can therefore minimize an upper bound of $J(\theta)$ instead. Proceeding iteratively, we use the current solution $\tilde{\theta}_{j}\neq 0$ as $c_{j}$ (that is, $c_{j}=|\tilde{\theta}_{j}|$), which gives the upper bound:

$|\theta_{j}|\leq\frac{\theta_{j}^{2}}{2|\tilde{\theta}_{j}|}+\frac{|\tilde{\theta}_{j}|}{2}$

If $\tilde{\theta}_{j}=0$, we set $\theta_{j}=0$. Using the generalized inverse $\dagger$ (for a scalar, $a^{\dagger}=1/a$ if $a\neq 0$ and $0^{\dagger}=0$), both cases can be written as:

$|\theta_{j}|\leq\frac{|\tilde{\theta}_{j}|^{\dagger}}{2}\theta_{j}^{2}+\frac{|\tilde{\theta}_{j}|}{2}$
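The quadratic bound on $|\theta_{j}|$ can be checked numerically. The following is a minimal sketch; the function name `l1_upper_bound` and the sample values are illustrative assumptions, not part of the original article.

```python
def l1_upper_bound(theta_j: float, c_j: float) -> float:
    """Quadratic upper bound theta_j^2 / (2 c_j) + c_j / 2 on |theta_j|, for c_j > 0."""
    if c_j <= 0:
        raise ValueError("c_j must be positive")
    return theta_j ** 2 / (2.0 * c_j) + c_j / 2.0


# Hypothetical sample values, chosen only for illustration.
thetas = [-2.0, -0.5, 0.0, 0.3, 1.7]
cs = [0.1, 0.5, 1.0, 3.0]

# Gap between the bound and |theta_j|; it equals (|theta_j| - c_j)^2 / (2 c_j).
gaps = [l1_upper_bound(t, c) - abs(t) for t in thetas for c in cs]
min_gap = min(gaps)  # non-negative, up to floating-point rounding

# Choosing c_j = |theta_j| closes the gap (again up to rounding).
tight_gap = l1_upper_bound(1.7, abs(1.7)) - abs(1.7)
```

The gap vanishes only at $c_{j}=|\theta_{j}|$, which is why the current iterate is the natural choice for $c_{j}$.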
Therefore, we obtain the following $\ell_{2}$-regularized LS problem:

$\hat{\theta}=\arg\min_{\theta}\tilde{J}(\theta),\quad \tilde{J}(\theta)=J_{LS}(\theta)+\frac{\lambda}{2}\theta^{T}\tilde{\Theta}^{\dagger}\theta+C$

where

$\tilde{\Theta}= \begin{pmatrix} |\tilde{\theta}_{1}| & & \\ & \ddots & \\ & & |\tilde{\theta}_{b}| \end{pmatrix}$

and $C=\frac{\lambda}{2}\sum_{j=1}^{b} |\tilde{\theta}_{j}|$ are independent of $\theta$.
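
As a short check of where the quadratic penalty comes from, the per-coordinate bound can be multiplied by $\lambda$ and summed over $j$, using only the symbols defined above:

$\lambda\|\theta\|_{1}\leq\lambda\sum_{j=1}^{b}\left(\frac{|\tilde{\theta}_{j}|^{\dagger}}{2}\theta_{j}^{2}+\frac{|\tilde{\theta}_{j}|}{2}\right)=\frac{\lambda}{2}\theta^{T}\tilde{\Theta}^{\dagger}\theta+\frac{\lambda}{2}\sum_{j=1}^{b}|\tilde{\theta}_{j}|$

The last term does not depend on $\theta$, so it does not change the minimizer. Adding $J_{LS}(\theta)$ to both sides gives an upper bound on $J(\theta)$ that touches $J$ at $\theta=\tilde{\theta}$. Minimizing this bound therefore cannot increase $J$ relative to its value at $\tilde{\theta}$.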

Take the linear-in-parameter model as an example:

$f_{\theta}(x)=\theta^{T}\phi(x)$

Then, with $\Phi$ denoting the design matrix whose $i$-th row is $\phi(x_{i})^{T}$, setting the gradient of $\tilde{J}$ with respect to $\theta$ to zero gives

$\hat{\theta}=\left(\Phi^{T}\Phi+\lambda\tilde{\Theta}^{\dagger}\right)^{-1}\Phi^{T} y$

Update the estimate $\tilde{\theta}$ as $\tilde{\theta}=\hat{\theta}$ and recompute $\hat{\theta}$, repeating until $\hat{\theta}$ reaches the required precision.
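
The closed form can be checked by differentiating $\tilde{J}$. This sketch assumes the squared loss $J_{LS}(\theta)=\frac{1}{2}\|\Phi\theta-y\|^{2}$; the factor $\frac{1}{2}$ is an assumption chosen to match the $\frac{\lambda}{2}$ in $\tilde{J}$:

$\nabla_{\theta}\tilde{J}(\theta)=\Phi^{T}(\Phi\theta-y)+\lambda\tilde{\Theta}^{\dagger}\theta=0\quad\Longleftrightarrow\quad\left(\Phi^{T}\Phi+\lambda\tilde{\Theta}^{\dagger}\right)\theta=\Phi^{T}y$

Note that the right-hand side carries $\Phi^{T}$, as the dimensions require. When $\Phi$ is a symmetric kernel matrix $K$, this reduces to $\left(K^{2}+\lambda\tilde{\Theta}^{\dagger}\right)\theta=Ky$.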

In summary, the whole algorithm goes as follows:

1. Initialize $\theta_{0}$ and set $i=1$.
2. Calculate $\Theta_{i}$ using $\theta_{i-1}$.
3. Estimate $\theta_{i}$ using $\Theta_{i}$.
4. If $\theta_{i}$ has not yet converged, set $i=i+1$ and go back to step 2.
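
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation. The names `solve` and `l1_ls_irls` and the parameters `tol`, `max_iter`, and `eps` are hypothetical. The loss is assumed to be $\frac{1}{2}\|\Phi\theta-y\|^{2}$. Coordinates whose current value is numerically zero are held at zero, which is one way to read the rule "if $\tilde{\theta}_{j}=0$, take $\theta_{j}=0$".

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def l1_ls_irls(Phi, y, lam, theta0, tol=1e-8, max_iter=1000, eps=1e-12):
    """Iteratively reweighted l2-regularized LS for the l1-penalized objective."""
    n, b = len(Phi), len(Phi[0])
    # Phi^T Phi and Phi^T y do not change between iterations.
    gram = [[sum(Phi[i][p] * Phi[i][q] for i in range(n)) for q in range(b)]
            for p in range(b)]
    rhs = [sum(Phi[i][p] * y[i] for i in range(n)) for p in range(b)]
    theta_prev = list(theta0)
    for _ in range(max_iter):
        # Step 2: weights 1/|theta_j| on the nonzero coordinates only.
        active = [j for j in range(b) if abs(theta_prev[j]) > eps]
        theta = [0.0] * b  # coordinates outside `active` stay at zero
        if active:
            # Step 3: solve (Phi^T Phi + lam * Theta^dagger) theta = Phi^T y
            # restricted to the active coordinates.
            A = [[gram[p][q] + (lam / abs(theta_prev[p]) if p == q else 0.0)
                  for q in active] for p in active]
            for p, value in zip(active, solve(A, [rhs[p] for p in active])):
                theta[p] = value
        step = math.sqrt(sum((u - v) ** 2 for u, v in zip(theta, theta_prev)))
        theta_prev = theta
        if step < tol:  # Step 4: stop once the update is small enough
            break
    return theta_prev


# Toy problem with an identity design matrix (hypothetical data).
Phi = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
theta_hat = l1_ls_irls(Phi, [3.0, 0.5, -2.0], lam=1.0, theta0=[1.0, 1.0, 1.0])
```

On this toy problem each coordinate follows the scalar update $\theta_{j}\leftarrow y_{j}|\tilde{\theta}_{j}|/(|\tilde{\theta}_{j}|+\lambda)$, so the iterates approach $(2, 0, -1)$: the small middle coefficient is driven to zero while the other two shrink by $\lambda$.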

An Example:

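% Training inputs x (n points) and test inputs X (N points) on [-3, 3]
n=50; N=1000;
x=linspace(-3,3,n)'; X=linspace(-3,3,N)';
pix=pi*x;
% Noisy targets: sinc curve plus a linear trend
y=sin(pix)./(pix)+0.1*x+0.2*rand(n,1);

% Gaussian kernel width 0.3 (hh=2*0.3^2), regularization l, random start t0
hh=2*0.3^2; l=0.1; t0=randn(n,1); x2=x.^2;
% Kernel (design) matrix: k(i,j)=exp(-(x_i-x_j)^2/hh)
k=exp(-(repmat(x2,1,n)+repmat(x2',n,1)-2*(x*x'))/hh);
% k is symmetric, so k'*k=k^2 and k'*y=k*y
k2=k^2; ky=k*y;
for o=1:1000
    % Reweighted update: solve (k^2 + l*pinv(diag(|t0|))) t = k*y
    t=(k2+l*pinv(diag(abs(t0))))\ky;
    % Stop once the parameters change by less than 0.001
    if norm(t-t0)<0.001, break, end
    t0=t;
end
% Kernel matrix between test and training inputs, then predictions
K=exp(-(repmat(X.^2,1,n)+repmat(x2',N,1)-2*X*x')/hh);
F=K*t;

% Plot the learned function (green line) and the training samples (blue pentagrams)
figure(1); clf; hold on; axis([-2.8,2.8,-1,1.5]);
plot(X,F,'g-'); plot(x,y,'bp');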

