Lagrange Multipliers are used to solve the optimal value of multivariate functions under a group of constraints. By lagrange multipliers, we can convert an optimal problem with d variables and k constraints to one with d+k variables without constraint.
- Equality Constraint
Suppose x∈Rd, we would like to solve some optimal value x∗ s. t. minxf(x) and g(x)=0, i.e.
minxf(x),s.t.g(x)=0
For simplicity, we omit the geometric explanation of the optimal problem. Define the Lagrange multiplier λ≠0 s.t.
∇f(x)+λ∇g(x)=0
Define the corresponding Lagrange funcntion as
L(x,λ)=f(x)+λg(x)
So
∇xL(x,λ)=∇f(x)+λ∇g(x)=0
∇λL(x,λ)=g(x)=0
Obviously, the original optimal problem is converted to the new optimal problem with no constraint. - Inequality Constraint
In the inequality constraint case, the optimal problem can be defined as
minxf(x),s.t.g(x)≤0
When g(x)<0, the constraint g(x)≤0 makes no sense which means that we can take ∇f(x)=0 to solve the optimal problem. In addition, when g(x)=0, the inequality constraint degrades to the equality constraint.
In summary, the KKT (Karush-Kuhn-Tucker) conditions are always satisfied:
⎧⎩⎨⎪⎪g(x)≤0;λ≥0;λg(x)=0
With efficiency concerned, we show the solution of the optimal problem together with next problem. Go on. - Multi-constraints
Consider an optimal problem with m equality constraints and n inequality constraints. In addition there is a feasible region D⊂Rd:
minxs.t.f(x)hi(x)=0,i=1,2,…,m,gj(x)≤0,j=1,2,…,n
Define the lagrange function as
L(x,λ,μ)=f(x)+∑i=1mλihi(x)+∑j=1nμjgj(x)
Since there are inequality constraints, the KKT condition is followed:
⎧⎩⎨⎪⎪⎪⎪gj(x)≤0;μj≥0;μjgj(x)=0.
Solving the original optimal problem, also known as primal problem, can be achieved by solving the corresponding dual problem. Then the Lagrange dual function Γ:Rm×Rn→R is defined as
Γ(λ,μ)=infx∈DL(x,λ,μ)=infx∈D⎛⎝f(x)+∑i=1mλihi(x)+∑j=1nμjgj(x)⎞⎠
Evidently, ∑mi=1λihi(x)+∑nj=1μjgj(x)≤0. Let x~∈D, then
Γ(λ,μ)=infx∈DL(x,λ,μ)≤L(x~,λ,μ)≤f(x~)That is to say, the dual function shows the lower bound of the value of the primal problem. Then the dual problem is given by
maxλ,μΓ(λ,μ)s.t.μ≥0
The dual problem is always a convex optimal problem, regardless of the convexity of the primal problem.
Let p∗ be the optimal value of the primal problem, d∗ be the optimal value of the dual problem. It has been shown that d∗≤p∗ which is also known as weak duality. If d∗=p∗, then strong duality holds. Generally, the strong duality does not always hold. However, when the primal problem is a convex optimal problem, i.e. f(x),gj(x) are convex functions, hi(x) are affine functions and there exists a least one x~ making the inequality constraints strictly hold, the strong duality holds. In this case, take the derivative of the Lagrange function over x, λ and μ and set it to be zero. Then we can solve the dual problem as well as the primal problem.