Purpose: Algorithms such as MLE optimize unconstrained objectives, e.g. $\max_\omega \frac{1}{m}\sum_i l(\omega; x_i, j_i)$. When a problem comes with constraints, however, we need tools such as duality, LP, and QP; these tools can also be applied to the unconstrained case.

Duality Problem

Problem Type

$$\min_\omega J(\omega) \quad \text{s.t. } c_i(\omega)\leq 0 \text{ for } i\in j, \quad e_i(\omega)=0 \text{ for } i\in \epsilon \tag{0}$$
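For concreteness, here is a minimal one-variable instance of this template (my own illustration, not from the notes):

```latex
\min_{\omega}\ \omega^2
\qquad \text{s.t.}\qquad c_1(\omega) = 1-\omega \le 0
```

with no equality constraints; the solution is $\omega^*=1$ with $J^*=1$.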

Lagrangian

$$L(\omega, \alpha, \beta) = J(\omega)+\sum_{i\in j}\alpha_i c_i(\omega)+\sum_{i\in \epsilon}\beta_i e_i(\omega) \tag{1}$$

Dual feasibility means $\alpha_i\geq 0$ and $\beta_i\in \mathbb{R}$.

Theorem 1

From equation (1), it follows that

$$\max_{\alpha\geq0,\,\beta}L(\omega,\alpha,\beta)=\begin{cases} J(\omega) & \text{if } \omega \text{ is feasible}\\ \infty & \text{otherwise}\end{cases} \tag{2}$$

In particular, if $J^*$ denotes the optimal value of problem (0), i.e. $J^*=\min_\omega J(\omega)$ subject to $c_i(\omega)\leq 0$ and $e_i(\omega)=0$, then

$$J^*=\min_\omega \max_{\alpha\geq0,\,\beta} L(\omega,\alpha,\beta)$$
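To see how equation (2) behaves, consider the toy instance $J(\omega)=\omega^2$ with the single constraint $1-\omega\le 0$ (my own example, not from the notes):

```latex
\max_{\alpha\ge 0}\bigl[\omega^2 + \alpha(1-\omega)\bigr]
= \begin{cases}
\omega^2 & \text{if } 1-\omega \le 0 \text{ (the max is at } \alpha=0\text{)}\\
\infty & \text{if } 1-\omega > 0 \text{ (let } \alpha\to\infty\text{)}
\end{cases}
```

So the inner maximization recovers $J(\omega)$ exactly on the feasible set and penalizes infeasible points with $\infty$.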

Refer to the attached PDF for proof.

Theorem 2 | Weak Duality

Define the Lagrange dual function as follows

$$D(\alpha, \beta) = \min_\omega L(\omega, \alpha, \beta)$$

Then, the following holds

$$D(\alpha, \beta)\leq J(\omega)$$

for all feasible $\omega$, all $\alpha \geq 0$, and all $\beta$. In particular,

$$D^*:=\max_{\alpha\geq0,\,\beta} \min_\omega L(\omega,\alpha,\beta)\leq\min_\omega \max_{\alpha\geq0,\,\beta} L(\omega,\alpha,\beta)=J^*$$
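A quick numeric sanity check of weak duality on a toy problem (my own example, not from the notes): for $\min_\omega \omega^2$ s.t. $1-\omega\le 0$, the dual function can be computed in closed form, and every value of it lower-bounds $J^*$.

```python
import numpy as np

# Toy problem: min ω² s.t. 1 - ω ≤ 0, so J* = 1 at ω* = 1.
# Lagrangian: L(ω, α) = ω² + α(1 - ω); minimizing over ω gives ω = α/2,
# hence the dual function is D(α) = α - α²/4.
def dual(alpha):
    return alpha - alpha**2 / 4

J_star = 1.0
alphas = np.linspace(0.0, 10.0, 1001)

# Weak duality: D(α) ≤ J* for every α ≥ 0 (small tolerance for float rounding).
assert np.all(dual(alphas) <= J_star + 1e-9)

# The best lower bound D* is attained at α = 2 and equals J* = 1 here.
print(np.max(dual(alphas)))
```

Here the gap even closes ($D^* = J^*$), which previews strong duality: the problem is convex and strictly feasible.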

Note: weak duality does not require $J(\omega)$ to be convex.

Theorem 3 | Strong Duality

If $J(\omega)$, $c_i$, and $e_i$ are all convex, and the following constraint qualification (Slater's condition) holds:

There exists an $\omega$ such that $c_i(\omega)<0$ for all $i\in j$,

then strong duality holds:

$$D^*:=\max_{\alpha\geq0,\,\beta} \min_\omega L(\omega,\alpha,\beta)=\min_\omega \max_{\alpha\geq0,\,\beta} L(\omega,\alpha,\beta)=J^*$$

Here $D^* = \max_{\alpha\geq0,\,\beta} D(\alpha, \beta)$ is the optimal value of the dual problem, and $J^* = \min_{\omega}J(\omega)$ is the optimal value of the primal problem.

KKT Condition


If the primal problem is convex, then the KKT conditions are both necessary and sufficient. That is, if ω^\hat\omega and (α^,β^)(\hat\alpha, \hat\beta) satisfy the KKT conditions, then ω^\hat\omega and (α^,β^)(\hat\alpha, \hat\beta) are primal and dual optimal, i.e.,

$$\begin{aligned}
&\nabla_\omega L(\hat\omega,\hat\alpha,\hat\beta)=0 && \text{(stationarity)}\\
&c_i(\hat\omega)\leq 0,\quad e_i(\hat\omega)=0 && \text{(primal feasibility)}\\
&\hat\alpha_i\geq 0 && \text{(dual feasibility)}\\
&\hat\alpha_i\, c_i(\hat\omega)=0 && \text{(complementary slackness)}
\end{aligned}$$
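A minimal check of the four KKT conditions on a toy problem (my own example, not from the notes): for $\min_\omega \omega^2$ s.t. $c(\omega)=1-\omega\le 0$, the candidate pair $\hat\omega=1$, $\hat\alpha=2$ satisfies all four, so it is primal/dual optimal.

```python
# KKT check for the toy problem min ω² s.t. c(ω) = 1 - ω ≤ 0.
# Candidate primal/dual pair: ω̂ = 1, α̂ = 2.
omega_hat, alpha_hat = 1.0, 2.0

stationarity = 2 * omega_hat - alpha_hat               # ∇_ω L(ω, α) = 2ω - α
primal_feasibility = 1 - omega_hat                     # c(ω̂) must be ≤ 0
dual_feasibility = alpha_hat                           # α̂ must be ≥ 0
complementary_slackness = alpha_hat * (1 - omega_hat)  # α̂ · c(ω̂) must be 0

assert stationarity == 0
assert primal_feasibility <= 0
assert dual_feasibility >= 0
assert complementary_slackness == 0
print("KKT holds: candidate pair is primal and dual optimal")
```

Since the problem is convex, satisfying the KKT conditions is sufficient: no further search is needed.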

Linear Problem

Problem satisfies

$$\min_\omega\ c^T\omega \;\Rightarrow\; J(\omega)\\ \text{s.t. } A\omega=b \;\Rightarrow\; e_i(\omega)\\ \omega \geq 0 \;\Rightarrow\; c_i(\omega)$$

The dual problem $\max_{\alpha\geq0,\,\beta} D(\alpha, \beta)$ of an LP is still an LP.
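A small numeric illustration of this fact, assuming SciPy is available (the example LP is my own, not from the notes): the dual of $\min c^T\omega$ s.t. $A\omega=b$, $\omega\geq0$ is $\max b^T\beta$ s.t. $A^T\beta\leq c$, which is again an LP, and both can be handed to the same solver.

```python
import numpy as np
from scipy.optimize import linprog

# Toy standard-form LP: min cᵀω  s.t. Aω = b, ω ≥ 0.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# Primal LP: optimum is ω* = (1, 0) with value 1.
primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 2)

# Dual LP: max bᵀβ s.t. Aᵀβ ≤ c with β free,
# written as minimizing -bᵀβ for linprog.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(None, None)])

print(primal.fun, -dual.fun)  # both optimal values equal 1.0
```

The matching values illustrate that strong duality always holds for feasible, bounded LPs.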

Quadratic Problem

Problem satisfies

$$\min_\omega\ \omega^TG\omega+\omega^Td \quad \text{s.t. } A\omega=b,\ \omega \geq0$$

Refer to the attached PDF for specific details on how to apply duality.
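As a sketch of how the KKT conditions solve a QP directly (my own example; it assumes the bound $\omega\geq0$ is inactive at the solution, so only the equality multipliers $\beta$ appear): stationarity $2G\omega + d + A^T\beta = 0$ together with feasibility $A\omega = b$ is one linear system.

```python
import numpy as np

# Equality-constrained QP: min ωᵀGω + ωᵀd  s.t. Aω = b.
# (The ω ≥ 0 bound is assumed inactive at the solution of this instance.)
G = np.eye(2)
d = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

# KKT system: [2G  Aᵀ] [ω]   [-d]
#             [A   0 ] [β] = [ b]
m, n = A.shape
K = np.block([[2 * G, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-d, b])
sol = np.linalg.solve(K, rhs)
omega, beta = sol[:n], sol[n:]

print(omega, beta)  # ω = [1, 1], β = [-2]; note ω ≥ 0 indeed holds
```

When inequality constraints are active, one would additionally enforce complementary slackness (e.g. via an active-set method), which is beyond this sketch.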

Summary


Example Problem

General approach:

  1. Write the Lagrangian $L$;

  2. $D=\min_\omega L$: take partial derivatives with respect to $\omega$ (sometimes there are two variables), express $\omega$ in terms of $\alpha$ and $\beta$, and substitute back to obtain $D$;

  3. $\max D$ gives $D^*,\alpha^*,\beta^*$: take derivatives of $D$ or use the KKT conditions to find $D^*, \alpha^*, \beta^*$;

  4. $J^*$, $\omega^*$: substitute $\alpha^*,\beta^*$ into the expression from step 2 to find $\omega^*$; by strong duality, $J^*=D^*$.
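A quick end-to-end illustration of these four steps (my own example, not from the notes): minimize $\omega_1^2+\omega_2^2$ subject to $\omega_1+\omega_2=1$.

```latex
\begin{aligned}
&\text{1. } L(\omega,\beta) = \omega_1^2+\omega_2^2+\beta(\omega_1+\omega_2-1).\\
&\text{2. } \partial L/\partial\omega_i = 2\omega_i+\beta = 0
  \;\Rightarrow\; \omega_i = -\beta/2,
  \qquad D(\beta) = -\tfrac{\beta^2}{2}-\beta.\\
&\text{3. } D'(\beta) = -\beta-1 = 0
  \;\Rightarrow\; \beta^* = -1,\quad D^* = \tfrac{1}{2}.\\
&\text{4. } \omega^* = \bigl(\tfrac12,\tfrac12\bigr),\qquad J^* = \tfrac12 = D^*.
\end{aligned}
```

The problem is convex with affine constraints, so strong duality applies and step 4 is justified.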


Note: The content in this blog is class notes shared for educational purposes only. Some images and content are sourced from textbooks, teacher materials, and the internet. If there is any infringement, please contact aursus.blog@gmail.com for removal.