Fenchel-Rockafellar and Linear Programming: Difference between revisions

From Optimal Transport Wiki
Jump to navigation Jump to search
(Created page, proof sketch section incomplete)
 
(Finished/corrected proof sketch)
Line 10: Line 10:
Details may be found in Brezis' text<ref name="Brezis" />, but a sketch of the proof follows.
Details may be found in Brezis' text<ref name="Brezis" />, but a sketch of the proof follows.


By Young's Inequality, for any <math>x \in X</math> and <math>u \in X^*</math>, we have <math>\phi(x) + \phi^*(-u) + \psi(x) + \psi^*(u) \leq \langle x, -u \rangle + \langle x, u \rangle = 0</math>, so the infimum on the left is less than or equal to the supremum on the right.
By Young's Inequality, for any <math>x \in X</math> and <math>u \in X^*</math>, we have <math>\phi(x) + \phi^*(-u) + \psi(x) + \psi^*(u) \geq \langle x, -u \rangle + \langle x, u \rangle = 0</math>, so the infimum on the left is greater than or equal to the supremum on the right.
If the supremum is <math>-\infty</math>, equality must be obtained and every <math>u \in X^*</math> must realize it; otherwise assume <math>-\phi(-u) - \psi(u) > -\infty</math> everywhere.
If the infimum is <math>-\infty</math>, equality must be obtained and every <math>u \in X^*</math> must realize the supremum; otherwise assume <math>\phi(x) + \psi(x) > -\infty</math> everywhere, and put <math>m</math> for the value of the infimum.
(INCOMPLETE)
 
Let <math display="block">A := \left\{(x, t) : \phi(x) \leq t\right\},</math> and observe that since <math>\phi</math> is continuous at <math>x_0</math>, the interior of <math>C</math> is nonempty.
Likewise, let <math display="block">B := \left\{(x, t) : t \leq m - \psi(x)\right\}.</math>
 
Both sets are convex, and <math>B \cap \text{int}A = \emptyset</math> by construction, so a [https://en.wikipedia.org/wiki/Hahn%E2%80%93Banach_theorem#Separation_of_sets geometric Hahn-Banach Theorem] gives that there is some <math>(f, k) \in X^* \times \mathbb{R}</math> nonzero and <math>\alpha \in \mathbb{R}</math> such that <math display="block">\begin{cases} \langle x, f \rangle + kt \geq \alpha, & (x, t) \in A, \\ \langle x, f \rangle + kt \leq \alpha, & (x, t) \in B.\end{cases}</math>
Since <math>\phi</math> is finite at <math>x_0</math>, letting <math>t \to +\infty</math> shows that <math>k</math> is nonnegative, and if <math>k = 0</math>, continuity and joint finiteness imply <math>\lVert f \rVert = 0</math>, a contradiction, so <math>k > 0</math>.
 
Thus, we can use the inequalities above with the definitions of <math>\phi^*</math> and <math>\psi^*</math> to conclude that <math display="block">\begin{cases}\phi^*\left(-\frac f k\right) &\leq -\frac \alpha k, \\ \psi^*\left(\frac f k\right) &\leq \frac \alpha k - m.\end{cases}</math>
This shows that <math display="block">-\phi^*\left(-\frac f k\right) - \psi^*\left(\frac f k\right) \geq m.</math>
Since the supremum includes this term, it must also be greater than or equal to the infimum, yielding equality.




Line 21: Line 30:
Then if we put <math display="block">\phi(x) + \psi(x) := \left(\langle c, x \rangle + \chi_X(x)\right) + \theta_Y(b - Ax),</math> (noting that <math>\langle c, x \rangle + \chi_X(x)</math> and <math>\theta_Y(b-Ax)</math> are both convex lower semicontinuous proper functions), we are interested in computing <math>\phi^*(u) = \sup\left\{\langle x, u - c \rangle : x \in X \right\} = \theta_X(u-c)</math> and <math>\psi^*(u) = \sup\left\{\langle x, u \rangle - \sup\left\{ \langle y, b-Ax \rangle : y \in Y \right\} : z \in \mathbb{R}^n \right\}</math>.
Then if we put <math display="block">\phi(x) + \psi(x) := \left(\langle c, x \rangle + \chi_X(x)\right) + \theta_Y(b - Ax),</math> (noting that <math>\langle c, x \rangle + \chi_X(x)</math> and <math>\theta_Y(b-Ax)</math> are both convex lower semicontinuous proper functions), we are interested in computing <math>\phi^*(u) = \sup\left\{\langle x, u - c \rangle : x \in X \right\} = \theta_X(u-c)</math> and <math>\psi^*(u) = \sup\left\{\langle x, u \rangle - \sup\left\{ \langle y, b-Ax \rangle : y \in Y \right\} : z \in \mathbb{R}^n \right\}</math>.
By writing the latter as <math>\psi^*(u) = \sup\left\{\inf\left\{\langle x, u + A^*y \rangle - \langle y, b \rangle : y \in Y\right \} : z \in \mathbb{R}^n \right\}</math>, we can see that we have a finite value of <math>\psi^*(u)</math> if and only if <math>u = -A^*y_0</math> for some <math>y_0 \in Y</math>, and the value is then the infimum of <math>-\langle y, b \rangle</math> where <math>A^*y = A^*y_0</math>.
By writing the latter as <math>\psi^*(u) = \sup\left\{\inf\left\{\langle x, u + A^*y \rangle - \langle y, b \rangle : y \in Y\right \} : z \in \mathbb{R}^n \right\}</math>, we can see that we have a finite value of <math>\psi^*(u)</math> if and only if <math>u = -A^*y_0</math> for some <math>y_0 \in Y</math>, and the value is then the infimum of <math>-\langle y, b \rangle</math> where <math>A^*y = A^*y_0</math>.
Putting this together, the Fenchel-Rockafellar Theorem gives that <math display="block">\inf\left\{\langle c, x \rangle + \theta_Y(b-Ax) : x \in X\right\} = \max\left\{\langle y, b \rangle - \theta_X(A^*y - c) : y \in Y\right\}, </math> provided we can find a point of continuity and mutual finiteness for <math>\phi</math> and <math>\psi</math>.
Putting this together, the Fenchel-Rockafellar Theorem gives that <math display="block">\inf\left\{\langle c, x \rangle + \theta_Y(b-Ax) : x \in X\right\} = \max\left\{\langle y, b \rangle - \theta_X(A^*y - c) : y \in Y\right\}, </math> provided we can find a point of continuity for <math>\phi</math> and mutual finiteness for <math>\phi</math> and <math>\psi</math>.


To do this, we use our knowledge of <math>X</math> and <math>Y</math> to be more precise about the <math>\theta</math> functions.
To do this, we use our knowledge of <math>X</math> and <math>Y</math> to be more precise about the <math>\theta</math> functions.
In particular, if we write <math>(\pi_1^Y, \pi_2^Y) : Y \to \mathbb{R}^s \times \mathbb{R}^{m-s}</math> and <math>(\pi_1^X, \pi_2^X) : X \to \mathbb{R}^r \times \mathbb{R}^{n-r}</math> for the projections, and put <math>(b_1, b_2) = (\pi_1^Y, \pi_2^Y)b</math>, <math>(A_1, A_2) = (\pi_1^Y, \pi_2^Y) \circ A</math>, <math>(c_1, c_2) = (\pi_1^X, \pi_2^X)c</math>, and <math>(A_1^*, A_2^*) = (\pi_1^X, \pi_2^X) \circ A^*</math> (which do satisfy <math>A_i^* = (A_i)^*</math>, justifying the notation) we get (writing inequalities termwise) <math display="block">\theta_Y(b-Ax) = \begin{cases} 0, & b_1 \leq A_1x \land b_2 = A_2x, \\ +\infty, & \text{else}, \end{cases} \qquad \theta_X(A^*y - c) = \begin{cases} 0, & c_1 \geq A_1^*y \land c_2 = A_2^*y \\ +\infty, & \text{else}.\end{cases}</math>
In particular, if we write <math>(\pi_1^Y, \pi_2^Y) : Y \to \mathbb{R}^s \times \mathbb{R}^{m-s}</math> and <math>(\pi_1^X, \pi_2^X) : X \to \mathbb{R}^r \times \mathbb{R}^{n-r}</math> for the projections, and put <math>(b_1, b_2) = (\pi_1^Y, \pi_2^Y)b</math>, <math>(A_1, A_2) = (\pi_1^Y, \pi_2^Y) \circ A</math>, <math>(c_1, c_2) = (\pi_1^X, \pi_2^X)c</math>, and <math>(A_1^*, A_2^*) = (\pi_1^X, \pi_2^X) \circ A^*</math> (which do satisfy <math>A_i^* = (A_i)^*</math>, justifying the notation) we get (writing inequalities termwise) <math display="block">\theta_Y(b-Ax) = \begin{cases} 0, & b_1 \leq A_1x \land b_2 = A_2x, \\ +\infty, & \text{else}, \end{cases} \qquad \theta_X(A^*y - c) = \begin{cases} 0, & c_1 \geq A_1^*y \land c_2 = A_2^*y \\ +\infty, & \text{else}.\end{cases}</math>
So, if some point in the interior of <math>X</math> is mapped strictly greater than <math>b_1</math> by <math>A_1</math> and onto <math>b_2</math> by <math>A_2</math>, we may write the duality result in the more customary way, <math display="block"> \inf_{x \in X, A_1x \geq b_1, A_2x = b_2} \langle c, x \rangle = \max_{y \in Y, A_1^*y \leq c_1, A_2^*y = c_2} \langle y, b \rangle.</math>
So, if some point in the interior of <math>X</math> is mapped strictly greater than <math>b_1</math> and onto <math>b_2</math> by <math>A</math>, we may write the duality result in the more customary way, <math display="block"> \inf_{x \in X, A_1x \geq b_1, A_2x = b_2} \langle c, x \rangle = \max_{y \in Y, A_1^*y \leq c_1, A_2^*y = c_2} \langle y, b \rangle.</math>
If some point in the interior of <math>Y</math> is mapped strictly less than <math>c_1</math> and onto <math>c_2</math> by <math>A^*</math>, then reversing the roles of primal and dual problems by taking negatives gives that the infimum is achieved as well.





Revision as of 22:40, 22 June 2020

The Fenchel-Rockafellar Theorem is a minimax principle especially suited to converting between primal and dual linear programs.

The Fenchel-Rockafellar Theorem

Suppose and are convex, lower semicontinuous, proper functions which are both finite at some , and is continuous there. Then

where the existence of the minimizer on the right hand side is part of the theorem.

Proof Sketch

Details may be found in Brezis' text[1], but a sketch of the proof follows.

By Young's Inequality, for any and , we have , so the infimum on the left is greater than or equal to the supremum on the right. If the infimum is , equality must be obtained and every must realize the supremum; otherwise assume everywhere, and put for the value of the infimum.

Let

and observe that since is continuous at , the interior of is nonempty. Likewise, let

Both sets are convex, and by construction, so a geometric Hahn-Banach Theorem gives that there is some nonzero and such that

Since is finite at , letting shows that is nonnegative, and if , continuity and joint finiteness imply , a contradiction, so .

Thus, we can use the inequalities above with the definitions of and to conclude that

This shows that
Since the supremum includes this term, it must also be greater than or equal to the infimum, yielding equality.


Application to Linear Programs

Consider the linear program[2]

where , , and . Then if we put
(noting that and are both convex lower semicontinuous proper functions), we are interested in computing and . By writing the latter as , we can see that we have a finite value of if and only if for some , and the value is then the infimum of where . Putting this together, the Fenchel-Rockafellar Theorem gives that
provided we can find a point of continuity for and mutual finiteness for and .

To do this, we use our knowledge of and to be more precise about the functions. In particular, if we write and for the projections, and put , , , and (which do satisfy , justifying the notation) we get (writing inequalities termwise)

So, if some point in the interior of is mapped strictly greater than and onto by , we may write the duality result in the more customary way,
If some point in the interior of is mapped strictly less than and onto by , then reversing the roles of primal and dual problems by taking negatives gives that the infimum is achieved as well.


References