Fenchel-Rockafellar and Linear Programming: Difference between revisions
(Created page, proof sketch section incomplete) |
(Finished/corrected proof sketch) |
||
Line 10: | Line 10: | ||
Details may be found in Brezis' text<ref name="Brezis" />, but a sketch of the proof follows. | Details may be found in Brezis' text<ref name="Brezis" />, but a sketch of the proof follows. | ||
By Young's Inequality, for any <math>x \in X</math> and <math>u \in X^*</math>, we have <math>\phi(x) + \phi^*(-u) + \psi(x) + \psi^*(u) \ | By Young's Inequality, for any <math>x \in X</math> and <math>u \in X^*</math>, we have <math>\phi(x) + \phi^*(-u) + \psi(x) + \psi^*(u) \geq \langle x, -u \rangle + \langle x, u \rangle = 0</math>, so the infimum on the left is greater than or equal to the supremum on the right. | ||
If the | If the infimum is <math>-\infty</math>, equality must be obtained and every <math>u \in X^*</math> must realize the supremum; otherwise assume <math>\phi(x) + \psi(x) > -\infty</math> everywhere, and put <math>m</math> for the value of the infimum. | ||
( | |||
Let <math display="block">A := \left\{(x, t) : \phi(x) \leq t\right\},</math> and observe that since <math>\phi</math> is continuous at <math>x_0</math>, the interior of <math>C</math> is nonempty. | |||
Likewise, let <math display="block">B := \left\{(x, t) : t \leq m - \psi(x)\right\}.</math> | |||
Both sets are convex, and <math>B \cap \text{int}A = \emptyset</math> by construction, so a [https://en.wikipedia.org/wiki/Hahn%E2%80%93Banach_theorem#Separation_of_sets geometric Hahn-Banach Theorem] gives that there is some <math>(f, k) \in X^* \times \mathbb{R}</math> nonzero and <math>\alpha \in \mathbb{R}</math> such that <math display="block">\begin{cases} \langle x, f \rangle + kt \geq \alpha, & (x, t) \in A, \\ \langle x, f \rangle + kt \leq \alpha, & (x, t) \in B.\end{cases}</math> | |||
Since <math>\phi</math> is finite at <math>x_0</math>, letting <math>t \to +\infty</math> shows that <math>k</math> is nonnegative, and if <math>k = 0</math>, continuity and joint finiteness imply <math>\lVert f \rVert = 0</math>, a contradiction, so <math>k > 0</math>. | |||
Thus, we can use the inequalities above with the definitions of <math>\phi^*</math> and <math>\psi^*</math> to conclude that <math display="block">\begin{cases}\phi^*\left(-\frac f k\right) &\leq -\frac \alpha k, \\ \psi^*\left(\frac f k\right) &\leq \frac \alpha k - m.\end{cases}</math> | |||
This shows that <math display="block">-\phi^*\left(-\frac f k\right) - \psi^*\left(\frac f k\right) \geq m.</math> | |||
Since the supremum includes this term, it must also be greater than or equal to the infimum, yielding equality. | |||
Line 21: | Line 30: | ||
Then if we put <math display="block">\phi(x) + \psi(x) := \left(\langle c, x \rangle + \chi_X(x)\right) + \theta_Y(b - Ax),</math> (noting that <math>\langle c, x \rangle + \chi_X(x)</math> and <math>\theta_Y(b-Ax)</math> are both convex lower semicontinuous proper functions), we are interested in computing <math>\phi^*(u) = \sup\left\{\langle x, u - c \rangle : x \in X \right\} = \theta_X(u-c)</math> and <math>\psi^*(u) = \sup\left\{\langle x, u \rangle - \sup\left\{ \langle y, b-Ax \rangle : y \in Y \right\} : z \in \mathbb{R}^n \right\}</math>. | Then if we put <math display="block">\phi(x) + \psi(x) := \left(\langle c, x \rangle + \chi_X(x)\right) + \theta_Y(b - Ax),</math> (noting that <math>\langle c, x \rangle + \chi_X(x)</math> and <math>\theta_Y(b-Ax)</math> are both convex lower semicontinuous proper functions), we are interested in computing <math>\phi^*(u) = \sup\left\{\langle x, u - c \rangle : x \in X \right\} = \theta_X(u-c)</math> and <math>\psi^*(u) = \sup\left\{\langle x, u \rangle - \sup\left\{ \langle y, b-Ax \rangle : y \in Y \right\} : z \in \mathbb{R}^n \right\}</math>. | ||
By writing the latter as <math>\psi^*(u) = \sup\left\{\inf\left\{\langle x, u + A^*y \rangle - \langle y, b \rangle : y \in Y\right \} : z \in \mathbb{R}^n \right\}</math>, we can see that we have a finite value of <math>\psi^*(u)</math> if and only if <math>u = -A^*y_0</math> for some <math>y_0 \in Y</math>, and the value is then the infimum of <math>-\langle y, b \rangle</math> where <math>A^*y = A^*y_0</math>. | By writing the latter as <math>\psi^*(u) = \sup\left\{\inf\left\{\langle x, u + A^*y \rangle - \langle y, b \rangle : y \in Y\right \} : z \in \mathbb{R}^n \right\}</math>, we can see that we have a finite value of <math>\psi^*(u)</math> if and only if <math>u = -A^*y_0</math> for some <math>y_0 \in Y</math>, and the value is then the infimum of <math>-\langle y, b \rangle</math> where <math>A^*y = A^*y_0</math>. | ||
Putting this together, the Fenchel-Rockafellar Theorem gives that <math display="block">\inf\left\{\langle c, x \rangle + \theta_Y(b-Ax) : x \in X\right\} = \max\left\{\langle y, b \rangle - \theta_X(A^*y - c) : y \in Y\right\}, </math> provided we can find a point of continuity and mutual finiteness for <math>\phi</math> and <math>\psi</math>. | Putting this together, the Fenchel-Rockafellar Theorem gives that <math display="block">\inf\left\{\langle c, x \rangle + \theta_Y(b-Ax) : x \in X\right\} = \max\left\{\langle y, b \rangle - \theta_X(A^*y - c) : y \in Y\right\}, </math> provided we can find a point of continuity for <math>\phi</math> and mutual finiteness for <math>\phi</math> and <math>\psi</math>. | ||
To do this, we use our knowledge of <math>X</math> and <math>Y</math> to be more precise about the <math>\theta</math> functions. | To do this, we use our knowledge of <math>X</math> and <math>Y</math> to be more precise about the <math>\theta</math> functions. | ||
In particular, if we write <math>(\pi_1^Y, \pi_2^Y) : Y \to \mathbb{R}^s \times \mathbb{R}^{m-s}</math> and <math>(\pi_1^X, \pi_2^X) : X \to \mathbb{R}^r \times \mathbb{R}^{n-r}</math> for the projections, and put <math>(b_1, b_2) = (\pi_1^Y, \pi_2^Y)b</math>, <math>(A_1, A_2) = (\pi_1^Y, \pi_2^Y) \circ A</math>, <math>(c_1, c_2) = (\pi_1^X, \pi_2^X)c</math>, and <math>(A_1^*, A_2^*) = (\pi_1^X, \pi_2^X) \circ A^*</math> (which do satisfy <math>A_i^* = (A_i)^*</math>, justifying the notation) we get (writing inequalities termwise) <math display="block">\theta_Y(b-Ax) = \begin{cases} 0, & b_1 \leq A_1x \land b_2 = A_2x, \\ +\infty, & \text{else}, \end{cases} \qquad \theta_X(A^*y - c) = \begin{cases} 0, & c_1 \geq A_1^*y \land c_2 = A_2^*y \\ +\infty, & \text{else}.\end{cases}</math> | In particular, if we write <math>(\pi_1^Y, \pi_2^Y) : Y \to \mathbb{R}^s \times \mathbb{R}^{m-s}</math> and <math>(\pi_1^X, \pi_2^X) : X \to \mathbb{R}^r \times \mathbb{R}^{n-r}</math> for the projections, and put <math>(b_1, b_2) = (\pi_1^Y, \pi_2^Y)b</math>, <math>(A_1, A_2) = (\pi_1^Y, \pi_2^Y) \circ A</math>, <math>(c_1, c_2) = (\pi_1^X, \pi_2^X)c</math>, and <math>(A_1^*, A_2^*) = (\pi_1^X, \pi_2^X) \circ A^*</math> (which do satisfy <math>A_i^* = (A_i)^*</math>, justifying the notation) we get (writing inequalities termwise) <math display="block">\theta_Y(b-Ax) = \begin{cases} 0, & b_1 \leq A_1x \land b_2 = A_2x, \\ +\infty, & \text{else}, \end{cases} \qquad \theta_X(A^*y - c) = \begin{cases} 0, & c_1 \geq A_1^*y \land c_2 = A_2^*y \\ +\infty, & \text{else}.\end{cases}</math> | ||
So, if some point in the interior of <math>X</math> is mapped strictly greater than <math>b_1 | So, if some point in the interior of <math>X</math> is mapped strictly greater than <math>b_1</math> and onto <math>b_2</math> by <math>A</math>, we may write the duality result in the more customary way, <math display="block"> \inf_{x \in X, A_1x \geq b_1, A_2x = b_2} \langle c, x \rangle = \max_{y \in Y, A_1^*y \leq c_1, A_2^*y = c_2} \langle y, b \rangle.</math> | ||
If some point in the interior of <math>Y</math> is mapped strictly less than <math>c_1</math> and onto <math>c_2</math> by <math>A^*</math>, then reversing the roles of primal and dual problems by taking negatives gives that the infimum is achieved as well. | |||
Revision as of 22:40, 22 June 2020
The Fenchel-Rockafellar Theorem is a minimax principle especially suited to converting between primal and dual linear programs.
The Fenchel-Rockafellar Theorem
Suppose and are convex, lower semicontinuous, proper functions which are both finite at some , and is continuous there. Then
Proof Sketch
Details may be found in Brezis' text[1], but a sketch of the proof follows.
By Young's Inequality, for any and , we have , so the infimum on the left is greater than or equal to the supremum on the right. If the infimum is , equality must be obtained and every must realize the supremum; otherwise assume everywhere, and put for the value of the infimum.
Let
Both sets are convex, and by construction, so a geometric Hahn-Banach Theorem gives that there is some nonzero and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha \in \mathbb{R}} such that
Thus, we can use the inequalities above with the definitions of and to conclude that
Application to Linear Programs
Consider the linear program[2]
To do this, we use our knowledge of and to be more precise about the functions. In particular, if we write and for the projections, and put , , , and (which do satisfy , justifying the notation) we get (writing inequalities termwise)