Kantorovich Dual Problem (for general costs): Difference between revisions

From Optimal Transport Wiki
Jump to navigation Jump to search
m (Removed protection from "Kantorovich Dual Problem (for general costs)")
 
(68 intermediate revisions by 2 users not shown)
Line 1: Line 1:
==Introduction==
The Kantorovich Dual Problem is one of the minimization problems in [http://34.106.105.83/wiki/Main_Page Optimal Transport]. It is a dual problem of the [http://34.106.105.83/wiki/ Kantorovich Problem].
 
The main advantage of Kantorovich Problem, in comparison to Monge problem, is in the convex constraint property. It is possible to formulate the dual problem. It is formulated in the very general metric spaces called Polish spaces, i.e. complete separable.


==The Shipper's Problem==
==The Shipper's Problem==


Type of this problem is stated by Caffarelli. We will provide the modern one.  
One of the ways to understand this problem is stated by Caffarelli. The statement is presented in the book by Villani <ref name=Villani />. We will provide the modern rephrase of his statement.


During the pandemic, people are in the lockdown and it is the best time to enjoy a coffee time. All in all, it costs Amazon <math> c(x,y) </math> dollars to ship one box of necessary espresso capsules from place <math> x </math> to  place <math> y </math>, i.e. from warehouses to homes. We want to optimize this expensive habit and consequently to solve appropriate Monge-Kantorovich problem. The mathematicians come to Amazon and propose the new kind of payment. For every box at place <math> x </math> they will charge <math> \varphi(x) </math> dollars and <math> \psi(y) </math> dollars to deliver at place <math> y </math>. However, mathematicians will not reveal their shipping routes. Of course, in order for Amazon to accept this offer, the price <math> \varphi(x)+\psi(y) \leq c(x,y).</math>  The moral is that if the mathematicians are smart enough, they will be capable to make this shipment cheaper. This is provided by Kantorovich duality theorem. Take care that in the same cases, mathematicians will also give negative prices!
During the [https://en.wikipedia.org/wiki/ COVID-19 pandemic], people are in the lockdown and it is the best time to enjoy a coffee time. All in all, it costs Amazon <math> c(x,y) </math> dollars to ship one box of necessary espresso capsules from place <math> x </math> to  place <math> y </math>, i.e. from warehouses to homes. We want to optimize this expensive habit and consequently to solve appropriate Monge-Kantorovich problem. The mathematicians come to Amazon and propose the new kind of payment. For every box at place <math> x </math> they will charge <math> \varphi(x) </math> dollars and <math> \psi(y) </math> dollars to deliver at place <math> y </math>. However, mathematicians will not reveal their shipping routes. Of course, in order for Amazon to accept this offer, the price <math> \varphi(x)+\psi(y) \leq c(x,y).</math>  The moral is that if the mathematicians are smart enough, they will be capable to make this shipment cheaper. This is provided by Kantorovich duality theorem. Take care that in the same cases, mathematicians will also give negative prices, if it is necessary!


==Statement of Theorem==
==Statement of Theorem==
This is the statement of the Theorem in the book "Topics in Optimal Transportation", by Cedric Villani.


: '''Theorem.'''<ref name=Villani /> Let X and Y be Polish spaces, let <math>\mu \in \mathcal{P}(X)</math> and <math>\nu \in \mathcal{P}(Y)</math>, and let a cost function <math> c:X \times Y \rightarrow[0,+\infty] </math> be lower semi-continuous.
: '''Theorem.'''<ref name=Villani /> Let X and Y be Polish spaces, let <math>\mu \in \mathcal{P}(X)</math> and <math>\nu \in \mathcal{P}(Y)</math>, and let a cost function <math> c:X \times Y \rightarrow[0,+\infty] </math> be lower semi-continuous.
Line 16: Line 16:
<math> I[\pi]= \int_{X\times Y} c(x,y) d\pi(x,y), \quad J(\varphi,\psi)=\int_{X}\varphi(x)d\mu(x)+\int_{Y}\psi(y) d\nu(y) </math>.
<math> I[\pi]= \int_{X\times Y} c(x,y) d\pi(x,y), \quad J(\varphi,\psi)=\int_{X}\varphi(x)d\mu(x)+\int_{Y}\psi(y) d\nu(y) </math>.


Define <math> \Pi(\mu,\nu) </math> to be the set of Borel probability measures <math> \pi </math> on <math> X\times Y </math> such that for all measurable sets <math> A \subset X </math> and <math> B \subset Y </math>, <br>  
Define <math> \Pi(\mu,\nu) </math> to be the set of Borel probability measures <math> \pi </math> on <math> X\times Y </math> such that for all measurable sets <math> A \subset X </math> and <math> B \subset Y </math>, <math> \pi[A\times Y]=\mu(A) </math>, <math> \pi[X\times B]=\nu(B) </math>, and define <math> \Phi_{c} </math> to be the set of all measurable functions <math> (\varphi, \psi) \in L^{1}(d\mu) \times L^{1}(d\nu) </math> satisfying <math> \varphi(x)+\psi(y) \leq c(x,y) </math> for <math> d\mu </math> almost everywhere in X and <math> d\nu </math> almost everywhere in Y. <br>


<math> \pi[A\times Y]=\mu(A) </math>, <math> \pi[X\times B]=\nu(B) </math>, <br>  
Then <math> \inf_{\Pi(\mu,\nu)} I[\pi] = \sup_{\Phi_{c}} J(\varphi,\psi) </math>. <br>


and define <math> \Phi_{c} </math> to be the set of all measurable functions <math> (\varphi, \psi) \in L^{1}(d\mu) \times L^{1}(d\nu) </math> satisfying <math> \varphi(x)+\psi(y) \leq c(x,y) </math> for <math> d\mu </math> almost everywhere in X and <math> d\nu </math> almost everywhere in Y. <br>
Moreover, the infimum <math> \inf_{\Pi(\mu,\nu)} I[\pi] </math> is attained. In addition it is possible to restrict <math> \varphi </math> and <math> \psi </math> to be continuous and bounded.


Then <math> inf_{\Pi(\mu,\nu)} I[\pi] = sup_{\Phi_{c}} J(\varphi,\psi) </math>. <br>
==Ideas and the techniques used in the proof==
 
Moreover, the infimum <math> inf_{\Pi(\mu,\nu)} I[\pi] </math> is attained. In addition it is possible to restrict <math> \varphi </math> and <math> \psi </math> to be continuous and bounded.
 
==Outline of the Proof==


First, we assume that our spaces <math> X </math> and <math> Y </math> are compact and that the cost function <math> c(x,y) </math> is continuous. The general case follows by an approximation argument.   
First, we assume that our spaces <math> X </math> and <math> Y </math> are compact and that the cost function <math> c(x,y) </math> is continuous. The general case follows by an approximation argument.   


The main idea is to use minimax principle, i.e. interchanging inf sup with sup inf in the proof.
The main idea is to use minimax principle, i.e. interchanging inf sup with sup inf in the proof.
For this, we need some basic convex analysis techniques, namely Legendre-Fenchel transform (qoute needed) and Theorem on Fenchel-Rockafellar Duality (its proof is based on Hahn-Banach theorem consequence on separating convex sets.)
For this, we need some basic convex analysis techniques, namely Legendre-Fenchel transform and Theorem on Fenchel-Rockafellar Duality (its proof is based on Hahn-Banach theorem [https://en.wikipedia.org/wiki/Hahn%E2%80%93Banach_theorem] consequence on separating convex sets). The required statements can be found in the book by Rockafellar<ref name=Rockafellar /> and the book by Bauschke and Combettes<ref name=BandC />.


Take a note that at some point we use Arzela-Ascoli Theorem. In a non-compact space this is not possible.
Take a note that at some point we use Arzela-Ascoli Theorem [https://en.wikipedia.org/wiki/Arzel%C3%A0%E2%80%93Ascoli_theorem]. In a non-compact space this is not possible.
In order to evade compactness property, we have to use Prokhorov's theorem.
In order to evade compactness property, we have to use Prokhorov's theorem.
: '''Theorem.'''<ref name=Santambrogio /> Let <math> \mu_{n} </math> be a sequence of tight probability measures on Polish space <math> X </math>. Then, there exists <math> \mu \in P(X) </math> and convergent subsequence <math> \mu_{n_{k}}</math> such that <math> \mu_{n_{k}} \rightarrow \mu </math> in the dual of <math> C_{b}(X) </math>. Conversely, every sequence <math> \mu_{n} \rightarrow \mu </math> is tight.
The proof of the previous Theorem can be found in <ref name=Santambrogio />. For more information on <math> C_{b}(X) </math> duality, take a look at [http://34.106.105.83/wiki/ Dual space of C_0(x) vs C_b(x)].


==C-concave functions==
==C-concave functions==


In Kantorovich Duality Theorem, the left-hand side of the last equality, the infimum <math> inf_{\Pi(\mu,\nu)} I[\pi] </math> is attained. We do not know something similar about the right-hand side. However, when cost function <math> c(x,y) </math> is bounded we can restrict <math >sup_{\Phi_{c}} J(\varphi,\psi) </math> to pairs <math> (\varphi^{cc},\varphi^{c}) </math> where <math> \varphi </math> is bounded and <br>
There are a few alternative proofs of the above Theorem. First, we will discuss the conclusion of the Theorem. Again, we will follow the path given in the book by Villani <ref name=Villani />.
 
In Kantorovich Duality Theorem, the left-hand side of the last equality, the infimum <math> \inf_{\Pi(\mu,\nu)} I[\pi] </math> is attained. We do not know anything similar about the right-hand side. However, when cost function <math> c(x,y) </math> is bounded we can restrict <math >\sup_{\Phi_{c}} J(\varphi,\psi) </math> to pairs <math> (\varphi^{cc},\varphi^{c}) </math> where <math> \varphi </math> is bounded and <br>
 
<math> \varphi^{c}(y)=\inf_{y \in Y} [c(x,y) - \varphi(x)], \quad \varphi^{cc}(x)=\inf_{y \in Y} [c(x,y) - \varphi^{c}(y)]. </math>
 
The pair <math> (\varphi^{cc},\varphi^{c}) </math> is called a pair of conjugate c-concave functions. It is known that <math> (\varphi^{cc})^{c}=\varphi^{c} </math> and that <math> \varphi^{c} </math> is measurable. The proof can be found in the book <ref name=Santambrogio />(Chapter 1, p.27).


<math> \varphi^{c}(y)=inf_{y \in Y} [c(x,y) - \varphi(x)], \quad \varphi^{cc}(x)=inf_{y \in Y} [c(x,y) - \varphi^{c}(y)]. </math>
In addition, it is possible to give a proof of Kantorovich Duality theorem using c-concave functions. Namely, we can find c-concave function <math> \varphi </math> such that <br> <math> (x,y) \in \Gamma \implies \varphi(x)+\varphi^{c}(y)=c(x,y). </math> Here, <math> \Gamma </math> is the support for the Kantorovich Problem (it has to be non-empty). The proof of this fact can also be found in the book <ref name=Santambrogio />(Chapter 1, p.11).


The pair <math> (\varphi^{cc},\varphi^{c}) </math> is called a pair of conjugate c-concave functions. It is known that <math> (\varphi^{cc})^{c}=\varphi^{c} </math> and that <math> \varphi^{c} </math> is measurable.
= References =


In addition, it is possible to give a proof of Kantorovich Duality theorem using c-concave functions. Namely, we can find c-concave function <math> \varphi </math> such that <br> <math> (x,y) \in \Gamma \implies \varphi(x)+\varphi^{c}(y)=c(x,y). </math> This should lead to much faster proof.
<references>


==References==
<ref name="Villani">[https://people.math.gatech.edu/~gangbo/Cedric-Villani.pdf C. Villani, ''Topics in Optimal Transportation'', Chapter 1, pages 17-21]</ref>


<references />
<ref name="Santambrogio"> [https://link-springer-com.proxy.library.ucsb.edu:9443/book/10.1007/978-3-319-20828-2 F. Santambrogio, ''Optimal Transport for Applied Mathematicians'', Chapter 1, pages 9-16] </ref>


<ref name="Villani">[https://people.math.gatech.edu/~gangbo/Cedric-Villani.pdf C. Villani, ''Topics in Optimal Transportation'', Chapter 1.] (pages 17-21)</ref>
<ref name="BandC"> [https://link.springer.com/book/10.1007/978-3-319-48311-5 Heinz H. Bauschke, Patrick L. Combettes, ''Convex Analysis and Monotone Operator Theory in Hilbert Spaces''] </ref>


<ref name="Santambrogio">https://link-springer-com.proxy.library.ucsb.edu:9443/book/10.1007/978-3-319-20828-2 F. Santambrogio, ''Optimal Transport for Applied Mathematicians'', Chapter 1.] (pages 9-16)</ref>
<ref name="Rockafellar"> R.T. Rockafellar,''Convex Analysis'', Princeton University Press, Princeton, 1970  </ref>


</ references>
</references>

Latest revision as of 04:36, 28 February 2022

The Kantorovich Dual Problem is one of the minimization problems in Optimal Transport. It is a dual problem of the Kantorovich Problem.

The Shipper's Problem

One of the ways to understand this problem is stated by Caffarelli. The statement is presented in the book by Villani [1]. We will provide the modern rephrase of his statement.

During the COVID-19 pandemic, people are in the lockdown and it is the best time to enjoy a coffee time. All in all, it costs Amazon dollars to ship one box of necessary espresso capsules from place to place , i.e. from warehouses to homes. We want to optimize this expensive habit and consequently to solve appropriate Monge-Kantorovich problem. The mathematicians come to Amazon and propose the new kind of payment. For every box at place they will charge dollars and dollars to deliver at place . However, mathematicians will not reveal their shipping routes. Of course, in order for Amazon to accept this offer, the price The moral is that if the mathematicians are smart enough, they will be capable to make this shipment cheaper. This is provided by Kantorovich duality theorem. Take care that in the same cases, mathematicians will also give negative prices, if it is necessary!

Statement of Theorem

This is the statement of the Theorem in the book "Topics in Optimal Transportation", by Cedric Villani.

Theorem.[1] Let X and Y be Polish spaces, let and , and let a cost function be lower semi-continuous.

Whenever and , define

.

Define to be the set of Borel probability measures on such that for all measurable sets and , , , and define to be the set of all measurable functions satisfying for almost everywhere in X and almost everywhere in Y.

Then .

Moreover, the infimum is attained. In addition it is possible to restrict and to be continuous and bounded.

Ideas and the techniques used in the proof

First, we assume that our spaces and are compact and that the cost function is continuous. The general case follows by an approximation argument.

The main idea is to use minimax principle, i.e. interchanging inf sup with sup inf in the proof. For this, we need some basic convex analysis techniques, namely Legendre-Fenchel transform and Theorem on Fenchel-Rockafellar Duality (its proof is based on Hahn-Banach theorem [1] consequence on separating convex sets). The required statements can be found in the book by Rockafellar[2] and the book by Bauschke and Combettes[3].

Take a note that at some point we use Arzela-Ascoli Theorem [2]. In a non-compact space this is not possible. In order to evade compactness property, we have to use Prokhorov's theorem.

Theorem.[4] Let be a sequence of tight probability measures on Polish space . Then, there exists and convergent subsequence such that in the dual of . Conversely, every sequence is tight.

The proof of the previous Theorem can be found in [4]. For more information on duality, take a look at Dual space of C_0(x) vs C_b(x).

C-concave functions

There are a few alternative proofs of the above Theorem. First, we will discuss the conclusion of the Theorem. Again, we will follow the path given in the book by Villani [1].

In Kantorovich Duality Theorem, the left-hand side of the last equality, the infimum is attained. We do not know anything similar about the right-hand side. However, when cost function is bounded we can restrict to pairs where is bounded and

The pair is called a pair of conjugate c-concave functions. It is known that and that is measurable. The proof can be found in the book [4](Chapter 1, p.27).

In addition, it is possible to give a proof of Kantorovich Duality theorem using c-concave functions. Namely, we can find c-concave function such that
Here, is the support for the Kantorovich Problem (it has to be non-empty). The proof of this fact can also be found in the book [4](Chapter 1, p.11).

References