Kantorovich Problem: Difference between revisions

Latest revision as of 06:15, 17 March 2022

The Kantorovich problem ^[1] is one of the two essential minimization problems in optimal transport (the other being the Monge problem). It is named after Russian mathematician and Nobel Laureate Leonid Kantorovich.

Kantorovich Problem. The Kantorovich Problem (like the Monge Problem) can be visualized as distributing piles of sand into holes (of equal volume) in an optimal fashion. ^[2]

Introduction

In contrast to the Monge problem, the Kantorovich problem has a non-empty constraint set (the product measure $\mu \otimes \nu$ is always admissible), a convex constraint set, and a convex (indeed linear) effort functional. The Kantorovich problem admits a dual because it is a linear minimization problem with convex constraints.

Shipping problem

The intuition behind the Kantorovich problem and its dual can be given in terms of optimizing shipments. Suppose a merchant wants to ship items from one place to another. The merchant can hire trucks at a cost $c(x,y)$ for each unit of merchandise shipped from point $x$ to point $y$ . The merchant is then approached by a mathematician, who claims that prices can be set so that they align with the merchant's financial interests ^[3]. This is achieved by charging a price $\phi (x)$ at the origin $x$ and a price $\psi (y)$ at the destination $y$ such that the sum $\phi (x)+\psi (y)$ never exceeds the cost $c(x,y)$ . This may even involve setting negative prices in certain cases. However, it can be shown that, when the prices are chosen well, the merchant ends up spending almost as much as under the original pricing method ^[4].
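The pricing argument can be checked by hand on a small discrete instance. The following is a minimal sketch, not taken from the cited references: the supplies, demands, costs and prices are made-up illustrative values, and the names `phi` (price charged at the origin), `psi` (price charged at the destination), `cost` and `plan` are assumptions introduced here. It checks that the prices never exceed the trucking cost on any route, and compares the total amount paid under the prices with the cost of one feasible shipping plan.

```python
# Two origins with supplies mu and two destinations with demands nu
# (illustrative values; both sum to 1).
mu = [0.5, 0.5]
nu = [0.25, 0.75]

# cost[i][j]: trucking cost per unit shipped from origin i to destination j.
cost = [[1.0, 3.0],
        [2.0, 1.0]]

# Hypothetical prices; note that one of them is negative.
phi = [0.0, -2.0]   # charged per unit leaving origin i
psi = [1.0, 3.0]    # charged per unit arriving at destination j

# Admissibility: phi[i] + psi[j] must not exceed cost[i][j] on any route.
admissible = all(phi[i] + psi[j] <= cost[i][j]
                 for i in range(2) for j in range(2))

# Total paid under the price scheme.
revenue = (sum(p * m for p, m in zip(phi, mu))
           + sum(q * n for q, n in zip(psi, nu)))

# One feasible shipping plan: row sums equal mu, column sums equal nu.
plan = [[0.25, 0.25],
        [0.0, 0.5]]
shipping_cost = sum(plan[i][j] * cost[i][j]
                    for i in range(2) for j in range(2))

print(admissible, revenue, shipping_cost)
```

For any admissible prices and any feasible plan the first total cannot exceed the second, because each shipped unit is charged at most its trucking cost. In this particular instance the two totals coincide, which is the situation the shipping story describes.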

Kantorovich Optimal Transport Problem

Transport Plans

The Monge problem asks for the optimal way to rearrange mass ^[5]. In the Monge formulation of the optimal transport problem, mass cannot be split: all the mass at $x$ is sent to a single point, $x\mapsto T(x)$ . In discrete cases this causes difficulties, since a map $T$ such that $T_{\#}\mu =\nu$ may not exist. Kantorovich observed that the mass in question could be split, which makes the problem much easier to model ^[6]. Allowing the mass to be split results in a relaxation of the problem (e.g. half of the mass from $x_{1}$ can go to $y_{1}$ and half can go to $y_{2}$ , and so on). To model this, consider $d\pi (x,y)$ , which denotes the mass transported from $x$ to $y$ ; this allows the mass at a single point to be moved to multiple places. The total mass taken from a measurable set $A\subseteq X$ must equal $\mu (A)$ , and the total mass delivered to a measurable set $B\subseteq Y$ must equal $\nu (B)$ .

The constraints of the problem can be written in the following manner:

$\pi (A\times Y)=\mu (A),\quad \pi (X\times B)=\nu (B)$

for all measurable sets $A\subseteq X,B\subseteq Y$ . As such, we can interpret $\pi (A\times B)$ as the amount of mass taken from $A$ and sent to $B$ .

The set of measures $\pi$ on $X\times Y$ that satisfy these constraints is denoted $\Pi (\mu ,\nu )$ , the set of transport plans between $\mu$ and $\nu$ . Notice that we are now dealing with transport plans instead of the transport maps used in the Monge formulation of the problem ^[7].
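In the discrete case a transport plan is a nonnegative matrix whose row sums reproduce $\mu$ and whose column sums reproduce $\nu$ . The sketch below, with hypothetical helper names `marginals` and `is_transport_plan` and made-up masses, checks the two marginal constraints for the mass-splitting example mentioned above: one source point whose mass is divided equally between two targets.

```python
def marginals(plan):
    """Return (row sums, column sums) of a discrete plan given as nested lists."""
    rows = [sum(row) for row in plan]
    cols = [sum(col) for col in zip(*plan)]
    return rows, cols


def is_transport_plan(plan, mu, nu, tol=1e-12):
    """Check nonnegativity and both marginal constraints up to a tolerance."""
    rows, cols = marginals(plan)
    nonnegative = all(p >= 0 for row in plan for p in row)
    first_ok = len(rows) == len(mu) and all(
        abs(r - m) <= tol for r, m in zip(rows, mu))
    second_ok = len(cols) == len(nu) and all(
        abs(c - n) <= tol for c, n in zip(cols, nu))
    return nonnegative and first_ok and second_ok


# All mass sits at x1; the target has half of the mass at y1 and half at y2.
mu = [1.0]
nu = [0.5, 0.5]

split_plan = [[0.5, 0.5]]   # half of the mass goes to each target
all_to_y1 = [[1.0, 0.0]]    # what a map x1 -> y1 would do

print(is_transport_plan(split_plan, mu, nu))  # True
print(is_transport_plan(all_to_y1, mu, nu))   # False
```

The second check fails for either choice of target, which is the discrete obstruction to finding a map $T$ with $T_{\#}\mu =\nu$ in this example, while the split plan is admissible.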

Problem Statement

Given $\mu \in {\mathcal {P}}(X)$ , $\nu \in {\mathcal {P}}(Y)$ and a nonnegative measurable cost function $c$ on $X\times Y$ , solve

$\operatorname {min} \mathbb {K} (\pi ):=\operatorname {min} \int _{X\times Y}c(x,y)\mathrm {d} \pi (x,y)$

over all transport plans $\pi \in \Pi (\mu ,\nu )$ .
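For two source points and two target points the problem can be solved by hand, which makes its linear structure visible. The sketch below is an illustration under stated assumptions rather than a general solver: it assumes probability vectors of length two, the function names `plan_from_t`, `kantorovich_cost` and `solve_2x2` are hypothetical, and the numbers are made up. Once the entry $\pi _{11}=t$ is chosen, the marginal constraints fix the other three entries, nonnegativity confines $t$ to an interval, and the cost is affine in $t$ , so a minimizer is found at an endpoint.

```python
def plan_from_t(t, mu, nu):
    """2x2 plan with pi[0][0] = t; the marginals determine the other entries."""
    return [[t, mu[0] - t],
            [nu[0] - t, mu[1] - nu[0] + t]]


def kantorovich_cost(plan, cost):
    """Discrete version of the integral of c against pi."""
    return sum(plan[i][j] * cost[i][j] for i in range(2) for j in range(2))


def solve_2x2(mu, nu, cost):
    """Minimize the cost over all 2x2 plans with marginals mu and nu."""
    # All four entries are nonnegative exactly when t lies in [t_lo, t_hi].
    t_lo = max(0.0, mu[0] + nu[0] - 1.0)
    t_hi = min(mu[0], nu[0])
    # The cost is affine in t, so it suffices to compare the two endpoints.
    candidates = [plan_from_t(t, mu, nu) for t in (t_lo, t_hi)]
    return min(candidates, key=lambda plan: kantorovich_cost(plan, cost))


mu = [0.5, 0.5]
nu = [0.25, 0.75]
cost = [[1.0, 3.0],
        [2.0, 1.0]]

best_plan = solve_2x2(mu, nu, cost)
best_cost = kantorovich_cost(best_plan, cost)
print(best_plan, best_cost)
```

Larger discrete instances are linear programs in the entries of the plan and would normally be handed to a linear programming solver instead of being enumerated this way.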

Suppose there is a transport map $T^{\dagger }:X\rightarrow Y$ for the Monge problem, that is, $T_{\#}^{\dagger }\mu =\nu$ . Define the plan $\mathrm {d} \pi (x,y)=\mathrm {d} \mu (x)\delta _{y=T^{\dagger }(x)}$ , which is concentrated on the graph of $T^{\dagger }$ . This plan has the correct marginals:

${\begin{aligned}\pi (A\times Y)&=\int _{A}\mathbf {1} _{\{T^{\dagger }(x)\in Y\}}\,\mathrm {d} \mu (x)=\mu (A)\\\pi (X\times B)&=\int _{X}\mathbf {1} _{\{T^{\dagger }(x)\in B\}}\,\mathrm {d} \mu (x)=T_{\#}^{\dagger }\mu (B)=\nu (B)\end{aligned}}$

Moreover, $\int _{X\times Y}c(x,y)\mathrm {d} \pi (x,y)=\int _{X}c\left(x,T^{\dagger }(x)\right)\mathrm {d} \mu (x)=:\mathbb {M} (T^{\dagger })$ , so the plan $\pi \in \Pi (\mu ,\nu )$ has the same cost as the map $T^{\dagger }$ .

Since every transport map induces a transport plan of equal cost, $\inf \mathbb {K} (\pi )\leq \inf \mathbb {M} (T)$ .
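The passage from a map to a plan can be made concrete for discrete measures. In the sketch below the function names `pushforward`, `plan_from_map`, `monge_cost` and `kantorovich_cost`, as well as the masses and costs, are hypothetical choices made for illustration. The map is stored as a list `T` with `T[i]` the index of the target assigned to source `i`, and the induced plan puts the mass of source `i` in the single entry `(i, T[i])`.

```python
def pushforward(mu, T, n_targets):
    """Discrete T_# mu: the mass of every source i is moved to target T[i]."""
    nu = [0.0] * n_targets
    for i, mass in enumerate(mu):
        nu[T[i]] += mass
    return nu


def plan_from_map(mu, T, n_targets):
    """Plan concentrated on the graph of T: pi[i][T[i]] = mu[i], zero elsewhere."""
    plan = [[0.0] * n_targets for _ in mu]
    for i, mass in enumerate(mu):
        plan[i][T[i]] = mass
    return plan


def monge_cost(mu, T, cost):
    """Discrete version of the integral of c(x, T(x)) against mu."""
    return sum(mass * cost[i][T[i]] for i, mass in enumerate(mu))


def kantorovich_cost(plan, cost):
    """Discrete version of the integral of c against pi."""
    return sum(p * cost[i][j]
               for i, row in enumerate(plan) for j, p in enumerate(row))


mu = [0.25, 0.25, 0.5]
T = [0, 0, 1]                 # sources 0 and 1 are both sent to target 0
cost = [[1.0, 2.0],
        [3.0, 1.0],
        [2.0, 4.0]]

nu = pushforward(mu, T, 2)
plan = plan_from_map(mu, T, 2)
row_sums = [sum(row) for row in plan]
col_sums = [sum(col) for col in zip(*plan)]

print(row_sums == mu, col_sums == nu)
print(monge_cost(mu, T, cost), kantorovich_cost(plan, cost))
```

Both comparisons hold and the two costs agree, which is the discrete counterpart of the computation above: the plan built from a map is admissible and costs exactly as much as the map.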

Kantorovich Duality

Since the Kantorovich problem is a linear minimization problem with convex constraints, it admits a dual problem. The astute reader may notice that this is a linear programming problem; Kantorovich is also considered to be the founder of linear programming.

Calculus of Variations Approach

Under the right setting, one can show that the Kantorovich problem indeed has a minimizer using the direct method of the calculus of variations. More specifically, in the narrow topology the constraint set $\Pi (\mu ,\nu )$ is compact. Moreover, in this topology the objective functional is lower semi-continuous, provided the cost $c$ is lower semi-continuous.

Knott-Smith Optimality Criterion

One useful result that connects the Monge and Kantorovich problems is the so-called Knott-Smith Optimality Criterion, stated below.

Statement:

Suppose $X\subset \mathbb {R} ^{d}$ is compact and $\mu ,\nu \in {\mathcal {P}}(X)$ . Let $c(x,y)=|x-y|^{2}$ . Then,

(i) There exists $f_{*}\in L^{1}(\mu )$ proper, lower semi-continuous, and convex such that

    (a)  $\sup _{\phi ,\psi \in C_{b}(\mathbb {R} ^{d}),\ \phi \oplus \psi \leq c}\int \phi \,d\mu +\int \psi \,d\nu =-P_{o}=\int \left(|x|^{2}-2f_{*}(x)\right)d\mu (x)+\int \left(|y|^{2}-2f_{*}^{*}(y)\right)d\nu (y)$ , where $f_{*}^{*}$ denotes the convex conjugate of $f_{*}$ .
    (b) For any optimal transport plan  $\pi _{*}$ , we have that  $y\in \partial f_{*}(x)$  for  $\pi _{*}$ -almost every  $(x,y)$ .

(ii) Conversely, if $\pi \in \Pi (\mu ,\nu )$ and $f\in L^{1}(\mu )$ is proper, lower semi-continuous, and convex with $y\in \partial f(x)$ for $\pi$ -almost every $(x,y)$ , then

    (a)  $\pi$  is optimal
    (b)  $-P_{o}=\int \left(|x|^{2}-2f(x)\right)d\mu (x)+\int \left(|y|^{2}-2f^{*}(y)\right)d\nu (y).$
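In one dimension the criterion has a simple consequence that can be tested by brute force. The subdifferential of a convex function on the real line is monotone, so for the cost $|x-y|^{2}$ an optimal map has to be nondecreasing: the smallest source point is matched with the smallest target point, and so on. The sketch below is an illustration under assumptions not made in the statement itself: it uses uniform weights on three made-up points, searches only over bijections, and the names `quadratic_cost`, `best` and `monotone` are hypothetical.

```python
from itertools import permutations


def quadratic_cost(xs, ys, perm):
    """Average of |x_i - y_perm[i]|^2 for uniform weights 1/n."""
    n = len(xs)
    return sum((xs[i] - ys[perm[i]]) ** 2 for i in range(n)) / n


xs = [0.0, 1.0, 3.0]
ys = [2.0, 5.0, 4.0]

# Brute force over all bijections between the two point sets.
best = min(permutations(range(len(xs))),
           key=lambda perm: quadratic_cost(xs, ys, perm))

# Monotone matching: the k-th smallest x is sent to the k-th smallest y.
order_x = sorted(range(len(xs)), key=lambda i: xs[i])
order_y = sorted(range(len(ys)), key=lambda j: ys[j])
monotone = [None] * len(xs)
for i, j in zip(order_x, order_y):
    monotone[i] = j

print(list(best) == monotone)  # True
```

The exhaustive search and the sorted matching select the same assignment here, in line with part (i)(b) of the criterion.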

References

[1] Villani, Cédric. Topics in Optimal Transportation. American Mathematical Soc., 2003.
[2] Mertens, Stephan. A New Approach to the Matching Problem. Physics, 7, 77. 21 July 2014.
[3] Carlier, Guillaume. Optimal Transportation and Economic Applications. IMA. New Mathematical Models in Economics and Finance. 2010.
[4] Paris, Quirino. An Economic Interpretation of Linear Programming. Springer. 29 April 2016.
[5] Craig, Katy. The Monge Problem. Math 260L. Univ. of Ca. at Santa Barbara. Spring 2020.
[6] Craig, Katy. The Kantorovich Problem. Math 260L. Univ. of Ca. at Santa Barbara. Spring 2020.
[7] Craig, Katy. The Kantorovich Problem. Math 260L. Univ. of Ca. at Santa Barbara. Spring 2020.


@@ Line 1: / Line 1: @@
-The Kantorovich problem <ref>"Villani" </ref> is one of the basic minimization problems in [https://en.wikipedia.org/wiki/Transportation_theory_(mathematics) optimal transport].  It is named after Russian mathematician and Nobel Laureate [https://en.wikipedia.org/wiki/Leonid_Kantorovich Leonid Kantorovich].
+The Kantorovich problem <ref>Villani, Cedric.  Topics In Optimal Transportation. American Mathematical Soc., 2003.</ref> is one of the two essential minimization problems in [https://en.wikipedia.org/wiki/Transportation_theory_(mathematics) optimal transport] (the other being the [http://34.106.105.83/wiki/Monge_Problem Monge problem]).  It is named after Russian mathematician and Nobel Laureate [https://en.wikipedia.org/wiki/Leonid_Kantorovich Leonid Kantorovich].
+[[File:Kantorovich Problem Image.png|300px|thumb|right|Kantorovich Problem.  The Kantorovich Problem (like the Monge Problem) can be visualized as distributing piles of sand into holes (of equal volume) in an optimal fashion. <ref>Mertens, Stephan.  A New Approach to the Matching Problem.  Physics, 7, 77.  21 July 2014. </ref> ]]
 ==Introduction==
-There are two basic problems in [https://en.wikipedia.org/wiki/Transportation_theory_(mathematics) optimal transport] the Monge problem and the Kantorovich problem.  In contrast to the Monge problem, The Kantorovich problem allows a non-empty minimization set, a convex constraint set, and a convex effort functional.  The Kantrovich problem admits a [https://en.wikipedia.org/wiki/Duality_(mathematics) dual] because it is a linear minimization problem with convex constraints.
+In contrast to the Monge problem, the Kantorovich problem involves working with a non-empty minimization set, a convex constraint set, and a convex effort functional.  The Kantrovich problem admits a [https://en.wikipedia.org/wiki/Duality_(mathematics) dual] because it is a linear minimization problem with convex constraints.
 ===Shipping problem===
-Suppose there is a merchant who is attempting to ship their items from one place to another.  They can hire trucks at some cost <math>c(x, y)</math> for each unit of merchandise which is shipped from point <math>x</math>to point <math>y</math>.  Now the shipper is approached by a mathematician, who claims that prices can be set such that they align with the shipper's financial interests.  This would be achieved by setting the price <math>\phi(x)</math> and <math>\phi(y)</math> such that the sum of <math>\phi(x)</math> and <math>\phi(y)</math> is always less than the cost <math>c(x, y)</math>. This may even involve setting negative prices in certain cases.  However, it can be shown that the shipper will spend almost as much as they would have if they had opted for the original pricing method.
+The intuition behind the Kantorovich problem can be given by an explanation of optimizing shipments.  Suppose there is a merchant who is attempting to ship items from one place to another.  The merchant can hire trucks at some cost <math>c(x, y)</math> for each unit of merchandise which is shipped from point <math>x </math>to point <math>y</math>.  Now the shipper is approached by a mathematician, who claims that prices can be set such that they align with the shipper's financial interests <ref>Carlier, Guillame.  Optimal Transportation and Economic Applications. IMA.  New Mathematical Models in Economics and Finance.  2010 </ref>.  This would be achieved by setting the price <math>\phi(x)</math> and <math>\phi(y)</math> such that the sum of <math>\phi(x)</math> and <math>\phi(y)</math> is always less than the cost <math>c(x, y)</math>. This may even involve setting negative prices in certain cases.  However, it can be shown that the shipper will spend almost as much as they would have if instead they opted for the original pricing method <ref>Paris, Quininio.  An Economic Interpretations ofLinear Programming. Springer. 29 April 2016 </ref>.
 ==Kantorovich Optimal Transport Problem==
-Consider the basic premises of optimal mass transportation.  Consider probability spaces <math>(X, \mu)</math> and <math>(Y, \nu)</math>.  Let <math>c</math> be a nonnegative measurable function on <math>X \times Y</math>.  The Kantorovich problem is the following:
+===Transport Plans===
-<math>
-\pi \longmapsto \int_{X \times Y} c(x, y) d \pi(x, y) (1)
-</math>
-This is on the convex set <math>\Pi(\mu, \nu)</math> which is also nonempty.  Note <math>\pi \in \Pi(\mu, \nu)</math> if and only if <math>\pi</math> is a nonnegative measure which satisfies:
+The Monge problem was about the optimal way to rearrange mass <ref> Craig, Katy. The Monge Problem. Math 260L. Univ. of Ca. at Santa Barbara. Spring 2020 </ref>.  Note that in the Monge formulation of the optimal transport problem, the mass cannot be split and thus it is mapped <math>x \mapsto T(x)</math>.  When considering discrete cases, this results in problems when trying to establish maps T such that <math>T_{\#} \mu=\nu</math>.  Kantorovich made the observation that the mass in question could be split, which makes the problem much easier to model <ref> Craig, Katy. The Kantorovich Problem. Math 260L. Univ. of Ca. at Santa Barbara. Spring 2020 </ref>.  Allowing the mass to be split results in a relaxation of the problem (e.g. half of the mass from <math>x_1</math> can go to <math>y_1</math> and half can go to <math>y_2</math>, and so on).  To model this consider <math>d \pi(x, y)</math>, which denotes the mass transported from x to y.  This allows the mass to be moved to multiple places.  Also consider <math>\mu(A)</math> and <math>\nu(B)</math>: where the total mass taken from measurable set <math>A \in X</math> must be equal to <math>\mu(A)</math> and the total mass taken from measurable set <math>B \in Y</math> must equal <math>\nu(B)</math>.
+The constraints of the problem can be written in the following manner:
-<math>
+<div style="text-align: center;">
-\pi[A \times Y]=\mu[A], \quad \pi[X \times B]=\nu[B]
+<math>\pi(A \times Y)=\mu(A)</math>
-</math>
+<math> \quad \pi(X \times B)=\nu(B) \quad</math>
+</div>
-for all measurable subsets <math>A </math> of <math>X</math> and <math>B</math> of <math>Y</math>.  This definition implies that <math>\pi</math> is a probability measure.  Another way to say this is that <math>\pi \in \Pi(\mu, \nu)</math> if and only if it is a nonnegative measure on <math>X \times Y</math> such that, for all measurable functions <math>(\varphi, \psi) \in L^{1}(d \mu) \times L^{1}(d \nu),</math> or equivalently <math>L^{\infty}(d \mu) \times L^{\infty}(d \nu)</math>
+for all measurable sets <math>A \subseteq X, B \subseteq Y</math>. As such, we can interpret <math> \pi( A\times B) </math> as representing the amount of mass from <math> \mu (A)</math> that is directed to <math> \nu (B)</math>
+If we have a measure <math>\pi</math> that satisfies these constraints, then the set of such <math>\pi</math> is referred to as <math>\Pi(\mu, \nu)</math> -- the set of transport plans between <math>\mu</math> and <math>\nu</math>.  Notice again that now we are dealing with transport plans instead of the transport maps that are used in the Monge formulation of the problem <ref> Craig, Katy. The Kantorovich Problem. Math 260L. Univ. of Ca. at Santa Barbara. Spring 2020 </ref>.
-<math>\int_{X \times Y}[\varphi(x)+\psi(y)] d \pi(x, y)=\int_{X} \varphi d \mu+\int_{Y} \psi d \nu</math> (2)
+===Problem Statement===
+Given <math>\mu \in \mathcal{P}(X)</math> and <math>\nu \in \mathcal{P}(Y)</math>, solve
-===Remarks===
+<div style="text-align: center;">
+<math>\operatorname{min} \mathbb{K}(\pi):= \operatorname{min} \int_{X \times Y} c(x, y) \mathrm{d} \pi(x, y)</math>
+</div>
-There are some topological assumptions that can be made on the measure spaces <math>(X, \mu)</math> and <math> (Y, \nu)</math>. When <math>X</math> and <math>Y</math> are [https://en.wikipedia.org/wiki/Polish_space Polish spaces] (i.e. completely metrizable and separable spaces), and <math>\mu, \nu</math> are [https://en.wikipedia.org/wiki/Borel_measure Borel] probability measures, it is sufficient to impose the expression above for <math>(\varphi, \psi) \in C_{b}(X) \times C_{b}(Y)</math> only <ref>"Villani" </ref>.
+over all such <math>\pi \in \Pi(\mu, \nu)</math>
+Assuming there is a transport map <math>T^{\dagger}: X \rightarrow Y</math> for the Monge problem, we define <math>\mathrm{d} \pi(x, y)=\mathrm{d} \mu(x) \delta_{y=T^{\dagger}(x)}</math>.  Using this we can see that:
-In addition if <math>X</math> and <math>Y</math> are locally compact. then one can even be content with imposing (2) for <math>(\varphi, \psi) \in C_{0}(X) \times C_{0}(Y)</math>.<ref>"Villani" </ref>  Note that <math>C_{b}(X)</math> is the space of bounded continuous functions on <math>X,</math> and <math>C_{0}(X)</math> the space of continuous functions going to 0 at infinity, i.e. those continuous functions <math>\varphi</math> such that for any <math>\varepsilon>0</math> there is a compact set <math>K_{\varepsilon} \subset X</math> satisfying <math>\sup _{x \notin K_{e}}|\varphi(x)| \leq \varepsilon ;</math> note that<math>C_{0}(X) \subset C_{b}(X) .</math> This possibility to restrict the class of test functions to the narrower space <math>C_{0}</math> when <math>X</math> and <math>Y</math> are locally compact is due to Riesz' theorem, which identifies the space <math>M(X)</math> of Borel measures having finite total variation on <math>X</math> with the topological dual of <math>C_{0}(X)</math>.<ref>"Villani" </ref>
+<div style="text-align: center;">
+<math>\begin{aligned} \pi(A \times Y) &=\int_{A} \delta_{T^{\dagger}(x) \in Y} \mathrm{d} \mu(x)=\mu(A) \\ \pi(X \times B) &=\int_{X} \delta_{T^{\dagger}(x) \in B} \mathrm{d} \mu(x)=T_{\#}^{\dagger} \mu(B)=\nu(B) \end{aligned}</math>
+</div>
-==Kantorovich Duality==
+We can see that <math>\int_{X \times Y} c(x, y) \mathrm{d} \pi(x, y)=\int_{X} c\left(x, T^{\dagger}(x)\right) \mathrm{d} \mu(x)</math>
-Since the Kantorovich problem is a linear minimization problem with convex constraints it admits a dual.  Kantorovich expressed this in 1942, where he considered the case in which the cost function may be conceived of as a distance: function is a distance: <math> c(x, y)=d(x, y)</math>.
+thus <math>\inf \mathbb{K}(\pi) \leq \inf \mathbb{M}(T)</math>.
-===Theorem===
+===Kantorovich Duality===
-Let </math>X</math> and </math>Y</math> be Polish spaces, let </math>\mu \in P(X)</math> and </math>\nu \in P(Y),</math> and let </math>c: X \times Y \rightarrow \mathbb{R}_{+} \cup\{+\infty\}</math> be a lower
+Since the Kantorovich problem is a linear minimization problem with convex constraints it admits a [http://34.106.105.83/wiki/Kantorovich_Dual_Problem_(for_general_costs) dual problem].  The astute reader may notice that this is a linear programming problem -- Kantorovich is also considered to be the founder of linear programming.
-semi-continuous cost function.
+===Calculus of Variations Approach===
+Under the right setting, one can show the Kantovorich problem indeed has a minimizer using the direct method of the calculus of variations. More specifically, if one turns to the narrow topology, then it turns out that we get compactness of the constraint set. Moreover, such a topology ensures us that our objective function is lower semi-continuous.
-Whenever <math>\pi \in P(X \times Y)</math> and <math>(\varphi, \psi) \in L^{1}(d \mu) \times L^{1}(d \nu),</math> define
+===Knott-Smith Optimality Criterion===
+One useful result we have that allows us to connect both the Monge and Kantorovich problems is the so-called the Knott-Smith Optimality Criterion (see below).
-<math>
+==Statement:==
-I[\pi]=\int_{X \times Y} c(x, y) d \pi(x, y), \quad J(\varphi, \psi)=\int_{X} \varphi d \mu+\int_{Y} \psi d \nu
+Suppose <math> X\subset \mathbb{R}^d </math> is compact and <math> \mu, \nu \in \mathcal{P}(X)</math>. Let <math> c(x,y) = |x-y|^2 </math>. Then,
-</math>
-Define <math>\Pi(\mu, \nu)</math> to be the set of all Borel probability measures <math>\pi</math> on <math>X \times Y</math> such that for all measurable subsets <math>A \subset X</math> and <math>B \subset Y</math>.
+(i) There exists <math> f_* \in L^1(\mu) </math> proper, lower semi-continuous, and convex such that
-<math>
+     (a) <math> \sup_{\phi, \psi \in C_{b}(\mathbb{R}^d),\ \phi \bigoplus \psi \leq c} \int \phi\ d\mu + \int \psi d\nu = -P_o = \int |x|^2 - 2f_*(x) d\mu(x) + \int |x|^2 - 2f_*^*(x) d\nu(x)</math>
-\pi[A \times Y]=\mu[A], \quad \pi[X \times B]=\nu[B]
+     (b) For any optimal transport plan <math> \pi_* </math>, we have that <math> y\in \partial f_*(x)</math> <math>\pi_*</math>-almost-everywhere <math> (x,y)</math>.
-</math>
-and define <math>\Phi_{c}</math> to be the set of all measurable functions <math>(\varphi, \psi) \in L^{1}(d \mu) \times</math> <math>L^{1}(d \nu)</math> satisfying
+(ii) Conversely, if <math> \pi \in \Pi(\mu, \nu)</math> and <math> f\in L^1(\mu)</math> is proper, lower semi-continuous, and convex for which <math> y\in \partial f_*(x) </math> <math> \pi </math>-almost-everywhere <math>(x,y)</math>, then
-<math>
+     (a) <math> \pi </math> is optimal
-\varphi(x)+\psi(y) \leq c(x, y)
+     (b) <math> -P_o = \int|x|^2-f_*(x)d\mu(x) + \int |x|^2-2f_*^*(x)d\nu(x).</math>
-</math>
-for d <math>\mu</math>-almost all <math>x \in X, d \nu</math> -almost all <math>y \in Y</math> (that is to say, for all <math>(x, y)</math> outside of a <math>(\mu \otimes \nu)</math> negligible set.
+===References===
