Kantorovich Problem: Difference between revisions

Revision as of 03:56, 8 June 2020

The Kantorovich problem ^[1] is one of the two essential minimization problems in optimal transport (the other being the Monge problem). It is named after Russian mathematician and Nobel Laureate Leonid Kantorovich.

Kantorovich Problem. The Kantorovich Problem (like the Monge Problem) can be visualized as distributing piles of sand into holes (of equal volume) in an optimal fashion. ^[2]

Introduction

In contrast to the Monge problem, The Kantorovich problem allows a non-empty minimization set, a convex constraint set, and a convex effort functional. The Kantrovich problem admits a dual because it is a linear minimization problem with convex constraints.

Shipping problem

The intuition behind the Kantorovich problem can be given by an explanation of optimizing shipments. Suppose there is a merchant who is attempting to ship items from one place to another. She can hire trucks at some cost $c(x,y)$ for each unit of merchandise which is shipped from point $x$ to point $y$ . Now the shipper is approached by a mathematician, who claims that prices can be set such that they align with the shipper's financial interests. This would be achieved by setting the price $\phi (x)$ and $\phi (y)$ such that the sum of $\phi (x)$ and $\phi (y)$ is always less than the cost $c(x,y)$ . This may even involve setting negative prices in certain cases. However, it can be shown that the shipper will spend almost as much as she would have if she had opted for the original pricing method.

Kantorovich Optimal Transport Problem

Transport Plans

Recall from the discussion of the Monge problem that the original question was how to rearrange mass. Note that in the Monge formulation of the optimal transport problem, the mass cannot be split and thus it is mapped $x\mapsto T(x)$ . When considering discrete cases this results in problems when trying to establish maps T such that $T_{\#}\mu =\nu$ . Kantorovich made the observation that the mass in question could be split, which makes the problem much easier to model. Allowing the mass to be split results in a relaxation of the problem (e.g. half of the mass from $x_{1}$ can go to $y_{1}$ and half can go to $y_{2}$ , and so on). To model this consider $d\pi (x,y)$ , which denotes the mass transported from x to y. This allows the mass to be moved to multiple places. Also consider $\mu (A)$ and $\nu (B)$ : where the total mass taken from measurable set $A\in X$ must be equal to $\mu (A)$ and the total mass taken from measurable set $B\in Y$ must equal $\nu (B)$ .

The constraints of the problem can be written in the following manner:

$\pi (A\times Y)=\mu (A)$ $\quad \pi (X\times B)=\nu (B)\quad$

for all measurable sets $A\subseteq X,B\subseteq Y$

If we have a $\pi$ that satisfies these constraints, the set of such $\pi$ is referred to as $\Pi (\mu ,\nu )$ . This is the set of transport plans between $\mu$ and $\nu$ .

Problem Statement

given $\mu \in {\mathcal {P}}(X)$ and $\nu \in {\mathcal {P}}(Y)$

$\operatorname {min} \mathbb {K} (\pi )=\int _{X\times Y}c(x,y)\mathrm {d} \pi (x,y)$

over $\pi \in \Pi (\mu ,\nu )$

Assuming there is a transport map $T^{\dagger }:X\rightarrow Y$ for the Monge problem. Now define $\mathrm {d} \pi (x,y)=\mathrm {d} \mu (x)\delta _{y=T^{\dagger }(x)}$ . Using this we can see that:

${\begin{aligned}\pi (A\times Y)&=\int _{A}\delta _{T^{\dagger }(x)\in Y}\mathrm {d} \mu (x)=\mu (A)\\\pi (X\times B)&=\int _{X}\delta _{T^{\dagger }(x)\in B}\mathrm {d} \mu (x)=\mu \left(\left(T^{\dagger }\right)^{-1}(B)\right)=T_{\#}^{\dagger }\mu (B)=\nu (B)\end{aligned}}$

Kantorovich Duality

Since the Kantorovich problem is a linear minimization problem with convex constraints it admits a dual. Kantorovich expressed this in 1942, where he considered the case in which the cost function may be conceived of as a distance: function is a distance: $c(x,y)=d(x,y)$ .

↑ Villani, Cedric. Topics In Optimal Transportation. American Mathematical Soc., 2003.
↑ https://physics.aps.org/articles/v7/77

[1] Villani, Cedric. Topics In Optimal Transportation. American Mathematical Soc., 2003.

[2] ttps://physics.aps.org/articles/v7/77

[1]

[2]

@@ Line 1: / Line 1: @@
-The Kantorovich problem <ref>Villani, Cedric.  Topics In Optimal Transportation. American Mathematical Soc., 2003.</ref> is one of the basic minimization problems in [https://en.wikipedia.org/wiki/Transportation_theory_(mathematics) optimal transport].  It is named after Russian mathematician and Nobel Laureate [https://en.wikipedia.org/wiki/Leonid_Kantorovich Leonid Kantorovich].
+The Kantorovich problem <ref>Villani, Cedric.  Topics In Optimal Transportation. American Mathematical Soc., 2003.</ref> is one of the two essential minimization problems in [https://en.wikipedia.org/wiki/Transportation_theory_(mathematics) optimal transport] (the other being the [http://34.106.105.83/wiki/Monge_Problem Monge problem]).  It is named after Russian mathematician and Nobel Laureate [https://en.wikipedia.org/wiki/Leonid_Kantorovich Leonid Kantorovich].
 [[File:Kantorovich Problem Image.png|300px|thumb|right|Kantorovich Problem.  The Kantorovich Problem (like the Monge Problem) can be visualized as distributing piles of sand into holes (of equal volume) in an optimal fashion. <ref>https://physics.aps.org/articles/v7/77</ref> ]]
@@ Line 5: / Line 5: @@
 ==Introduction==
-There are two basic problems in [https://en.wikipedia.org/wiki/Transportation_theory_(mathematics) optimal transport] the [http://34.106.105.83/wiki/Monge_Problem Monge problem] and the Kantorovich problem.  In contrast to the Monge problem, The Kantorovich problem allows a non-empty minimization set, a convex constraint set, and a convex effort functional.  The Kantrovich problem admits a [https://en.wikipedia.org/wiki/Duality_(mathematics) dual] because it is a linear minimization problem with convex constraints.
+In contrast to the Monge problem, The Kantorovich problem allows a non-empty minimization set, a convex constraint set, and a convex effort functional.  The Kantrovich problem admits a [https://en.wikipedia.org/wiki/Duality_(mathematics) dual] because it is a linear minimization problem with convex constraints.
 ===Shipping problem===
-Suppose there is a merchant who is attempting to ship their items from one place to another.  They can hire trucks at some cost <math>c(x, y)</math> for each unit of merchandise which is shipped from point <math>x</math>to point <math>y</math>.  Now the shipper is approached by a mathematician, who claims that prices can be set such that they align with the shipper's financial interests.  This would be achieved by setting the price <math>\phi(x)</math> and <math>\phi(y)</math> such that the sum of <math>\phi(x)</math> and <math>\phi(y)</math> is always less than the cost <math>c(x, y)</math>. This may even involve setting negative prices in certain cases.  However, it can be shown that the shipper will spend almost as much as they would have if they had opted for the original pricing method.
+The intuition behind the Kantorovich problem can be given by an explanation of optimizing shipments.  Suppose there is a merchant who is attempting to ship items from one place to another.  She can hire trucks at some cost <math>c(x, y)</math> for each unit of merchandise which is shipped from point <math>x</math>to point <math>y</math>.  Now the shipper is approached by a mathematician, who claims that prices can be set such that they align with the shipper's financial interests.  This would be achieved by setting the price <math>\phi(x)</math> and <math>\phi(y)</math> such that the sum of <math>\phi(x)</math> and <math>\phi(y)</math> is always less than the cost <math>c(x, y)</math>. This may even involve setting negative prices in certain cases.  However, it can be shown that the shipper will spend almost as much as she would have if she had opted for the original pricing method.
 ==Kantorovich Optimal Transport Problem==
-Consider the basic premises of optimal mass transportation.  Consider probability spaces <math>(X, \mu)</math> and <math>(Y, \nu)</math>.  Let <math>c</math> be a nonnegative measurable function on <math>X \times Y</math>.  The Kantorovich problem is the following:
+===Transport Plans===
+Recall from the discussion of the Monge problem that the original question was how to rearrange mass.  Note that in the Monge formulation of the optimal transport problem, the mass cannot be split and thus it is mapped <math>x \mapsto T(x)</math>.  When considering discrete cases this results in problems when trying to establish maps T such that <math>T_{\#} \mu=\nu</math>.  Kantorovich made the observation that the mass in question could be split, which makes the problem much easier to model.  Allowing the mass to be split results in a relaxation of the problem (e.g. half of the mass from <math>x_1</math> can go to <math>y_1</math> and half can go to <math>y_2</math>, and so on).  To model this consider <math>d \pi(x, y)</math>, which denotes the mass transported from x to y.  This allows the mass to be moved to multiple places.  Also consider <math>\mu(A)</math> and <math>\nu(B)</math>: where the total mass taken from measurable set <math>A \in X</math> must be equal to <math>\mu(A)</math> and the total mass taken from measurable set <math>B \in Y</math> must equal <math>\nu(B)</math>.
-<math>
+The constraints of the problem can be written in the following manner:
-\pi \longmapsto \int_{X \times Y} c(x, y) d \pi(x, y) (1)
+<math>\pi(A \times Y)=\mu(A)</math>
+<math> \quad \pi(X \times B)=\nu(B) \quad</math>
-</math>
+for all measurable sets <math>A \subseteq X, B \subseteq Y</math>
+If we have a <math>\pi</math> that satisfies these constraints, the set of such <math>\pi</math> is referred to as <math>\Pi(\mu, \nu)</math>.  This is the set of transport plans between <math>\mu</math> and <math>\nu</math>.
-This is on the convex set <math>\Pi(\mu, \nu)</math> which is also nonempty.  Note <math>\pi \in \Pi(\mu, \nu)</math> if and only if <math>\pi</math> is a nonnegative measure which satisfies:
+==Problem Statement==
+given <math>\mu \in \mathcal{P}(X)</math> and <math>\nu \in \mathcal{P}(Y)</math>
-<math>
+<math>\operatorname{min} \mathbb{K}(\pi)=\int_{X \times Y} c(x, y) \mathrm{d} \pi(x, y)</math>
-\pi[A \times Y]=\mu[A], \quad \pi[X \times B]=\nu[B]
-</math>
+over <math>\pi \in \Pi(\mu, \nu)</math>
-for all measurable subsets <math>A </math> of <math>X</math> and <math>B</math> of <math>Y</math>.  This definition implies that <math>\pi</math> is a probability measure.  Another way to say this is that <math>\pi \in \Pi(\mu, \nu)</math> if and only if it is a nonnegative measure on <math>X \times Y</math> such that, for all measurable functions <math>(\varphi, \psi) \in L^{1}(d \mu) \times L^{1}(d \nu),</math> or equivalently <math>L^{\infty}(d \mu) \times L^{\infty}(d \nu)</math>
+Assuming there is a transport map <math>T^{\dagger}: X \rightarrow Y</math> for the Monge problem.  Now define <math>\mathrm{d} \pi(x, y)=\mathrm{d} \mu(x) \delta_{y=T^{\dagger}(x)}</math>.  Using this we can see that:
+<math>\begin{aligned} \pi(A \times Y) &=\int_{A} \delta_{T^{\dagger}(x) \in Y} \mathrm{d} \mu(x)=\mu(A) \\ \pi(X \times B) &=\int_{X} \delta_{T^{\dagger}(x) \in B} \mathrm{d} \mu(x)=\mu\left(\left(T^{\dagger}\right)^{-1}(B)\right)=T_{\#}^{\dagger} \mu(B)=\nu(B) \end{aligned}</math>
-<math>\int_{X \times Y}[\varphi(x)+\psi(y)] d \pi(x, y)=\int_{X} \varphi d \mu+\int_{Y} \psi d \nu</math> (2)
-===Remarks===
-There are some topological assumptions that can be made on the measure spaces <math>(X, \mu)</math> and <math> (Y, \nu)</math>. When <math>X</math> and <math>Y</math> are [https://en.wikipedia.org/wiki/Polish_space Polish spaces] (i.e. completely metrizable and separable spaces), and <math>\mu, \nu</math> are [https://en.wikipedia.org/wiki/Borel_measure Borel] probability measures, it is sufficient to impose the expression above for <math>(\varphi, \psi) \in C_{b}(X) \times C_{b}(Y)</math> only <ref>Villani, Cedric.</ref>.
-In addition if <math>X</math> and <math>Y</math> are locally compact. then one can even be content with imposing (2) for <math>(\varphi, \psi) \in C_{0}(X) \times C_{0}(Y)</math>.<ref>Villani, Cedric. </ref>  Note that <math>C_{b}(X)</math> is the space of bounded continuous functions on <math>X,</math> and <math>C_{0}(X)</math> the space of continuous functions going to 0 at infinity, i.e. those continuous functions <math>\varphi</math> such that for any <math>\varepsilon>0</math> there is a compact set <math>K_{\varepsilon} \subset X</math> satisfying <math>\sup _{x \notin K_{e}}|\varphi(x)| \leq \varepsilon ;</math> note that<math>C_{0}(X) \subset C_{b}(X) .</math> This possibility to restrict the class of test functions to the narrower space <math>C_{0}</math> when <math>X</math> and <math>Y</math> are locally compact is due to Riesz' theorem, which identifies the space <math>M(X)</math> of Borel measures having finite total variation on <math>X</math> with the topological dual of <math>C_{0}(X)</math>.<ref>Villani, Cedric. </ref>
 ==Kantorovich Duality==
 Since the Kantorovich problem is a linear minimization problem with convex constraints it admits a dual.  Kantorovich expressed this in 1942, where he considered the case in which the cost function may be conceived of as a distance: function is a distance: <math> c(x, y)=d(x, y)</math>.
-===Theorem===
-Let <math>X</math> and <math>Y</math> be Polish spaces, let <math>\mu \in P(X)</math> and <math>\nu \in P(Y),</math> and let <math>c: X \times Y \rightarrow \mathbb{R}_{+} \cup\{+\infty\}</math> be a lower
-semi-continuous cost function.
-Whenever <math>\pi \in P(X \times Y)</math> and <math>(\varphi, \psi) \in L^{1}(d \mu) \times L^{1}(d \nu),</math> define
-<math>
-I[\pi]=\int_{X \times Y} c(x, y) d \pi(x, y), \quad J(\varphi, \psi)=\int_{X} \varphi d \mu+\int_{Y} \psi d \nu
-</math>
-Define <math>\Pi(\mu, \nu)</math> to be the set of all Borel probability measures <math>\pi</math> on <math>X \times Y</math> such that for all measurable subsets <math>A \subset X</math> and <math>B \subset Y</math>.
-<math>
-\pi[A \times Y]=\mu[A], \quad \pi[X \times B]=\nu[B]
-</math>
-and define <math>\Phi_{c}</math> to be the set of all measurable functions <math>(\varphi, \psi) \in L^{1}(d \mu) \times</math> <math>L^{1}(d \nu)</math> satisfying
-<math>
-\varphi(x)+\psi(y) \leq c(x, y)
-</math>
-for d <math>\mu</math>-almost all <math>x \in X, d \nu</math> -almost all <math>y \in Y</math> ( <math>\forall (x, y)</math> outside of <math>(\mu \otimes \nu)</math> negligible set.
-Then
-<math> \inf _{\Pi(\mu, \nu)} I[\pi]=\sup _{\Phi_{e}} J(\varphi, \psi) </math>
-If the definition of <math>\Phi_{c}</math> to those functions <math>(\varphi, \psi)</math> that are continuous and bounded, then the supremum on the right hand side of (3) is not affected.

Kantorovich Problem: Difference between revisions

Revision as of 03:56, 8 June 2020

Contents

Introduction

Shipping problem

Kantorovich Optimal Transport Problem

Transport Plans

Problem Statement

Kantorovich Duality

Navigation menu

Kantorovich Problem: Difference between revisions

Revision as of 03:56, 8 June 2020

Introduction

Shipping problem

Kantorovich Optimal Transport Problem

Transport Plans

Problem Statement

Kantorovich Duality

Navigation menu

Search