If the measures
are both finite convex combinations of Dirac masses, the Kantorovich and Monge problems may be stated as linear programs[1], and solved using classical methods[2].
This insight may be used to approximate the Wasserstein distances between general measures, by first discretizing the source and target with a known level of accuracy (a difficult problem!), then computing the cost between the discrete measures.
Statement of the Discrete Problems
We suppose that
and
(assuming the weights
and
are all positive, the
and
are distinct, and
).
Then any transport plan
for scalars
(if it charged anything other than the product of the supports of
and
, it would not have the correct marginals).
For notation, set
,
,
,
(or more generally
for generic cost functions), and denote by
the vector of all ones of dimension
.
Essentially, we are replacing the space
with
and passing from measures and functions to vectors and matrices.
Primal Problems
With all this notation, we can say the collection of transport plans is analogous to
data:image/s3,"s3://crabby-images/087be/087be7a690fbaad745c170099421681cba4a189e" alt="{\displaystyle \Gamma (\mu ,\nu )\cong {\mathcal {G}}(m,n):=\left\{G\in {\mathcal {M}}_{M\times N}:g_{ij}\geq 0,G\mathbf {1} _{N}=m,G^{\top }\mathbf {1} _{M}=n\right\},}"
and the analogue of the (original)
Kantorovich Problem is
data:image/s3,"s3://crabby-images/cd7e2/cd7e2bad17a4f78d8c48319029576f5e38627533" alt="{\displaystyle \inf _{\gamma \in \Gamma (\mu ,\nu )}\int \lVert x-y\rVert ^{2}\,d\gamma (x,y)\cong \inf _{G\in {\mathcal {G}}(m,n)}\sum _{i=1}^{M}\sum _{j=1}^{N}c_{ij}g_{ij}=\inf _{G\in {\mathcal {G}}(m,n)}{\text{tr}}\left(C^{\top }G\right)=\inf _{G\in {\mathcal {G}}(m,n)}C:G}"
(where
data:image/s3,"s3://crabby-images/9eac0/9eac054eb099f14717c2c60179b5967532c4a17e" alt="{\displaystyle C:G}"
is just shorthand for
data:image/s3,"s3://crabby-images/847d8/847d828fec564baeca57831fe0494442fd8c2fbc" alt="{\displaystyle {\text{tr}}\left(C^{\top }G\right)}"
, an inner product on matrices).
We could define a transport map as a function
, but it will be more convenient to define it as a matrix in
, subject to constraints (erasing the difference between a transport map and the transport plan it defines).
We can say the collection of transport maps is
data:image/s3,"s3://crabby-images/34b6e/34b6e531cb21376c9b79fac91457d229d010dd2b" alt="{\displaystyle {\mathcal {T}}(m,n):=\left\{T\in {\mathcal {M}}_{M\times N}:t_{ij}\in \left\{0,1\right\},T\mathbf {1} _{N}=m,T^{\top }\mathbf {1} _{M}=n\right\}\subseteq {\mathcal {G}}(m,n).}"
That is, a transport map corresponds to a plan which does not split mass, but instead sends all mass at
data:image/s3,"s3://crabby-images/26e69/26e698feb985eb1c2f5bf24a016b5936c44de86e" alt="{\displaystyle x_{i}}"
to the
data:image/s3,"s3://crabby-images/56692/56692ef99dd76820c6d62cec9c158fa720108585" alt="{\displaystyle y_{j}}"
such that
data:image/s3,"s3://crabby-images/4c877/4c877534874c05aa847c45aa3206a18f656290f1" alt="{\displaystyle t_{ij}=1}"
(since
data:image/s3,"s3://crabby-images/229f7/229f773f7eb574d1b15566538cdbf56ad846d9d4" alt="{\displaystyle m}"
and
data:image/s3,"s3://crabby-images/2472f/2472f0ae3b9a8a2697e7860592944db01c0f332a" alt="{\displaystyle n}"
correspond to probability measures—that is,
data:image/s3,"s3://crabby-images/9e52a/9e52acc12b5417e945c9b4ccbb685b876c9cd2e6" alt="{\displaystyle \mathbf {1} _{M}^{\top }m=1=\mathbf {1} _{N}^{\top }n}"
—this is unique for a given
data:image/s3,"s3://crabby-images/56268/56268398b168098ce256f2e55d6fe130e10acba7" alt="{\displaystyle T}"
).
Then the analogue of the
Monge Problem is
data:image/s3,"s3://crabby-images/4d0bb/4d0bbab8fe934a839a8b299bd941ccda3d47de9f" alt="{\displaystyle \inf _{T\in {\mathcal {T}}(m,n)}C:T.}"
As in the general case, this may fail to exist; in fact,
data:image/s3,"s3://crabby-images/57eda/57eda954141d7543de1b4a41e16c2a5b2a7a4e2d" alt="{\displaystyle {\mathcal {T}}(m,n)}"
may be empty, for example if
data:image/s3,"s3://crabby-images/1ac2a/1ac2af10ed065661601463ea6025aecaf598c6c3" alt="{\displaystyle M<N}"
.
Dual Problem
This also has an analogue.
Useful Combinatorial Structure
The set
is a closed polytope in
, and since
(resp.
) is a linear functional bounded below by zero on
, it is quite classical[3] (perhaps taking a compact set formed by restricting to those plans with
for some fixed
) that its minimum is obtained at an extreme point.
Algorithms
Simplex Algorithm
This is also very classical.
Improved Algorithms
There are other, improved methods with their own articles, like the Auction Algorithm and Sinkhorn's Algorithm.
Perhaps short descriptions/groupings of them should be put on this page to collect the information?
References