Sliced Wasserstein Distance

From Optimal Transport Wiki

Revision as of 11:31, 10 June 2020

The sliced Wasserstein distance <math>SW_2</math> is an alternative distance between probability measures which enjoys many of the same properties as the Wasserstein distance. For further reading see Santambrogio (pg. 214-215) [1] and Peyré & Cuturi (pg. 166-169) [2].

Motivation

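One situation in which the Wasserstein distance is easier to compute is the 1D case. In particular, if the measures are of the form <math>\mu=\frac{1}{n}\sum_{i=1}^n\delta_{x_i}</math> and <math>\nu=\frac{1}{n}\sum_{i=1}^n\delta_{y_i}</math> where <math>x_1\leq\dots\leq x_n</math> and <math>y_1\leq\dots\leq y_n</math>, then the Wasserstein distance is given by <math>W_2(\mu,\nu)=\left(\frac{1}{n}\sum_{i=1}^n|x_i-y_i|^2\right)^{1/2}</math> (Peyré & Cuturi pg. 30 [2]). The simplicity of the 1D case raises the question of whether a Wasserstein-like distance over <math>\mathbb R^d</math> could be built from knowledge of the Wasserstein distance along projections onto 1D axes. The sliced Wasserstein distance provides an affirmative answer.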
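The 1D computation can be sketched in a few lines of Python. This is a minimal illustration, assuming uniform empirical measures supported on equally many points, for which the optimal coupling matches the points in sorted order; the function name `wasserstein2_1d` is hypothetical and NumPy is an assumed dependency.

```python
import numpy as np


def wasserstein2_1d(x, y):
    """W_2 between the uniform empirical measures on the points x and y in R.

    Assumes x and y contain the same number of points. Sorting both samples
    and pairing them in order realizes the optimal coupling in 1D.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1D arrays of the same length")
    # Root of the mean squared gap between order statistics.
    return float(np.sqrt(np.mean((x - y) ** 2)))
```

The cost is dominated by the two sorts, so the sketch runs in <math>O(n\log n)</math> time.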

Definition

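Let <math>P_\theta:\mathbb R^d\to\mathbb R</math> be the projection onto a unit vector <math>\theta\in S^{d-1}</math>, i.e. <math>P_\theta(x)=\langle x,\theta\rangle</math>. The sliced Wasserstein distance on <math>\mathcal P_2(\mathbb R^d)</math> is given by

<math>SW_2(\mu,\nu):=\left(\int_{S^{d-1}}W_2^2(P_{\theta\#}\mu,P_{\theta\#}\nu)\,d\theta\right)^{1/2}.</math>

Here the integral over <math>S^{d-1}</math> is with respect to the surface measure on <math>S^{d-1}</math>, normalized to have total mass one so that the comparison <math>SW_2\leq W_2</math> stated under Properties holds without an extra constant.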

Properties

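The sliced Wasserstein distance satisfies all the axioms of a true metric on <math>\mathcal P_2(\mathbb R^d)</math>. The triangle inequality is inherited from <math>W_2</math> and <math>L^2</math>, and the positivity and symmetry of <math>W_2</math> yield the positivity and symmetry of <math>SW_2</math>. The tricky part lies in showing that <math>SW_2(\mu,\nu)=0</math> implies <math>\mu=\nu</math>. Note that if <math>SW_2(\mu,\nu)=0</math> then <math>P_{\theta\#}\mu=P_{\theta\#}\nu</math> for almost every <math>\theta\in S^{d-1}</math>. One can go from that observation to the conclusion that <math>\mu=\nu</math> by appealing to the theory of [https://en.wikipedia.org/wiki/Radon_transform Radon transforms].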

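It turns out that <math>W_2(P_{\theta\#}\mu,P_{\theta\#}\nu)\leq W_2(\mu,\nu)</math> (i.e. <math>P_{\theta\#}</math> is 1-Lipschitz). This implies that <math>SW_2(\mu,\nu)\leq W_2(\mu,\nu)</math>, which means that the identity map on <math>\mathcal P_2(\mathbb R^d)</math> is <math>W_2</math>-to-<math>SW_2</math>-continuous. Moreover, if we restrict our domain to a compact <math>\Omega\subseteq\mathbb R^d</math>, then <math>(\mathcal P_2(\Omega), W_2)</math> is itself compact. The identity map is then a continuous bijection from a compact space onto the metric (hence Hausdorff) space <math>(\mathcal P_2(\Omega), SW_2)</math>, so it must be a homeomorphism. This shows that on compact domains <math>SW_2</math> is just as good as <math>W_2</math> from a topological standpoint.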
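The 1-Lipschitz bound can be sketched as follows, writing <math>P_\theta(x)=\langle x,\theta\rangle</math> for the projection onto a unit vector <math>\theta</math> and assuming an optimal transport plan <math>\gamma</math> between <math>\mu</math> and <math>\nu</math> (the symbol <math>\gamma</math> is introduced here only for illustration). The pushforward of <math>\gamma</math> under <math>(x,y)\mapsto(P_\theta x,P_\theta y)</math> is a transport plan between <math>P_{\theta\#}\mu</math> and <math>P_{\theta\#}\nu</math>, so by the Cauchy-Schwarz inequality and <math>|\theta|=1</math>,

<math>W_2^2(P_{\theta\#}\mu,P_{\theta\#}\nu)\leq\int|\langle x-y,\theta\rangle|^2\,d\gamma(x,y)\leq\int|x-y|^2\,d\gamma(x,y)=W_2^2(\mu,\nu).</math>

Averaging this bound over <math>\theta\in S^{d-1}</math>, with the surface measure normalized to total mass one, gives <math>SW_2(\mu,\nu)\leq W_2(\mu,\nu)</math>.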

Computation

To approximate <math>SW_2</math> numerically, one can discretize the sphere <math>S^{d-1}</math> into finitely many directions and carry out the requisite 1D Wasserstein distance computation along each of them. As mentioned in the motivation section, 1D Wasserstein distances are significantly simpler to compute. This is especially so in the case of empirical measures with equally sized supports, where each 1D distance reduces to sorting the projected points. For further details on how to compute <math>SW_2</math>, see Peyré & Cuturi (pg. 166-169) [2].
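One possible way to carry this out is sketched below. Instead of a fixed grid on the sphere, the sketch samples directions uniformly at random (a Monte Carlo variant, not necessarily the scheme used in the references) and averages the squared 1D distances, which corresponds to normalizing the surface measure to total mass one. The function name `sliced_wasserstein2`, its parameters `n_directions` and `seed`, and the restriction to two point clouds with the same number of points are all assumptions of this illustration.

```python
import numpy as np


def sliced_wasserstein2(X, Y, n_directions=200, seed=0):
    """Monte Carlo estimate of SW_2 between two uniform empirical measures.

    X and Y are arrays of shape (n, d): n points in R^d each.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim != 2 or X.shape != Y.shape:
        raise ValueError("X and Y must both have shape (n, d)")
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    theta = rng.standard_normal((n_directions, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project onto every direction; column j holds the projections onto theta[j].
    proj_x = np.sort(X @ theta.T, axis=0)
    proj_y = np.sort(Y @ theta.T, axis=0)
    # Squared 1D W_2 per direction: mean squared gap between sorted projections.
    w2_squared = np.mean((proj_x - proj_y) ** 2, axis=0)
    # Average over directions, then take the square root.
    return float(np.sqrt(np.mean(w2_squared)))
```

With <math>n</math> points and <math>L</math> directions the cost is <math>O(Ln\log n)</math> for the sorts plus <math>O(Lnd)</math> for the projections. The estimate fluctuates with the random directions, so increasing `n_directions` trades time for accuracy.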


References

  1. Santambrogio, Filippo. "Optimal Transport for Applied Mathematicians" (2015)
  2. Peyré, Gabriel & Cuturi, Marco. "Computational Optimal Transport" (2018)