The continuity equation and Benamou-Brenier formula: Difference between revisions

From Optimal Transport Wiki
 
(70 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== Introduction ==
== Introduction ==


The continuity equation is an important equation in many fields of science, for example, electromagnetism, computer vision, fluid dynamics etc. However, in the field of optimal transport, the formulation from fluid dynamics is of a large significance. This form helps to explain the dynamic formulation of special cases of [https://en.wikipedia.org/wiki/Wasserstein_metric Wasserstein metric], and we will focus in this direction. For more general information about the continuity equation, look at the article [https://en.wikipedia.org/wiki/Continuity_equation Continuity equation].
The continuity equation is an important equation in many fields of science, for example, electromagnetism, computer vision, fluid dynamics etc. However, in the field of optimal transport, the formulation from fluid dynamics is of a large significance. This form helps to explain the dynamic formulation of special cases of [https://en.wikipedia.org/wiki/Wasserstein_metric Wasserstein metric] via the continuity equation, and we will focus in this direction. Related to this is the Benamou-Brenier Formula, which implies a Riemannian structure on our space of measures. For more general information about the continuity equation, look at the article [https://en.wikipedia.org/wiki/Continuity_equation Continuity equation].


== Continuity equation in fluid dynamics ==
== Continuity equation in fluid dynamics ==
Line 8: Line 8:


Suppose that mass of our fluid is conserved, through time. Denote <math> \rho(x,t) </math> as a density function, representing the mass-density of fluid, and <math> v(x,t) </math> as a velocity of particle at position <math> x </math>, at time <math> t </math>. Then, for any subspace <math> W </math> of <math> \mathbb{R}^{3} </math> we have:  
Suppose that mass of our fluid is conserved, through time. Denote <math> \rho(x,t) </math> as a density function, representing the mass-density of fluid, and <math> v(x,t) </math> as a velocity of particle at position <math> x </math>, at time <math> t </math>. Then, for any subspace <math> W </math> of <math> \mathbb{R}^{3} </math> we have:  
    <math> \partial_{t}[\int_{W} \rho(x,t) dV] = - \int_{\partial W} \rho v \cdot ndS. </math>
:<math> \partial_{t}[\int_{W} \rho(x,t) dV] = - \int_{\partial W} \rho v \cdot ndS. </math>


In this section, we assume both density function and particle velocity are smooth enough. Hence, after differentiating under the integral and applying the Divergence Theorem, we get:  
In this section, we assume both density function and particle velocity are smooth enough. Hence, after differentiating under the integral and applying the Divergence Theorem, we get:  
      <math> \int_{W} \partial_{t}\rho(x,t) dV = - \int_{W} \nabla\cdot(\rho v) dV. </math>
:<math> \int_{W} \partial_{t}\rho(x,t) dV = - \int_{W} \nabla\cdot(\rho v) dV. </math>


Finally, we conclude that:
Finally, we conclude that:
      <math> \int_{W} \partial_{t}\rho + \nabla\cdot(\rho v) dV = 0, </math>
:<math> \int_{W} \partial_{t}\rho + \nabla\cdot(\rho v) dV = 0, </math>
which implies, since <math> W </math> is arbitrary, that:
which implies, since <math> W </math> is arbitrary, that:
      <math> \partial_{t}\rho + \nabla\cdot(\rho v) = 0. </math>
:<math> \partial_{t}\rho + \nabla\cdot(\rho v) = 0. </math>
The last equation is the continuity equation in fluid dynamics, written in the differential form. We use the equation in this form in optimal transport.
The last equation is the continuity equation in fluid dynamics, written in the differential form. We use the equation in this form in optimal transport.
An important perspective comes from viewing the fluid as a system of particles moving in space (Lagrangian perspective), as opposed to some continuous density that varies at specific points (Eulerian perspective) due to internal currents. This alternate perspective can be formalized via an ordinary differential equation<ref name="Ambrosio" />.
:<math> \begin{cases}
\gamma'(t) = v(\gamma(t), t) & \\
\gamma(0) = x &
\end{cases} </math>
where <math> v(x,t) </math> is some vector field varying with time, and <math> \gamma(t) </math> is absolutely continuous. Since we are not assuming our curve is differentiable, we can instead consider the integral form of our ODE
:<math> \gamma(t) = x + \int_0^t v_s(\gamma(s)) \, ds </math>
In both cases, <math> x </math> represents the starting point of our curve, which indicates the path of a specific particle in our fluid. The important fact is that the continuity equation and our ODE system are equivalent in a weak sense when given the same <math> v(x, t) </math>. Intuitively, the continuity equation states that the change in density at a point is dictated by the flow of particles, <math> \rho v(x,t) </math>, into that point. From the Lagrangian point of view, this flow is the result of all the trajectories of particles that follow the current <math> v(x,t) </math> into the desired point.


== Continuity equation in optimal transport ==
== Continuity equation in optimal transport ==
Line 23: Line 32:
The previous discussion assumed that the density function was smooth, which is not true of the general measures we consider in optimal transport. Even when a measure <math> \mu </math> is absolutely continuous with respect to Lebesgue measure, which we write with a mild abuse of notation as <math> d\mu(x) = \mu(x) dx </math>, <math> \mu </math> does not have to be smooth. So, we need to state a proper weak formulation of the continuity equation. Smooth functions satisfy all the cases below.   
The previous discussion assumed that the density function was smooth, which is not true of the general measures we consider in optimal transport. Even when a measure <math> \mu </math> is absolutely continuous with respect to Lebesgue measure, which we write with a mild abuse of notation as <math> d\mu(x) = \mu(x) dx </math>, <math> \mu </math> does not have to be smooth. So, we need to state a proper weak formulation of the continuity equation. Smooth functions satisfy all the cases below.   


Sometimes in the literature, authors use continuity equation, and transport equation as synonyms. On the other hand, in the optimal transport we differentiate these two and the standard Cauchy problem. Here, we will present definitions and reasoning from book by  F.Santambrogio<ref name="Santambrogio" />.
Here, we will present definitions and reasoning from book by  F.Santambrogio<ref name="Santambrogio" />.


From this point, we are looking at the following equation:  
From this point, we are looking at the following equation:  
<math> \partial_{t}\mu_{t} + \nabla\cdot (\mu_{t}v_{t}) = 0 </math>, and various notions of its solutions.
:<math> \partial_{t}\mu_{t} + \nabla\cdot (\mu_{t}v_{t}) = 0. </math>
 
We will give two different notions of solutions to the continuity equation.
 
:* '''Distributional solution.''' All the measures we are interested in satisfy <math> \int_{0}^{1} ||v_{t}||_{L^{1}(\mu_{t})}dt < \infty </math>, and solve continuity equation in a distributional sense, namely
:<math> \int_{0}^{T}\int_{\Omega} [\partial_{t}\phi + \nabla\phi\cdot v_{t}] d\mu_{t} dt = 0, </math> for all bounded Lipschitz functions <math> \phi \in C_{c}^{1}((0,T) \times \overline{\Omega}) </math>, where <math> \Omega </math> is a bounded domain or the whole space <math> \mathbb{R}^{d}</math>, and <math> 0<T<1 </math>. We assume no-flux condition in this case, namely <math> \mu_{t}v_{t} \cdot n = 0 </math> on the boundary <math> \partial\Omega. </math> This notion of solution is called a ''distributional'' solution.
 
The main goal of the classical optimal transport theory is to find the least expensive way to move one measure to another. For more information, look at [http://34.106.105.83/wiki/Monge_Problem Monge Problem]. So, we have to impose initial and terminal conditions on measures, for example <math> \mu_{0} = \mu  </math>, and <math> \mu_{1} = \nu.  </math> Then, our equation becomes <math> \int_{0}^{T}\int_{\Omega} [\partial_{t}\phi + \nabla\phi\cdot v_{t}] d\mu_{t} dt = \int_{\Omega}\phi(T,x)d\nu(x) - \int_{\Omega}\phi(0,x)d\mu(x),</math> for all <math> \phi \in C_{c}^{1}([0,T]\times \overline{\Omega}). </math>
 
:* '''Weak solution.''' Another way to interpret solutions to the continuity equation is to assume that the function <math> t \rightarrow \int_{\Omega} \psi d\mu_{t}</math> is absolutely continuous, and for a.e. <math>t</math> it holds: <math> \partial_{t} \int_{\Omega} \psi d\mu_{t} = \int_{\Omega} \nabla\psi \cdot v_{t}d\mu_{t}, </math> for all test functions <math> \psi \in C_{c}^{1}(\overline{\Omega}) </math>. This kind of solution is called a ''weak'' solution.


:* (Distributional solutions) All the measures we are interested in satisfy <math> \int_{0}^{1} ||v_{t}||_{L^{1}(\mu_{t})}dt < \infty </math>, and solve continuity equation in a distributional sense, namely
Some connections between these two types of solutions are given in the following propositions.  
    <math> \int_{0}^{T}\int_{\Omega} (\partial_{t}\phi) d\mu_{t} dt + \int_{0}^{T}\int_{\Omega} \nabla\phi\cdot v_{t} d\mu_{t} dt = 0, </math> 
for all <math> \phi \in C_{c}^{1}((0,T)X\Omega) </math>, where <math> \Omega </math> is a compact set or the whole space <math> \mathbb{R}^{d}</math>, and <math> 0<T<1 </math>. We assume no-flux condition in this case, namely <math> \mu_{t}v_{t} \cdot n = 0 </math> on the boundary <math> \partial\Omega. </math>


The main goal of the classical optimal transport theory is how to find the least expensive way to move one measure to the another one. For more information, look at [http://34.106.105.83/wiki/Monge_Problem Monge Problem].So, we have to impose initial and terminal conditions on measures, for example <math> \mu_{0} = \mu  </math>, and <math> \mu_{1} = \nu. </math> Then, our equation becomes
: '''Proposition 1.''', (p.124,<ref name=Santambrogio />) Distributional and weak solutions are equivalent. Every weak solution is a distributional solution. On the other hand, every distributional solution admits a representative (a.e. equal), that is weakly continuous and a weak solution.  
    <math> \int_{0}^{T}\int_{\Omega} (\partial_{t}\phi) d\mu_{t} dt + \int_{0}^{T}\int_{\Omega} \nabla\phi\cdot v_{t} d\mu_{t} dt = \int_{\Omega}\phi(T,x)d\nu(x) - \int_{\Omega}\phi(0,x)d\mu(x),</math>
for all <math> \phi \in C_{c}^{1}([0,T]X\Omega). </math> This notion of solution is called a distributional solution.


:* (Weak solution) Another way to interpret solutions to the continuity equation is to assume that function <math> t \rightarrow \int_{\Omega} \psi d\mu_{t}</math> is absolutely continuous, and for a.e. <math>t</math> it holds:
: '''Proposition 2.''',(p.124,<ref name=Santambrogio />) Let <math> \mu </math> be the Lipschitz function in <math> (t,x) </math> and <math> v </math> be the Lipschitz function in <math> x.</math> Suppose that the continuity equation is satisfied in the weak sense. Then it is satisfied in a.e. sense.
  <math> \partial_{t} \int_{\Omega} \psi d\mu_{t} = \int_{\Omega} \nabla\phi \cdot v_{t}. </math>
This kind of solution is called a weak solution.


Proposition 4.2., on the page 124., in the book by Santambrogio<ref name="Santambrogio" /> states that these solutions are basically equivalent.


:* (Lipschitz solution) There is also a third way to think about this solutions, using the Lipschitz functions, and it is contained in the following:
The following theorem will provide us with existence and uniqueness of the continuity equation solution. For simplicity, we will assume that <math> \Omega = \mathbb{R}^{d}.</math>


: '''Proposition.'''<ref name=Santambrogio /> Let <math> \mu </math> be the Lipschitz function in <math> (t,x) </math> and <math> v </math> be the Lipschitz function in <math> x.</math> Suppose that the continuity equation is satisfied in the weak sense. Then it is satisfied in a.e. sense.
: '''Theorem.'''<ref name=Santambrogio /> Let measurable function <math> v: (0,T) \times \mathbb{R}^{d} \rightarrow \mathbb{R}^{d} </math> be a Lipschitz continuous in <math> x </math>, uniformly in <math> t </math>, and uniformly bounded. Suppose that flow <math> X_{t}</math> of the classical ODE problem, with function <math> v </math> exists. Then, for any probability measure <math>\mu_{0}</math>, push-forward measures <math> \mu_{t} = X_{t}\#\mu_{0}</math> satisfy the continuity equation with the initial condition <math>\mu_{0}</math>. Moreover, for all measures <math> \mu_{t} </math> absolutely continuous with respect to Lebesgue measure, the previous solution is the only solution the continuity equation admits.


Previous three definitions and connections help us to conclude this section by referencing to Theorem of Cauchy-Lipschitz(<ref name="Ambrosio" />, p. 184). It is the analogue of the Picard-Lindelof Theorem in ODE theory, and it provides us with the unique solution, which is crucial for finding a proper geodesics in the applications.
: ''Sketch of the Proof.''
Proving existence, or checking that <math> \mu_{t} </math> satisfies the continuity equation in a sense of the weak solution is straightforward, using change of variables in the integral. However, resolving the uniqueness of this solution when it is absolutely continuous with respect to Lebesgue measure requires narrowing a test function space, using distributional solution. Hence, we can control the flow <math>X_{t}</math> in a better way, and using a linear transport equation solution, we can prove the uniqueness of solution of our continuity equation.


== Applications ==
== Applications ==
Line 52: Line 64:
The following theorem can be found at the book by L.Ambrosio, E.Brué, and D.Semola<ref name="Ambrosio" />.
The following theorem can be found at the book by L.Ambrosio, E.Brué, and D.Semola<ref name="Ambrosio" />.


: '''Theorem (Benamou-Brenier Formula).'''<ref name=Santambrogio /> Let <math> \mu, \nu \in \mathcal{P}_{2}(\mathbb{R}^{d}) </math>. Then  
: '''Theorem (Benamou-Brenier Formula).'''<ref name="Ambrosio" /> Let <math> \mu, \nu \in \mathcal{P}_{2}(\mathbb{R}^{d}) </math>. Then  
      <math> W_{2}^{2}(\mu, \nu)=\min_{(\mu(t),\nu(t))} \{\int_{0}^{1} |v(\cdot,t)|_{L^{2}(\mu(t))}^{2}dt \quad | \quad \partial_{t}\mu+\nabla\cdot(v\mu)=0,\quad \mu(0)=\mu,\quad \mu(1)=\nu \}. </math>
:<math> W_{2}^{2}(\mu, \nu)=\min_{(\mu(t),v(t))} \{\int_{0}^{1} \|v(\cdot,t)\|_{L^{2}(\mu(t))}^{2}dt \quad | \quad \partial_{t}\mu+\nabla\cdot(v\mu)=0,\quad \mu(0)=\mu,\quad \mu(1)=\nu \}. </math>


This formula is important for defining Riemannian structure. You can see more at [http://34.106.105.83/wiki/Formal_Riemannian_Structure_of_the_Wasserstein_metric Formal Riemannian Structure of the Wasserstein metric].
This formula is important for defining the Riemannian structure of our Wasserstein space. In particular, this is related to the fact that the distance-minimizing curves on Riemannian manifolds are geodesics. The property at play here is the fact that geodesics minimize the action of the Riemannian metric. In other words, an optimal transport plan between measures <math> \mu </math> and <math> \nu </math> is actually related to a geodesic connecting these measures on a Riemannian manifold. The Benamou-Brenier Formula implies that the correct action on this space is <math> \int_{0}^{1} \|v(\cdot,t)\|_{L^{2}(\mu(t))}^{2}dt </math> where <math> v(\cdot, t) </math> gives the velocity of the curve <math> \mu(t) </math>. This implies the correct Riemannian metric for our Wasserstein space. You can see more at [http://34.106.105.83/wiki/Formal_Riemannian_Structure_of_the_Wasserstein_metric Formal Riemannian Structure of the Wasserstein metric].


In addition, using the continuity equation we can describe geodesics in the Wasserstein space. For more details look at [http://34.106.105.83/wiki/Geodesics_and_generalized_geodesics Geodesics and generalized geodesics].
In addition, using the continuity equation we can describe geodesics in the Wasserstein space. For more details look at [http://34.106.105.83/wiki/Geodesics_and_generalized_geodesics Geodesics and generalized geodesics].

Latest revision as of 00:47, 6 March 2022

Introduction

The continuity equation appears in many fields of science, for example electromagnetism, computer vision, and fluid dynamics. In optimal transport, the formulation from fluid dynamics is the most significant one: it explains the dynamic formulation of special cases of the Wasserstein metric, and this article focuses on that direction. Related to this is the Benamou-Brenier formula, which implies a Riemannian structure on the space of measures. For more general information about the continuity equation, see the article Continuity equation.

Continuity equation in fluid dynamics

To build intuition, we first introduce the continuity equation as it arises in fluid mechanics. The exposition in this section follows the book by Chorin and Marsden[1].

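Suppose that the mass of our fluid is conserved over time. Denote by <math> \rho(x,t) </math> the mass density of the fluid, and by <math> v(x,t) </math> the velocity of the particle at position <math> x </math> at time <math> t </math>. Then, for any fixed subregion <math> W </math> of <math> \mathbb{R}^{3} </math>, with outward unit normal <math> n </math> on <math> \partial W </math>, we have:

:<math> \partial_{t}\left[\int_{W} \rho(x,t) \, dV\right] = - \int_{\partial W} \rho v \cdot n \, dS. </math>

In this section, we assume that both the density and the particle velocity are smooth enough. Hence, after differentiating under the integral sign and applying the Divergence Theorem, we get:

:<math> \int_{W} \partial_{t}\rho(x,t) \, dV = - \int_{W} \nabla\cdot(\rho v) \, dV. </math>

Finally, we conclude that:

:<math> \int_{W} \left[ \partial_{t}\rho + \nabla\cdot(\rho v) \right] dV = 0, </math>

which implies, since <math> W </math> is arbitrary, that:

:<math> \partial_{t}\rho + \nabla\cdot(\rho v) = 0. </math>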

The last equation is the continuity equation in fluid dynamics, written in the differential form. We use the equation in this form in optimal transport.
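The conservation property behind this equation can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional periodic domain, a first-order upwind finite-volume scheme, and an arbitrary smooth velocity field chosen only for illustration; none of these choices come from the references. Because each cell exchanges mass with its neighbours only through fluxes across shared faces, the total mass stays constant up to rounding.

```python
import numpy as np

def upwind_step(rho, v_face, dx, dt):
    """One conservative upwind step for d_t rho + d_x (rho v) = 0 on a periodic grid.

    rho[i] is the cell average in cell i; v_face[i] is the velocity at the
    face between cell i and cell i + 1.
    """
    rho_right = np.roll(rho, -1)
    # Upwind flux: use the density on the side the flow comes from.
    flux = np.where(v_face >= 0, v_face * rho, v_face * rho_right)
    # Flux out through the right face minus flux in through the left face.
    return rho - dt / dx * (flux - np.roll(flux, 1))

n = 200
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx                     # cell centres
x_face = x + 0.5 * dx                             # right faces
rho0 = 1.0 + 0.5 * np.sin(2 * np.pi * x)          # illustrative initial density
v_face = 0.3 + 0.2 * np.cos(2 * np.pi * x_face)   # illustrative velocity field
dt = 0.5 * dx / np.abs(v_face).max()              # time step satisfying a CFL bound

rho = rho0.copy()
for _ in range(400):
    rho = upwind_step(rho, v_face, dx, dt)

mass0 = rho0.sum() * dx
mass1 = rho.sum() * dx
print(mass0, mass1)
```

The scheme is written in flux form on purpose: the discrete update mirrors the integral identity over <math> W </math>, with each cell playing the role of a small region.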

An important perspective comes from viewing the fluid as a system of particles moving in space (the Lagrangian perspective), as opposed to a continuous density that varies at fixed points due to internal currents (the Eulerian perspective). The Lagrangian perspective can be formalized via an ordinary differential equation[2]:

:<math> \begin{cases}
\gamma'(t) = v(\gamma(t), t) & \\
\gamma(0) = x &
\end{cases} </math>

where <math> v(x,t) </math> is some vector field varying with time, and <math> \gamma(t) </math> is absolutely continuous. Since we are not assuming that our curve is differentiable everywhere, we can instead consider the integral form of our ODE:

:<math> \gamma(t) = x + \int_0^t v(\gamma(s), s) \, ds. </math>

In both cases, <math> x </math> represents the starting point of our curve, which describes the path of a specific particle in our fluid. The important fact is that the continuity equation and our ODE system are equivalent in a weak sense when given the same <math> v(x, t) </math>. Intuitively, the continuity equation states that the change in density at a point is dictated by the flow of particles, <math> \rho v(x,t) </math>, into that point. From the Lagrangian point of view, this flow is the result of all the trajectories of particles that follow the current <math> v(x,t) </math> into the desired point.
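The Lagrangian picture can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the velocity field <math> v(x,t) = -x </math> and the explicit Euler integrator are hypothetical choices, not taken from the references. For this field the exact trajectories are <math> \gamma(t) = x e^{-t} </math>, so the numerical particle positions can be compared against them.

```python
import numpy as np

def velocity(x, t):
    # Hypothetical velocity field chosen for illustration: contraction toward 0.
    return -x

def flow(x0, t_final, n_steps=1000):
    """Explicit Euler approximation of gamma'(t) = v(gamma(t), t), gamma(0) = x0."""
    dt = t_final / n_steps
    gamma = np.array(x0, dtype=float)
    for k in range(n_steps):
        gamma = gamma + dt * velocity(gamma, k * dt)
    return gamma

rng = np.random.default_rng(0)
particles0 = rng.standard_normal(10_000)     # particles sampled from a standard Gaussian
particles1 = flow(particles0, t_final=1.0)   # the same particles transported to t = 1

exact = particles0 * np.exp(-1.0)            # exact trajectories for v(x, t) = -x
max_error = np.abs(particles1 - exact).max()
print(max_error)
print(particles1.std(), particles0.std() * np.exp(-1.0))
```

Each particle is moved independently along its own trajectory; the density at time <math> t </math> is then recovered from where the particles end up, which is the particle-level counterpart of the Eulerian density.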

Continuity equation in optimal transport

The previous discussion assumed that the density function was smooth, which is not true of the general measures we consider in optimal transport. Even when a measure <math> \mu </math> is absolutely continuous with respect to Lebesgue measure, which we write with a mild abuse of notation as <math> d\mu(x) = \mu(x) dx </math>, the density <math> \mu </math> does not have to be smooth. So we need to state a proper weak formulation of the continuity equation. Smooth solutions satisfy all of the notions below.

Here, we present definitions and reasoning from the book by F. Santambrogio[3].

From this point, we are looking at the following equation:

:<math> \partial_{t}\mu_{t} + \nabla\cdot (\mu_{t}v_{t}) = 0. </math>

We will give two different notions of solutions to the continuity equation.

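  • Distributional solution. All the measures we are interested in satisfy <math> \int_{0}^{1} \|v_{t}\|_{L^{1}(\mu_{t})} \, dt < \infty </math> and solve the continuity equation in a distributional sense, namely
:<math> \int_{0}^{T}\int_{\Omega} [\partial_{t}\phi + \nabla\phi\cdot v_{t}] \, d\mu_{t} \, dt = 0 </math>
for all bounded Lipschitz functions <math> \phi \in C_{c}^{1}((0,T) \times \overline{\Omega}) </math>, where <math> \Omega </math> is a bounded domain or the whole space <math> \mathbb{R}^{d} </math>, and <math> 0<T<1 </math>. We assume the no-flux condition in this case, namely <math> \mu_{t}v_{t} \cdot n = 0 </math> on the boundary <math> \partial\Omega </math>. This notion of solution is called a distributional solution.

The main goal of classical optimal transport theory is to find the least expensive way to move one measure to another. For more information, see Monge Problem. So we have to impose initial and terminal conditions on the measures, for example <math> \mu_{0} = \mu </math> and <math> \mu_{1} = \nu </math>. Then our equation becomes
:<math> \int_{0}^{T}\int_{\Omega} [\partial_{t}\phi + \nabla\phi\cdot v_{t}] \, d\mu_{t} \, dt = \int_{\Omega}\phi(T,x) \, d\nu(x) - \int_{\Omega}\phi(0,x) \, d\mu(x) </math>
for all <math> \phi \in C_{c}^{1}([0,T]\times \overline{\Omega}) </math>.

  • Weak solution. Another way to interpret solutions to the continuity equation is to assume that the function <math> t \rightarrow \int_{\Omega} \psi \, d\mu_{t} </math> is absolutely continuous, and that for a.e. <math> t </math> it holds that
:<math> \partial_{t} \int_{\Omega} \psi \, d\mu_{t} = \int_{\Omega} \nabla\psi \cdot v_{t} \, d\mu_{t} </math>
for all test functions <math> \psi \in C_{c}^{1}(\overline{\Omega}) </math>. This kind of solution is called a weak solution.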
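The weak formulation can be checked numerically on a simple example. The sketch below assumes a one-dimensional Gaussian translated at constant speed, so that <math> \mu_{t} = N(ct, \sigma^{2}) </math> and <math> v_{t} = c </math>; the values of <math> c </math> and <math> \sigma </math>, the test function, and the quadrature are illustrative choices rather than anything prescribed by the references. It compares a finite-difference approximation of <math> \partial_{t} \int \psi \, d\mu_{t} </math> with <math> \int \psi' v_{t} \, d\mu_{t} </math>.

```python
import numpy as np

c, sigma = 0.7, 0.5                  # hypothetical drift speed and width
x = np.linspace(-1.0, 1.0, 20001)    # psi vanishes outside [-1, 1]
dx = x[1] - x[0]

def density(x, t):
    # Density of mu_t = N(c t, sigma^2): a Gaussian moving with velocity v = c.
    return np.exp(-(x - c * t) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def psi(x):
    # Test function of class C^1 with compact support [-1, 1].
    return (1.0 - x**2) ** 2

def dpsi(x):
    return -4.0 * x * (1.0 - x**2)

def integrate(f):
    # Trapezoidal rule on the uniform grid x.
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))

t, h = 0.3, 1e-4
# Left-hand side: central difference in time of the integral of psi against mu_t.
lhs = (integrate(psi(x) * density(x, t + h))
       - integrate(psi(x) * density(x, t - h))) / (2 * h)
# Right-hand side: integral of psi' * v_t against mu_t.
rhs = integrate(dpsi(x) * c * density(x, t))
print(lhs, rhs)
```

Only the test function is differentiated in space, which is why this formulation still makes sense when <math> \mu_{t} </math> has no smooth density.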

Some connections between these two types of solutions are given in the following propositions.

Proposition 1 (p. 124, [3]). Distributional and weak solutions are equivalent: every weak solution is a distributional solution, and every distributional solution admits a representative (a.e. equal) that is weakly continuous and is a weak solution.
Proposition 2 (p. 124, [3]). Let <math> \mu </math> be Lipschitz continuous in <math> (t,x) </math> and let <math> v </math> be Lipschitz continuous in <math> x </math>. Suppose that the continuity equation is satisfied in the weak sense. Then it is satisfied in the a.e. sense.


The following theorem provides existence and uniqueness of the solution to the continuity equation. For simplicity, we assume that <math> \Omega = \mathbb{R}^{d} </math>.

Theorem.[3] Let <math> v: (0,T) \times \mathbb{R}^{d} \rightarrow \mathbb{R}^{d} </math> be a measurable function that is Lipschitz continuous in <math> x </math>, uniformly in <math> t </math>, and uniformly bounded. Suppose that the flow <math> X_{t} </math> of the classical ODE problem with velocity field <math> v </math> exists. Then, for any probability measure <math> \mu_{0} </math>, the push-forward measures <math> \mu_{t} = X_{t}\#\mu_{0} </math> satisfy the continuity equation with initial condition <math> \mu_{0} </math>. Moreover, among solutions for which <math> \mu_{t} </math> is absolutely continuous with respect to Lebesgue measure, this is the only solution the continuity equation admits.
Sketch of the Proof.

Proving existence, that is, checking that <math> \mu_{t} </math> satisfies the continuity equation in the sense of weak solutions, is straightforward using a change of variables in the integral. Uniqueness of this solution, when it is absolutely continuous with respect to Lebesgue measure, is harder: it requires narrowing the space of test functions in the distributional formulation. This gives better control of the flow <math> X_{t} </math>, and, using a solution of a linear transport equation, we can prove uniqueness of the solution of our continuity equation.
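The change of variables in the existence part can be written out as follows. This is only a formal computation, assuming that differentiation under the integral sign is justified (for instance because <math> v </math> is bounded and <math> \psi \in C_{c}^{1}(\mathbb{R}^{d}) </math>) and using the defining property <math> \partial_{t} X_{t}(x) = v_{t}(X_{t}(x)) </math> of the flow:

:<math> \frac{d}{dt} \int_{\mathbb{R}^{d}} \psi \, d\mu_{t} = \frac{d}{dt} \int_{\mathbb{R}^{d}} \psi(X_{t}(x)) \, d\mu_{0}(x) = \int_{\mathbb{R}^{d}} \nabla\psi(X_{t}(x)) \cdot v_{t}(X_{t}(x)) \, d\mu_{0}(x) = \int_{\mathbb{R}^{d}} \nabla\psi \cdot v_{t} \, d\mu_{t}. </math>

The first and last equalities use the definition of the push-forward <math> \mu_{t} = X_{t}\#\mu_{0} </math>, and the middle one is the chain rule. The resulting identity is the one required of a weak solution.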

Applications

The following theorem can be found in the book by L. Ambrosio, E. Brué, and D. Semola[2].

Theorem (Benamou-Brenier Formula).[2] Let <math> \mu, \nu \in \mathcal{P}_{2}(\mathbb{R}^{d}) </math>. Then

:<math> W_{2}^{2}(\mu, \nu)=\min_{(\mu(t),v(t))} \left\{\int_{0}^{1} \|v(\cdot,t)\|_{L^{2}(\mu(t))}^{2} \, dt \quad \Big| \quad \partial_{t}\mu+\nabla\cdot(v\mu)=0,\quad \mu(0)=\mu,\quad \mu(1)=\nu \right\}. </math>

This formula is important for defining the Riemannian structure of Wasserstein space. On a Riemannian manifold, distance-minimizing curves are geodesics, and geodesics minimize the action of the Riemannian metric. In other words, an optimal transport plan between measures <math> \mu </math> and <math> \nu </math> is related to a geodesic connecting these measures on a Riemannian manifold. The Benamou-Brenier formula implies that the correct action on this space is <math> \int_{0}^{1} \|v(\cdot,t)\|_{L^{2}(\mu(t))}^{2} \, dt </math>, where <math> v(\cdot, t) </math> gives the velocity of the curve <math> \mu(t) </math>. This implies the correct Riemannian metric for Wasserstein space. You can see more at Formal Riemannian Structure of the Wasserstein metric.
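The formula can be illustrated on empirical measures in one dimension. The sketch below rests on several assumptions that are not part of the references: both measures are replaced by equally weighted samples, the optimal coupling is taken to be the sorted pairing (which is optimal for the quadratic cost on the real line), and velocities are approximated by finite differences along particle paths. It compares the action of the straight-line interpolation with that of one other curve having the same endpoints, so it illustrates the minimum rather than proving it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = np.sort(rng.normal(0.0, 1.0, n))   # sorted samples standing in for mu
y = np.sort(rng.normal(3.0, 2.0, n))   # sorted samples standing in for nu

# Squared W_2 distance between the two empirical measures (sorted pairing).
w2_squared = np.mean((y - x) ** 2)

def action(position, n_steps=2000):
    """Approximate the time integral of the squared L^2(mu_t) norm of the velocity
    for particle paths t -> position(t)."""
    t = np.linspace(0.0, 1.0, n_steps + 1)
    dt = t[1] - t[0]
    total = 0.0
    for k in range(n_steps):
        vel = (position(t[k + 1]) - position(t[k])) / dt  # finite-difference velocity
        total += np.mean(vel**2) * dt
    return total

def straight(t):
    # Each particle moves at constant speed from x to its partner y.
    return (1 - t) * x + t * y

def slow_start(t):
    # Same endpoints, but the speed varies in time.
    return (1 - t**2) * x + t**2 * y

a_straight = action(straight)
a_slow = action(slow_start)
print(w2_squared, a_straight, a_slow)
```

The constant-speed path attains the value of the squared distance, while the reparameterized path costs more even though it visits the same intermediate measures, consistent with geodesics being constant-speed minimizers of the action.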

In addition, the continuity equation can be used to describe geodesics in Wasserstein space. For more details, see Geodesics and generalized geodesics.

References