The Moreau-Yosida Regularization: Difference between revisions
(added the origin of the name portmanteau, added internal and external links) |
(added informal convolution) |
||
Line 4: | Line 4: | ||
Let <math>(X,d)</math> be a metric space, and let <math>\mathcal{P}(X)</math> denote the collection of probability measures on <math>X</math>. <math>(X,d)</math> is said to be a '''Polish space''' if it is complete and separable. | Let <math>(X,d)</math> be a metric space, and let <math>\mathcal{P}(X)</math> denote the collection of probability measures on <math>X</math>. <math>(X,d)</math> is said to be a '''Polish space''' if it is complete and separable. | ||
A function <math>g : X \to (-\infty,+\infty]</math> is said to be '''proper''' if it is not identically equal to <math>+\infty</math>, that is, if there exists <math>x \in X</math> such that <math>g(x) < +\infty</math>. | A function <math>g : X \to (-\infty,+\infty]</math> is said to be '''proper''' <ref name="OT"/> if it is not identically equal to <math>+\infty</math>, that is, if there exists <math>x \in X</math> such that <math>g(x) < +\infty</math>. | ||
For a given function <math>g : X \to (-\infty,+\infty]</math> and <math>k \geq 0</math>, its '''Moreau-Yosida regularization''' <math>g_k : X \to [-\infty,+\infty]</math> is given by | For a given function <math>g : X \to (-\infty,+\infty]</math> and <math>k \geq 0</math>, its '''Moreau-Yosida regularization''' <ref name="OT"/> <math>g_k : X \to [-\infty,+\infty]</math> is given by | ||
<math>g_k(x) := \inf\limits_{y \in X} \left[ g(y) + k d(x,y) \right].</math> | <math>g_k(x) := \inf\limits_{y \in X} \left[ g(y) + k d(x,y) \right].</math> | ||
Line 57: | Line 57: | ||
==Portmanteau Theorem== | ==Portmanteau Theorem== | ||
'''Theorem (Portmanteau).''' <ref name="OT" /> <ref name="S"/> Let <math>(X,d)</math> be a Polish space, and let <math>g : X \to (-\infty,+\infty]</math> be lower semicontinuous and bounded below. Then the functional <math>\mu \mapsto \int_X g \, \mathrm{d}\mu</math> is lower semicontinuous with respect to narrow convergence in <math>\mathcal{P}(X)</math>, that is | '''Theorem (Portmanteau).''' <ref name="OT"/> <ref name="S"/> Let <math>(X,d)</math> be a Polish space, and let <math>g : X \to (-\infty,+\infty]</math> be lower semicontinuous and bounded below. Then the functional <math>\mu \mapsto \int_X g \, \mathrm{d}\mu</math> is lower semicontinuous with respect to narrow convergence in <math>\mathcal{P}(X)</math>, that is | ||
<math> \mu_n \to \mu \text{ narrowly} \Longrightarrow \liminf\limits_{n \to \infty} \int_X g \, \mathrm{d}\mu_n \geq \int_X g \, \mathrm{d}\mu </math>. | <math> \mu_n \to \mu \text{ narrowly} \Longrightarrow \liminf\limits_{n \to \infty} \int_X g \, \mathrm{d}\mu_n \geq \int_X g \, \mathrm{d}\mu </math>. | ||
Line 70: | Line 70: | ||
==The Mysterious Etymology of Portmanteau== | ==The Mysterious Etymology of Portmanteau== | ||
The curious epithet attached to the above theorem is due to Billingsley <ref name=" | The curious epithet attached to the above theorem is due to Billingsley <ref name="Billingsley"/>, with a citation to a Jean-Pierre Portmanteau's ''Espoir pour l'ensemble vide?'' published in ''Annales de l'Université de Felletin'' in 1915. This is believed to be a fictional citation made as a play on words <ref name="Pages"/>. | ||
* The publication date is far too early; Kolmogorov's probability axioms were published in 1933. <ref name="Kolmogorov"/> | * The publication date is far too early; Kolmogorov's probability axioms were published in 1933. <ref name="Kolmogorov"/> | ||
* [https://en.wikipedia.org/wiki/Felletin Felletin] is a small town in central France with no university, and there is no record of a Jean-Pierre Portmanteau aside from this citation. | * [https://en.wikipedia.org/wiki/Felletin Felletin] is a small town in central France with no university, and there is no record of a Jean-Pierre Portmanteau aside from this citation. | ||
* "Espoir pour l'ensemble vide" translates to "hope for the empty set" (translation was by Google, please confirm or amend if you speak French!) | * "Espoir pour l'ensemble vide" translates to "hope for the empty set" (translation was by Google, please confirm or amend if you speak French!) | ||
==Generalizations== | |||
The Moreau-Yosida regularization is a specific case of a type of convolution, and many of the above results follow from this generalization. This material is taken from Bauschke-Combettes, Ch. 12 <ref name="BC"/>, where the setting is a Hilbert space instead of a more general Polish space. | |||
Let <math>\mathcal{H}</math> be a Hilbert space, and let <math>f , g : \mathcal{H} \to (-\infty,+\infty]</math>. The '''infimal convolution''' or '''epi-sum''' <math>f \, \square \, g : \mathcal{H} \to [-\infty,+\infty]</math> of <math>f</math> and <math>g</math> is | |||
<math> (f \, \square \, g)(x) := \inf\limits_{y \in \mathcal{H}} \left[ f(y) + g(x-y) \right] </math>. | |||
<math>f \, \square \, g</math> is said to be '''exact''' at a point <math>x \in \mathcal{H}</math> if this infimum is attained. <math>f \, \square \, g</math> is said to be exact if it is exact at every point of its domain, and in this case it is denoted by <math>f \, \dot{\square} \, g</math>. | |||
'''Remark.''' Bauschke-Combettes use a box with a dot in the middle to indicate that <math>f \, \square \, g</math> is exact. Due to technical difficulties, we will use <math>f \, \dot{\square} \, g</math> instead. | |||
For an example, let <math>A, B \subseteq \mathcal{H}</math> be nonempty. Then <math>\chi_A \, \square \, \chi_B</math> is exact, and <math>\chi_A \, \dot{\square} \, \chi_B = \chi_{A + B}</math>. | |||
==References== | ==References== | ||
Line 80: | Line 95: | ||
<references> | <references> | ||
<ref name="AGS">Ambrosio, Luigi, Nicola Gigli, and Giuseppe Savaré. ''Gradient Flows in Metric Spaces and in the Space of Probability Measures.'' Ch. 3.1. Birkhäuser, 2005.</ref> | <ref name="AGS">Ambrosio, Luigi, Nicola Gigli, and Giuseppe Savaré. ''Gradient Flows in Metric Spaces and in the Space of Probability Measures.'' Ch. 3.1. Birkhäuser, 2005.</ref> | ||
<ref name="Billingsley">Billingsley, Patrick. ''Convergence of Probability Measures, 2nd Ed.'' John Wiley & Sons, Inc. 1999. </ref> | |||
<ref name="BC">Bauschke, Heinz H. and Patrick L. Combettes. ''Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd Ed.'' Ch. 12. Springer, 2017.</ref> | |||
<ref name="Kolmogorov">Kolmogorov, Andrey (1950) [1933]. Foundations of the theory of probability. New York, USA: Chelsea Publishing Company.</ref> | |||
<ref name="Pages">Pagès, Gilles. ''Numerical Probability: An Introduction with Applications to Finance.'' Ch. 4.1. Springer, 2018.</ref> | |||
<ref name="OT">Craig, Katy C. Lower Semicontinuity in the Narrow Topology. Math 260J. Univ. of Ca. at Santa Barbara. Winter 2022.</ref> | <ref name="OT">Craig, Katy C. Lower Semicontinuity in the Narrow Topology. Math 260J. Univ. of Ca. at Santa Barbara. Winter 2022.</ref> | ||
<ref name="S">Santambrogio, Filippo. ''Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling'' Ch. 1.1. Birkhäuser, 2015.</ref> | <ref name="S">Santambrogio, Filippo. ''Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling'' Ch. 1.1. Birkhäuser, 2015.</ref> | ||
<!-- | <!-- | ||
<ref name="P">https://math.stackexchange.com/questions/43747/where-did-the-portmanteau-theorem-get-its-name | <ref name="P">https://math.stackexchange.com/questions/43747/where-did-the-portmanteau-theorem-get-its-name |
Revision as of 03:23, 12 February 2022
The Moreau-Yosida regularization is a technique used to approximate lower semicontinuous functions by Lipschitz functions. The main application of this result is to prove the Portmanteau Theorem, which states that integration against a lower semicontinuous function that is bounded below is lower semicontinuous with respect to the narrow topology on the space of probability measures.
Definitions
Let <math>(X,d)</math> be a metric space, and let <math>\mathcal{P}(X)</math> denote the collection of probability measures on <math>X</math>. <math>(X,d)</math> is said to be a Polish space if it is complete and separable.
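A function <math>g : X \to (-\infty,+\infty]</math> is said to be proper [1] if it is not identically equal to <math>+\infty</math>, that is, if there exists <math>x \in X</math> such that <math>g(x) < +\infty</math>.
For a given function <math>g : X \to (-\infty,+\infty]</math> and <math>k \geq 0</math>, its Moreau-Yosida regularization [1] <math>g_k : X \to [-\infty,+\infty]</math> is given by
<math>g_k(x) := \inf\limits_{y \in X} \left[ g(y) + k d(x,y) \right].</math>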
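The distance term may often be raised to a positive exponent. For example, when <math>X</math> is a Hilbert space [2] [3], <math>g_k</math> is taken to be
<math>g_k(x) := \inf\limits_{y \in X} \left[ g(y) + \frac{k}{2} \| x - y \|^2 \right].</math>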
Note that, taking <math>y = x</math> in the infimum,
<math>g_k(x) \leq g(x) + k d(x,x) = g(x)</math>.
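The definition can be tried out numerically. The following is a minimal sketch, assuming the metric space is a finite grid of real numbers with <math>d(x,y) = |x-y|</math>, so the infimum becomes a minimum over grid points; the function name `moreau_yosida` and the step function used as input are illustrative choices, not part of the cited sources.

```python
import numpy as np

def moreau_yosida(g_vals, points, k):
    """Return min_y [g(y) + k*|x - y|] for every x in `points`.

    The minimum runs over the finite grid only, so this is a discrete
    approximation of the infimum in the definition.
    """
    g_vals = np.asarray(g_vals, dtype=float)
    points = np.asarray(points, dtype=float)
    # dist[i, j] = |x_i - y_j|
    dist = np.abs(points[:, None] - points[None, :])
    # For each x_i, minimise g(y_j) + k * d(x_i, y_j) over j.
    return np.min(g_vals[None, :] + k * dist, axis=1)

grid = np.linspace(-2.0, 2.0, 401)
g = np.where(grid > 0, 1.0, 0.0)   # a lower semicontinuous step function
g1 = moreau_yosida(g, grid, k=1.0)
g5 = moreau_yosida(g, grid, k=5.0)

step = grid[1] - grid[0]
print("g_1 <= g everywhere:", bool(np.all(g1 <= g)))
print("largest increment of g_5 per grid step:", np.max(np.abs(np.diff(g5))), "<=", 5 * step)
```

On this grid the regularized values never exceed the original ones, and consecutive values of `g5` differ by at most `5 * step`, which matches the bound <math>g_k \leq g</math> and the Lipschitz constant <math>k</math>.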
Examples
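- If <math>k = 0</math>, then by definition <math>g_0</math> is constant and <math>g_0 \equiv \inf\limits_{y \in X} g(y)</math>.
- If <math>g</math> is not proper, then <math>g_k \equiv +\infty</math> for all <math>k \geq 0</math>.
Take <math>(X,d) = (\mathbb{R}, |\cdot|)</math>. If <math>g</math> is finite-valued and differentiable, we can often write down <math>g_k</math> explicitly. For a fixed <math>x</math>, the map <math>y \mapsto g(y) + k|x-y|</math> is continuous everywhere and differentiable everywhere except at <math>y = x</math>, where the derivative does not exist in general due to the absolute value. Thus, provided the infimum is attained, we can apply standard optimization techniques from Calculus to solve for <math>g_k(x)</math>: find the critical points of <math>y \mapsto g(y) + k|x-y|</math> and take the infimum of <math>g(y) + k|x-y|</math> evaluated at the critical points. One of these values will always be the original function <math>g</math> evaluated at <math>x</math>, since this corresponds to the critical point <math>y = x</math> for <math>y \mapsto g(y) + k|x-y|</math>.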
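- Let <math>g(x) := x^2</math>. Then the critical points of <math>y \mapsto y^2 + k|x-y|</math> are <math>y = x</math> and <math>y = \pm \tfrac{k}{2}</math>, so
<math>g_k(x) = \min\left\{ x^2, \ \tfrac{k^2}{4} + k\left|x - \tfrac{k}{2}\right|, \ \tfrac{k^2}{4} + k\left|x + \tfrac{k}{2}\right| \right\}.</math>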
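The critical-point recipe above can be checked numerically. The sketch below assumes the quadratic example <math>g(y) = y^2</math> on the real line and compares a brute-force minimum over a fine grid with the minimum over the three candidate points <math>y = x</math> and <math>y = \pm k/2</math>; the function names and the grid are illustrative assumptions.

```python
import numpy as np

def g_k_numeric(x, k, ys):
    """Brute-force approximation of inf_y [y^2 + k|x - y|] over the grid ys."""
    return np.min(ys**2 + k * np.abs(x - ys))

def g_k_closed(x, k):
    """Minimum of y^2 + k|x - y| over the candidates y = x, k/2, -k/2."""
    return min(x**2,
               k**2 / 4 + k * abs(x - k / 2),
               k**2 / 4 + k * abs(x + k / 2))

ys = np.linspace(-10.0, 10.0, 200001)   # grid spacing 1e-4
k = 2.0
xs = [-3.0, -1.0, -0.5, 0.0, 0.3, 1.0, 2.5]
errors = [abs(g_k_numeric(x, k, ys) - g_k_closed(x, k)) for x in xs]
print("largest discrepancy:", max(errors))
```

For <math>|x| \leq k/2</math> the minimum is attained at <math>y = x</math>, so <math>g_k(x) = x^2</math>; for <math>|x| > k/2</math> it is attained at <math>y = \pm k/2</math>, giving <math>g_k(x) = k|x| - k^2/4</math>.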
Approximating Lower Semicontinuous Functions by Lipschitz Functions
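Proposition. [1][4] Let <math>(X,d)</math> be a Polish space and let <math>g : X \to (-\infty,+\infty]</math>.
- If <math>g</math> is proper and bounded below, so is <math>g_k</math>. Furthermore, <math>g_k</math> is continuous for all <math>k \geq 0</math>.
- If, in addition, <math>g</math> is lower semicontinuous, then <math>g_k(x) \nearrow g(x)</math> for all <math>x \in X</math>.
- In this case, <math>g_k \wedge k := \min(g_k, k)</math> is continuous and bounded and <math>g_k(x) \wedge k \nearrow g(x)</math> for all <math>x \in X</math>.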
Proof.
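- Since <math>g</math> is proper, there exists <math>y_0 \in X</math> such that <math>g(y_0) < +\infty</math>. Then for any <math>x \in X</math>,
<math>\inf\limits_{y \in X} g(y) \leq g_k(x) \leq g(y_0) + k d(x,y_0) < +\infty.</math>
Thus <math>g_k</math> is proper and bounded below. Next, for a fixed <math>y \in X</math> with <math>g(y) < +\infty</math>, let <math>h_{k,y}(x) := g(y) + k d(x,y)</math>. Then as
<math>|h_{k,y}(x) - h_{k,y}(x')| = k \, |d(x,y) - d(x',y)| \leq k \, d(x,x')</math>,
the family <math>\{h_{k,y}\}_y</math> is uniformly Lipschitz and hence equicontinuous. Thus <math>g_k = \inf_y h_{k,y}</math> is Lipschitz continuous.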
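- Suppose that <math>g</math> is also lower semicontinuous. Note that for all <math>k_1 \leq k_2</math>, <math>g_{k_1}(x) \leq g_{k_2}(x) \leq g(x)</math>. Thus it suffices to show that <math>\liminf\limits_{k \to \infty} g_k(x) \geq g(x)</math>. This inequality is automatically satisfied when the left hand side is infinite, so without loss of generality assume that <math>\liminf\limits_{k \to \infty} g_k(x) < +\infty</math>. By definition of infimum, for each <math>k \in \mathbb{N}</math> there exists <math>y_k \in X</math> such that
<math>g(y_k) + k d(x,y_k) \leq g_k(x) + \frac{1}{k}</math>.
Then
<math>+\infty > \liminf\limits_{k \to \infty} g_k(x) \geq \liminf\limits_{k \to \infty} \left[ g(y_k) + k d(x,y_k) \right].</math>
<math>g(y_k)</math> is bounded below by assumption, while the only way <math>k d(x,y_k)</math> is finite in the limit is for <math>d(x,y_k)</math> to go to zero. Thus <math>y_k</math> converges to <math>x</math> in <math>X</math>, and by lower semicontinuity of <math>g</math>,
<math>\liminf\limits_{k \to \infty} g_k(x) \geq \liminf\limits_{k \to \infty} \left[ g(y_k) + k d(x,y_k) \right] \geq \liminf\limits_{k \to \infty} g(y_k) \geq g(x)</math>.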
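- By definition, <math>g_k \wedge k \leq k</math>. Since <math>g_k(x) \nearrow g(x)</math> for all <math>x \in X</math>, <math>g_k(x) \wedge k \nearrow g(x)</math> for all <math>x \in X</math>.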
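The role of lower semicontinuity in the second item can be seen in a small worked example. This is a sketch under the assumption <math>X = \mathbb{R}</math> with the usual distance; the two step functions are illustrative choices and are not taken from the cited sources.

```latex
% Lower semicontinuous case: g = 1_{(0,\infty)}.
% For x <= 0 choose y = x; for x > 0 compare y = 0 (cost kx) with y = x (cost 1).
\begin{align*}
  g(x) = \mathbf{1}_{(0,\infty)}(x)
  \quad &\Longrightarrow \quad
  g_k(x) = \min\{1,\ k \max(x,0)\} \nearrow g(x) \ \text{ for every } x \in \mathbb{R}. \\[4pt]
% Not lower semicontinuous at 0: \tilde g = 1_{[0,\infty)}.
% Approaching 0 from the left costs k|y| -> 0 while \tilde g(y) = 0.
  \tilde g(x) = \mathbf{1}_{[0,\infty)}(x)
  \quad &\Longrightarrow \quad
  0 \le \tilde g_k(0) \le \inf_{y < 0} \big[ \tilde g(y) + k|y| \big] = 0
  \quad \text{for all } k \ge 0,
\end{align*}
% so \tilde g_k(0) = 0 for all k, while \tilde g(0) = 1: the pointwise limit misses \tilde g at 0.
```

In the second case the regularizations converge to the lower semicontinuous envelope of <math>\tilde g</math> rather than to <math>\tilde g</math> itself, which is why the hypothesis cannot be dropped.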
Portmanteau Theorem
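Theorem (Portmanteau). [1] [4] Let <math>(X,d)</math> be a Polish space, and let <math>g : X \to (-\infty,+\infty]</math> be lower semicontinuous and bounded below. Then the functional <math>\mu \mapsto \int_X g \, \mathrm{d}\mu</math> is lower semicontinuous with respect to narrow convergence in <math>\mathcal{P}(X)</math>, that is
<math> \mu_n \to \mu \text{ narrowly} \Longrightarrow \liminf\limits_{n \to \infty} \int_X g \, \mathrm{d}\mu_n \geq \int_X g \, \mathrm{d}\mu </math>.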
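Proof. By the Moreau-Yosida approximation, for all <math>k \geq 0</math>, the function <math>g_k \wedge k</math> is continuous and bounded with <math>g_k \wedge k \leq g</math>, so narrow convergence gives
<math>\liminf\limits_{n \to \infty} \int_X g \, \mathrm{d}\mu_n \geq \liminf\limits_{n \to \infty} \int_X g_k \wedge k \, \mathrm{d}\mu_n = \int_X g_k \wedge k \, \mathrm{d}\mu</math>.
Taking <math>k \to \infty</math>, Fatou's Lemma ensures that
<math>\liminf\limits_{n \to \infty} \int_X g \, \mathrm{d}\mu_n \geq \liminf\limits_{k \to \infty} \int_X g_k \wedge k \, \mathrm{d}\mu \geq \int_X \liminf\limits_{k \to \infty} g_k \wedge k \, \mathrm{d}\mu = \int_X g \, \mathrm{d}\mu</math>.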
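The inequality in the theorem can be strict. The following worked example is a sketch assuming <math>X = \mathbb{R}</math>, Dirac masses <math>\mu_n = \delta_{1/n}</math>, and the step function <math>g = \mathbf{1}_{(0,\infty)}</math>; these choices are illustrative and not taken from the cited sources.

```latex
% mu_n = delta_{1/n} converges narrowly to mu = delta_0, since
% \int \varphi \, d\mu_n = \varphi(1/n) \to \varphi(0) for every bounded continuous \varphi.
% g = 1_{(0,\infty)} is lower semicontinuous (indicator of an open set) and bounded below.
\begin{align*}
  \int_{\mathbb{R}} g \, \mathrm{d}\mu_n &= g\!\left(\tfrac{1}{n}\right) = 1
  \quad \text{for all } n \ge 1, \\
  \int_{\mathbb{R}} g \, \mathrm{d}\mu &= g(0) = 0, \\
  \liminf_{n \to \infty} \int_{\mathbb{R}} g \, \mathrm{d}\mu_n &= 1 \;>\; 0 = \int_{\mathbb{R}} g \, \mathrm{d}\mu .
\end{align*}
```

So the functional <math>\mu \mapsto \int_X g \, \mathrm{d}\mu</math> is lower semicontinuous along this sequence but not continuous.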
The Mysterious Etymology of Portmanteau
The curious epithet attached to the above theorem is due to Billingsley [5], with a citation to a Jean-Pierre Portmanteau's Espoir pour l'ensemble vide? published in Annales de l'Université de Felletin in 1915. This is believed to be a fictional citation made as a play on words [6].
- The publication date is far too early; Kolmogorov's probability axioms were published in 1933. [7]
- Felletin is a small town in central France with no university, and there is no record of a Jean-Pierre Portmanteau aside from this citation.
- "Espoir pour l'ensemble vide" translates to "hope for the empty set" (translation was by Google, please confirm or amend if you speak French!)
Generalizations
The Moreau-Yosida regularization is a specific case of a type of convolution, and many of the above results follow from this generalization. This material is taken from Bauschke-Combettes, Ch. 12 [2], where the setting is a Hilbert space instead of a more general Polish space.
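Let <math>\mathcal{H}</math> be a Hilbert space, and let <math>f , g : \mathcal{H} \to (-\infty,+\infty]</math>. The infimal convolution or epi-sum <math>f \, \square \, g : \mathcal{H} \to [-\infty,+\infty]</math> of <math>f</math> and <math>g</math> is
<math> (f \, \square \, g)(x) := \inf\limits_{y \in \mathcal{H}} \left[ f(y) + g(x-y) \right] </math>.
<math>f \, \square \, g</math> is said to be exact at a point <math>x \in \mathcal{H}</math> if this infimum is attained. <math>f \, \square \, g</math> is said to be exact if it is exact at every point of its domain, and in this case it is denoted by <math>f \, \dot{\square} \, g</math>.
Remark. Bauschke-Combettes use a box with a dot in the middle to indicate that <math>f \, \square \, g</math> is exact. Due to technical difficulties, we will use <math>f \, \dot{\square} \, g</math> instead.
For an example, let <math>A, B \subseteq \mathcal{H}</math> be nonempty. Then <math>\chi_A \, \square \, \chi_B</math> is exact, and <math>\chi_A \, \dot{\square} \, \chi_B = \chi_{A + B}</math>.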
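To make the connection with the regularization explicit, the following sketch writes both the Moreau-Yosida regularization and the indicator example in terms of the convolution above. It assumes that <math>d(x,y) = \|x-y\|</math> on <math>\mathcal{H}</math> and that <math>\chi_A</math> denotes the convex-analysis indicator, equal to <math>0</math> on <math>A</math> and <math>+\infty</math> elsewhere.

```latex
% Moreau-Yosida regularization as a convolution of g with the function k\|\cdot\|:
\begin{align*}
  \big( g \,\square\, k\|\cdot\| \big)(x)
  &= \inf_{y \in \mathcal{H}} \big[ g(y) + k \| x - y \| \big]
   = g_k(x). \\[4pt]
% Indicator example: the bracket is 0 exactly when y \in A and x - y \in B, and +\infty otherwise.
  \big( \chi_A \,\square\, \chi_B \big)(x)
  &= \inf_{y \in \mathcal{H}} \big[ \chi_A(y) + \chi_B(x - y) \big]
   = \begin{cases}
       0 & \text{if } x = a + b \text{ for some } a \in A,\ b \in B, \\
       +\infty & \text{otherwise},
     \end{cases} \\
  &= \chi_{A+B}(x).
\end{align*}
% For x \in A + B the infimum is attained at y = a, so the convolution is exact on its domain.
```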
References
- [1] Craig, Katy C. Lower Semicontinuity in the Narrow Topology. Math 260J. Univ. of Ca. at Santa Barbara. Winter 2022.
- [2] Bauschke, Heinz H. and Patrick L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd Ed. Ch. 12. Springer, 2017.
- [3] Ambrosio, Luigi, Nicola Gigli, and Giuseppe Savaré. Gradient Flows in Metric Spaces and in the Space of Probability Measures. Ch. 3.1. Birkhäuser, 2005.
- [4] Santambrogio, Filippo. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling. Ch. 1.1. Birkhäuser, 2015.
- [5] Billingsley, Patrick. Convergence of Probability Measures, 2nd Ed. John Wiley & Sons, Inc. 1999.
- [6] Pagès, Gilles. Numerical Probability: An Introduction with Applications to Finance. Ch. 4.1. Springer, 2018.
- [7] Kolmogorov, Andrey (1950) [1933]. Foundations of the theory of probability. New York, USA: Chelsea Publishing Company.