The Moreau-Yosida Regularization
(to be filled in)

==Motivation==
(to be filled in)

==Definitions==
Let <math>(X,d)</math> be a metric space. A function <math>g : X \to (-\infty,+\infty]</math> is said to be '''proper''' if it is not identically equal to <math>+\infty</math>, that is, if there exists <math>x \in X</math> such that <math>g(x) < +\infty</math>.
For a given function <math>g : X \to (-\infty,+\infty]</math> and <math>k \geq 0</math>, its '''Moreau-Yosida regularization''' <math>g_k : X \to [-\infty,+\infty]</math> is given by
:<math>g_k(x) := \inf\limits_{y \in X} \left[ g(y) + k d(x,y) \right].</math>
Note that
:<math>g_k(x) = \inf\limits_{y \in X} \left[ g(y) + k d(x,y) \right] \leq g(x) + k d(x,x) = g(x).</math>
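For intuition, the infimum defining <math>g_k</math> can be approximated numerically by restricting <math>y</math> to a finite grid. The following sketch (an added illustration, not taken from the references; the sample function and grid are arbitrary choices) computes such an approximation on <math>(\mathbb{R},|\cdot|)</math> and checks the inequality <math>g_k \leq g</math> above.

<syntaxhighlight lang="python">
import numpy as np

def moreau_yosida(g, k, x, y_grid):
    """Approximate g_k(x) = inf_y [ g(y) + k*d(x,y) ] on (R, |.|) by a minimum over a finite grid of candidate y values."""
    return np.min(g(y_grid) + k * np.abs(x - y_grid))

g = lambda y: y ** 2                       # sample function; any finite-valued g works here
y_grid = np.linspace(-10.0, 10.0, 200001)  # grid of candidate minimizers

x = 1.5
for k in [0.0, 1.0, 2.0, 3.0]:
    gk_x = moreau_yosida(g, k, x, y_grid)
    # Taking y = x in the infimum shows g_k(x) <= g(x); the grid version agrees up to discretization error.
    assert gk_x <= g(x) + 1e-9
    print(f"k = {k}: g_k({x}) ~ {gk_x:.4f}, g({x}) = {g(x)}")
</syntaxhighlight>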
==Examples==
* If <math>k = 0</math>, then by definition <math>g_0</math> is constant, and <math>g_0 \equiv \inf\limits_{y \in X} g(y)</math>.
* If <math>g</math> is ''not'' proper, then <math>g_k = +\infty</math> for all <math>k \geq 0</math>.
Take <math>(X,d) := (\mathbb{R},|\cdot|)</math>. If <math>g</math> is finite-valued and differentiable, we can write down <math>g_k</math> explicitly. For a fixed <math>x \in \mathbb{R}</math>, the map <math>g_{k,x} : y \mapsto g(y) + k|x - y|</math> is continuous everywhere and, for <math>k > 0</math>, differentiable everywhere except at <math>y = x</math>, where the derivative does not exist because of the absolute value. Thus we can apply standard optimization techniques from calculus to compute <math>g_k(x)</math>: find the critical points of <math>g_{k,x}</math> and take the smallest value of <math>g_{k,x}</math> over them. One of these values is always the original function <math>g</math> evaluated at <math>x</math>, since <math>y = x</math> is itself a critical point of <math>g_{k,x}</math>. A numerical check of this procedure appears after the example below.
* Let <math>g(x) := x^2</math>. Then
:<math>g_k(x) = \min \left\{ x^2 ,\ \frac{k^2}{4} + k \left| x - \frac{k}{2} \right| ,\ \frac{k^2}{4} + k \left| x + \frac{k}{2} \right| \right\},</math>
where the last two values come from evaluating <math>g_{k,x}</math> at the critical points <math>y = \pm\tfrac{k}{2}</math>.
[[File:Ex 1.png|300px|thumb|Plot of <math>g(x) = x^2</math> and <math>g_k(x)</math> for <math>k = 0, 1, 2, 3</math>.]]
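As a sanity check on the closed form above, one can compare the critical-point formula against a brute-force minimization over a grid. The snippet below is a rough numerical verification only (the grid resolution, test points, and tolerance are arbitrary choices; agreement holds up to discretization error).

<syntaxhighlight lang="python">
import numpy as np

def moreau_yosida_bruteforce(g, k, x, y_grid):
    # Direct approximation of g_k(x) = inf_y [ g(y) + k*|x - y| ] over a fine grid.
    return np.min(g(y_grid) + k * np.abs(x - y_grid))

def gk_closed_form(k, x):
    # Values of y -> y^2 + k*|x - y| at the candidate points y = x and y = +/- k/2, as in the text.
    return min(x ** 2,
               k ** 2 / 4 + k * abs(x - k / 2),
               k ** 2 / 4 + k * abs(x + k / 2))

g = lambda y: y ** 2
y_grid = np.linspace(-20.0, 20.0, 400001)   # step 1e-4

for k in [0.0, 0.5, 1.0, 2.0, 3.0]:
    for x in [-2.5, -0.3, 0.0, 0.7, 4.0]:
        brute = moreau_yosida_bruteforce(g, k, x, y_grid)
        exact = gk_closed_form(k, x)
        # The grid minimum matches the closed form up to the grid resolution.
        assert abs(brute - exact) < 1e-3, (k, x, brute, exact)
</syntaxhighlight>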
==Results==
'''Proposition.''' <ref name="OT"/><ref name="S"/>
* If <math>g</math> is proper and bounded below, so is <math>g_k</math>. Furthermore, <math>g_k</math> is continuous for all <math>k \geq 0</math>.
* If, in addition, <math>g</math> is lower semicontinuous, then <math>g_k(x) \nearrow g(x)</math> as <math>k \to \infty</math> for all <math>x \in X</math>.
* In this case, <math>g_k \wedge k := \min(g_k,k)</math> is continuous and bounded, and <math>g_k(x) \wedge k \nearrow g(x)</math> for all <math>x \in X</math>.
'''Proof.'''
* Since <math>g</math> is proper, there exists <math>y_0 \in X</math> such that <math>g(y_0) < +\infty</math>. Then for any <math>x \in X</math>,
:<math> -\infty < \inf\limits_{y \in X} g(y) \leq g_k(x) \leq g(y_0) + k d(x,y_0) < +\infty.</math>
Thus <math>g_k</math> is proper and bounded below. Next, for a fixed <math>y \in X</math>, let <math>h_{k,y}(x) := g(y) + k d(x,y)</math>. Since
:<math> h_{k,y}(x_1) - h_{k,y}(x_2) = k d(x_1,y) - k d(x_2,y) \leq k d(x_1,x_2), </math>
the family <math> \{ h_{k,y} \}_{y \in X} </math> is uniformly <math>k</math>-Lipschitz. Thus <math>g_k = \inf\limits_{y \in X} h_{k,y}</math> is <math>k</math>-Lipschitz, and in particular continuous.
* Suppose that <math>g</math> is also lower semicontinuous. Note that for all <math>k_1 \leq k_2</math>, <math>g_{k_1}(x) \leq g_{k_2}(x) \leq g(x)</math>. Thus it suffices to show that <math>\liminf\limits_{k \to \infty} g_k(x) \geq g(x)</math>. This inequality is automatically satisfied when the left-hand side is infinite, so without loss of generality assume that <math>\liminf\limits_{k \to \infty} g_k(x) < +\infty</math>. By definition of the infimum, for each <math>k \in \mathbb{N}</math> there exists <math>y_k \in X</math> such that
:<math>g(y_k) + k d(x,y_k) \leq g_k(x) + \frac{1}{k}.</math>
Then
:<math>+\infty > \liminf\limits_{k \to \infty} g_k(x) \geq \liminf\limits_{k \to \infty} \left[ g(y_k) + k d(x,y_k) \right].</math>
Since <math>g(y_k)</math> is bounded below by assumption, the only way the right-hand side can be finite is for <math>d(x,y_k)</math> to go to zero along a subsequence realizing the <math>\liminf</math>. Thus <math>y_k</math> converges to <math>x</math> in <math>X</math> along that subsequence, and by lower semicontinuity of <math>g</math>, together with <math>k d(x,y_k) \geq 0</math>,
:<math> \liminf\limits_{k \to \infty} g_k(x) \geq \liminf\limits_{k \to \infty} \left[ g(y_k) + k d(x,y_k) \right] \geq g(x). </math>
* By the first part, <math>g_k \wedge k</math> is continuous, and it is bounded below by <math>\inf_X g</math> and above by <math>k</math>, so <math>g_k \wedge k \in C_b(X)</math>. Since <math>g_k(x) \nearrow g(x)</math> for all <math>x \in X</math> and the truncation level <math>k</math> increases to <math>+\infty</math>, it follows that <math>g_k(x) \wedge k \nearrow g(x)</math> for all <math>x \in X</math>.
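The convergence in the second statement can be observed numerically. The sketch below (an added illustration; the particular jump function <math>g</math> and the grid are chosen only for demonstration) approximates <math>g_k</math> for a bounded, lower semicontinuous <math>g</math> with a discontinuity and checks that <math>g_k(x)</math> increases toward <math>g(x)</math> as <math>k</math> grows.

<syntaxhighlight lang="python">
import numpy as np

# A bounded, lower semicontinuous function with a jump:
# g(y) = 0 for y <= 0 and g(y) = 1 for y > 0.
def g(y):
    return np.where(y > 0, 1.0, 0.0)

def moreau_yosida(k, x, y_grid):
    # Approximate g_k(x) = inf_y [ g(y) + k*|x - y| ] by a minimum over a fine grid.
    return np.min(g(y_grid) + k * np.abs(x - y_grid))

y_grid = np.linspace(-5.0, 5.0, 200001)
x = 0.5   # here g(x) = 1, but the jump at 0 drags g_k(x) down for small k

prev = -np.inf
for k in [0, 1, 2, 4, 8, 16]:
    gk = moreau_yosida(k, x, y_grid)
    assert gk >= prev - 1e-12        # g_k(x) is nondecreasing in k
    assert gk <= float(g(x)) + 1e-9  # g_k(x) <= g(x)
    prev = gk
    print(k, float(gk))              # increases toward g(x) = 1 as k grows
</syntaxhighlight>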
==References==
Possible list of references, will fix accordingly
Bauschke-Combettes Ch. 12 [3]; Santambrogio (6) [2]; Ambrosio-Gigli-Savaré (59-61) [4]
# Craig, Katy C. ''Lower Semicontinuity in the Narrow Topology''. Math 260J. University of California, Santa Barbara. Winter 2022.
# Santambrogio, Filippo. ''Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling''. Ch. 1.1. Birkhäuser, 2015.
# Bauschke, Heinz H. and Patrick L. Combettes. ''Convex Analysis and Monotone Operator Theory in Hilbert Spaces'', 2nd Ed. Ch. 12. Springer, 2017.
# Ambrosio, Luigi, Nicola Gigli, and Giuseppe Savaré. ''Gradient Flows in Metric Spaces and in the Space of Probability Measures''. Ch. 3.1. Birkhäuser, 2005.