Entropic Regularization

From Optimal Transport Wiki
Revision as of 07:04, 9 May 2020 by Jose (talk | contribs)
Jump to navigation Jump to search

Entropic Regularization

Due the convex structure of the Kantorovich problem, it is natural to turn to methods that, upon discretizing, exploit the fact that we aim to solve a linear programming problem. Upon replacing our measures with and for some appropriately determined and fixed The resulting minimization problem becomes:



which we may approach using classical methods that exploit the structure of the resulting linear system.

The Entropic Regularization

In studying a slight variant to this problem, for fixed we may now instead focus on tackling


which converges as Rewriting the objective function at hand by using

where and \textbf{KL} denotes the divergence (based on the relative entropy).

Thus, from here, this minimization problem reads as a projection from this divergence, of the point on the set of constraints


In particular, this Kullback-Leibler divergence has the property that the projection onto the intersection of two linear subspaces such as can be achieved by alternate projections. This is done by defining and with and looking at the limit as we take .

As such, the problem now becomes of projecting onto one of the linear subspaces, which can be done. If we are given a , we wish to solve

which we can solve an approximated problem of with the following algorithm.

Algorithm

  • Start from
  • define with
  • then define with

The limit we observe as provides us with the solution to the regularized problem. This idea is called the iterative proportional fitting procedure (). In utilizing this entropic regularization, this algorithm lends itself to the following advantages:

  • we may compute the projections explicitly at every step the moment we need only project to or
  • it is quite memory-efficient since the cumulating product of the coefficients appear at each step, thus allowing us to only store as opposed to
  • it can be easily parallelized given the separability of the structure at any given step.

References

<references> [1] </references.