2 layer neural networks as Wasserstein gradient flows

<ref name="Figalli" />


Artificial neural networks (ANNs) consist of layers of artificial "neurons" which take in information from the previous layer and output information to the next layer. Gradient descent is a common method for updating the weights of each neuron based on training data. While in practice every layer of a neural network has only finitely many neurons, it is useful to adopt a continuous viewpoint in which a layer has infinitely many neurons, since this leads to a theory that explains how ANNs work. In particular, from this viewpoint the process of updating the neuron weights of a shallow neural network can be described by a Wasserstein gradient flow.
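
As a rough sketch of this viewpoint (the notation <math>\Phi</math>, <math>F</math>, <math>\sigma</math> and <math>\mu</math> below is illustrative and not necessarily that of the cited reference), a shallow network with infinitely many hidden neurons can be described by a probability measure <math>\mu</math> over the single-neuron parameters, and training then corresponds to the gradient flow of the loss with respect to the Wasserstein metric:

:<math> f_\mu(x) = \int \Phi(\theta, x)\, d\mu(\theta), \qquad \partial_t \mu_t = \nabla_\theta \cdot \left( \mu_t\, \nabla_\theta \frac{\delta F}{\delta \mu}[\mu_t] \right), </math>

where <math>\Phi(\theta, x)</math> is the output of a single neuron with parameters <math>\theta</math> (for example <math>\Phi(\theta, x) = a\, \sigma(w \cdot x)</math> with <math>\theta = (a, w)</math> and activation function <math>\sigma</math>), <math>F(\mu)</math> is the training loss of <math>f_\mu</math>, and <math>\tfrac{\delta F}{\delta \mu}</math> is its first variation. Taking <math>\mu</math> to be an average of <math>n</math> Dirac masses recovers a finite network, and the continuity equation above then reduces, up to a rescaling of time, to continuous-time gradient descent on the neuron parameters.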


==Motivation==


==Shallow Neural Networks==


===Continuous Formulation===




===Minimization Problem===


==Wasserstein Gradient Flow==

==Main Results==

==References==