
Learning in Restricted Boltzmann Machines: The Critical Role of Sampling Processes

19 Sept. 2023, 10:00
1h
Institut Pascal

Bâtiment 530, Rue André Rivière 91400 Orsay

Speaker

Aurélien DECELLE (Universidad Complutense de Madrid)

Description

Generative models aim to learn the empirical distribution of a given data set in order to build a probabilistic model capable of generating new samples that are statistically similar to the data. One may also hope to obtain from the model an approximate, analytically tractable description of this distribution.

In this presentation, I will specifically consider the Restricted Boltzmann Machine (RBM), a bipartite generative neural network whose learning process depends crucially on sampling: the gradient is computed using a Monte Carlo estimate of the correlations between the variables of the model. For multimodal datasets, accurate sampling during learning becomes increasingly challenging. The difficulty arises when distinct modes begin to appear in the model during training, causing the mixing times of the Markov chains to grow rapidly to impractical values.
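To make the role of sampling concrete, here is a minimal sketch of how the weight gradient of a binary RBM can be estimated with block Gibbs sampling. It is an illustration under stated assumptions, not the speaker's implementation: it assumes {0, 1} units, the energy E(v, h) = -v·W·h - b_v·v - b_h·h, and persistent Markov chains. All names (`gibbs_step`, `weight_gradient`, `n_steps`) and the toy sizes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gibbs_step(v, W, b_v, b_h, rng):
    """One block-Gibbs sweep: sample h given v, then v given h."""
    p_h = sigmoid(v @ W + b_h)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_v)
    return (rng.random(p_v.shape) < p_v).astype(float)


def weight_gradient(data, chains, W, b_v, b_h, rng, n_steps=10):
    """Estimate the log-likelihood gradient <v h>_data - <v h>_model for W."""
    # Data term: the hidden units are averaged out exactly given the data.
    pos = data.T @ sigmoid(data @ W + b_h) / data.shape[0]
    # Model term: Monte Carlo estimate from the (assumed persistent) chains.
    # If the chains have not mixed after n_steps, this estimate is biased.
    for _ in range(n_steps):
        chains = gibbs_step(chains, W, b_v, b_h, rng)
    neg = chains.T @ sigmoid(chains @ W + b_h) / chains.shape[0]
    return pos - neg, chains


# Toy usage with random binary "data" (purely illustrative, not a real dataset).
rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v = np.zeros(n_visible)
b_h = np.zeros(n_hidden)
data = rng.integers(0, 2, size=(20, n_visible)).astype(float)
chains = rng.integers(0, 2, size=(50, n_visible)).astype(float)

grad_W, chains = weight_gradient(data, chains, W, b_v, b_h, rng)
W += 0.05 * grad_W  # one gradient-ascent step with an arbitrary learning rate
```

The quality of the model term depends entirely on how well the chains have equilibrated within `n_steps` sweeps, which is why slow mixing on multimodal data directly degrades the learning signal.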

First, I will show how a biased Monte Carlo method can drastically reduce the mixing time for structured datasets living in a low-dimensional space. Second, I will explore a mean-field method in which we expand the system in a low-dimensional subspace where the free energy landscape can be approximated efficiently and a convex learning algorithm can be designed, at the cost of neglecting fluctuations that lie outside the subspace.

I will conclude with an open discussion of the limitations of these techniques.

References, in order of importance with respect to discussion time:
1a - https://scipost.org/SciPostPhys.14.3.032
1b - https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.127.158303 (arXiv:2103.10755v2)
2 - https://journals.aps.org/prl/pdf/10.1103/PhysRevLett.108.165701?casa_token=0YgEmURTLA0AAAAA%3A5sVciafIrVvzUZsjL-B8P7kFEHBFQFHQ4nyaJ2cQOShzN1tXE2mo5jlUedKizf0tbTNmZ2x4wb2PJg (arXiv:1103.2599v3)
3 - https://proceedings.neurips.cc/paper_files/paper/2021/hash/2aedcba61ca55ceb62d785c6b7f10a83-Abstract.html

Presentation materials