
Learning to Discover

Institut Pascal


Overall program  

The agenda is final. Slides and talk recordings are publicly available from the "Timetable" item in the menu bar.

The event is run in hybrid mode; however, connection details are only made available to registered participants.

  • Tue 19 Apr - Wed 20 Apr: Workshop on Representation Learning from Heterogeneous/Graph-Structured Data
  • Thu 21 Apr - Fri 22 Apr: Workshop on Dealing with Uncertainties
  • Mon 25 Apr - Tue 26 Apr: Workshop on Generative Models
  • Wed 27 Apr - Fri 29 Apr: Learning to Discover AI and Physics conference


Learning To Discover is a program on Artificial Intelligence and High Energy Physics (HEP) taking place at Institut Pascal Paris-Saclay, in its beautiful new building, from 19 Apr to 29 Apr 2022. Over the two weeks, three themes will be tackled in succession during innovation-oriented sessions of two days each, followed by a three-day general conference on AI and physics.


Although many participants will be on-site, Institut Pascal is fully equipped to make remote participation as seamless as possible (for registered participants only).

Although vaccination is no longer mandatory, it is still strongly recommended. Mask wearing is strongly recommended in the building. CO2 levels are monitored throughout the building.


Since 2014, the field of AI and HEP has grown exponentially. Physicists quickly realised the potential of AI to deal with the large amounts of complex data they collect and analyse. Many AI techniques have been put forward, with scientific collaboration built on open datasets, challenges, workshops and papers.

Learning To Discover is a first-of-its-kind program where participants will have access to deep technical insights into advanced machine learning techniques and their application to particle physics.
Three themes have been selected based on, on the one hand, their interest for HEP and the number of HEP teams already working on them, and, on the other hand, their importance in the Machine Learning field.

Representation Learning over Heterogeneous/Graph Data  workshop

HEP data are rarely image-like or tabular. Despite promising applications of Graph Neural Networks (GNNs) to several HEP problems (tracking, classification, calorimeter reconstruction, particle flow, ...), several bottlenecks remain before models can be put into production, in terms of both accuracy and inference speed. Multiple variants (Graph Attention Networks, Transformers, ...) are available, with improved performance and resource requirements. The following topics are of interest: ways to extract information (at graph, node, or edge level), mechanisms for propagating information throughout the graph, generation of the graph structure, evolution of the graph topology within the model, etc.
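To make the "propagating information throughout the graph" topic concrete, here is a minimal sketch of one message-passing step on a toy graph, in plain numpy. The graph, features, and weights are all invented for illustration; no specific HEP pipeline is implied:

```python
import numpy as np

# Toy graph: 4 nodes with 2-d features, directed edges (src -> dst).
x = np.random.rand(4, 2)                            # node features
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])  # (src, dst) pairs

W_msg = np.random.rand(2, 2)   # message weights
W_upd = np.random.rand(4, 2)   # update weights (4-d concat -> 2-d state)

# 1) Compute a message along each edge from the source node.
msgs = x[edges[:, 0]] @ W_msg                       # shape (n_edges, 2)

# 2) Aggregate incoming messages at each destination node (sum).
agg = np.zeros_like(x)
np.add.at(agg, edges[:, 1], msgs)

# 3) Update each node's state from its old state and aggregated messages.
h = np.tanh(np.concatenate([x, agg], axis=1) @ W_upd)
print(h.shape)  # (4, 2)
```

Stacking several such steps lets information travel multiple hops; GNN variants differ mainly in how the message and aggregation functions are parameterised.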

Dealing with Uncertainties workshop 

Physicists ultimately write papers with measurements that always include an assessment of uncertainties. How to evaluate the uncertainty on a trained model? How to build confidence in an ML model, and how to convince our peers about it? How to deal with the uncertainties on the inputs of the models to maximise the overall accuracy? Bayesian networks, adversarial architectures and other techniques are being explored.
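As a toy illustration of one widely used baseline for the first question, the sketch below estimates predictive uncertainty with a bootstrap ensemble: the same model is fit on several resamples of the data, and the spread of the predictions is read as an uncertainty band. The linear model and data are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise.
x = rng.uniform(-1, 1, size=(100, 1))
y = 2 * x + 0.1 * rng.normal(size=(100, 1))

# "Ensemble": fit the same linear model on bootstrap resamples.
n_models = 10
x_test = np.linspace(-1, 1, 50).reshape(-1, 1)
preds = []
for _ in range(n_models):
    idx = rng.integers(0, len(x), len(x))
    xb, yb = x[idx], y[idx]
    # Least-squares slope and intercept on the resample.
    A = np.hstack([xb, np.ones_like(xb)])
    coef, *_ = np.linalg.lstsq(A, yb, rcond=None)
    preds.append(np.hstack([x_test, np.ones_like(x_test)]) @ coef)

preds = np.stack(preds)      # (n_models, 50, 1)
mean = preds.mean(axis=0)    # ensemble prediction
std = preds.std(axis=0)      # spread = uncertainty estimate
print(mean.shape, float(std.mean()))
```

The same recipe applies to deep networks, where independent random initialisations usually replace the bootstrap resampling.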

Generative Models workshop

High Energy Physics is well known for having built accurate simulators able to simulate real experiments, from the Quantum Field Theory equations describing the interactions of fundamental particles up to the specific response of electronic circuits. These simulations are accurate, but not perfect, and very slow. Proofs of concept using generative models (GAN, VAE, ...) are being developed, but many issues must be solved before they make it to production. How to reach sufficient accuracy over a large parameter space? How to make sure multi-dimensional correlations are well described? How to deal with the irregularity of the detector (think of an image with pixels of different sizes and shapes)?

Physicists with concrete Machine Learning experience (postdocs, advanced PhD students, ...) and Computer Scientists experienced in developing cutting-edge ML models (expected participation from Google DeepMind, NVIDIA, IDIAP, ...) are invited to apply for attendance. Application is, however, open to everyone with experience and interest. The selection process will be as inclusive as possible, applied only to keep the workshop fruitful for everyone within the constraints of the premises. The final general conference on AI and HEP is open to physicists and Computer Scientists until the maximum attendance is reached.

Learning to Discover : AI and High Energy Physics conference

The final conference will cover cutting edge topics in the field with keynote talks, summaries of the preceding workshops, as well as talks from the call for contributions.  


Learning To Discover is hosted at the new Université Paris-Saclay Institut Pascal, an ideal venue for such interactions to take place. Located in Orsay (close to Paris) in the heart of Université Paris-Saclay, it has offices for up to 60 people, meeting rooms, amphitheaters (up to 120 seats) and all amenities for a successful workshop. In the usual Institut Pascal format, lunches and the social programme (one dinner per workshop and for the conference) are free. Accommodation in the area can be booked at Institut Pascal's expense for workshop participants; travel is covered only if needed. Registration to the workshops is free (but moderated), as is registration to the final conference. See the "Support and Accommodation" item in the menu bar for more practical details. Participation is limited to 40 on site for each workshop, and 120 on site for the conference.


A special issue of Computing and Software for Big Science (a refereed journal) will be prepared with the most relevant contributions, with a submission deadline 4 months after the workshop, leaving ample time to finalise the studies showcased there. A number of workshops on AI and Physics have been organised in recent years, notably “Deep Learning for Physical Sciences” at NeurIPS 2017 (https://dl4physicalsciences.github.io/), “Machine Learning and the Physical Sciences” at NeurIPS 2019, 2020 and 2021 (https://ml4physicalsciences.github.io) and “AI and Physics” at AMLD 2020 and 2021 in Lausanne (https://appliedmldays.org/tracks/ai-physics). Studies developed at “Learning To Discover” will have a high chance of being accepted at future such events.

    • 8:45 AM
      Welcome at Institut Pascal
    • Representation Learning workshop: Tues Morning
      Convener: David Rousseau (IJCLab, Orsay, France)
      • 1
        Introduction from the organisers
        Speaker: David Rousseau (IJCLab, Orsay, France)
      • 2
        Welcome from Institut Pascal director
        Speaker: Denis ULLMO (Institut Pascal)
      • 3
        Introduction and review: machine learning for charged particle tracking
        Speaker: Jan Stark (Laboratoire des 2 Infinis - Toulouse, CNRS / Univ. Paul Sabatier (FR))
      • 10:45 AM
        Coffee break
      • 4
        Geometric Deep Learning, Graph Neural Networks, and Neural Diffusion Equations

        Symmetry as an organising principle has played a pivotal role in Klein's Erlangen Programme, unifying various types of geometry, and in modern physics, unifying different types of interactions. In machine learning, symmetry underlies Geometric Deep Learning, a group-theoretical framework for the principled design of geometric inductive biases, exploiting symmetries arising from the structure of the respective domains and of the data on these domains.

        In this talk, I will first showcase the Geometric Deep Learning Blueprint and how it can be used to derive some of the most popular deep learning architectures. Focusing on Graph Neural Networks (GNNs), I will make connections to non-Euclidean diffusion equations and show that, drawing on methods from differential geometry, it is possible to provide a principled view of such GNN architectural choices as positional encoding and graph rewiring, as well as to explain and remedy the phenomena of over-squashing and bottlenecks.

        Speaker: Michael Bronstein (Imperial College, London)
    • 5
      Welcome buffet
    • Representation Learning workshop: Tues afternoon
      Convener: Jean-Roch Vlimant (California Institute of Technology (US))
      • 6
        Overview of Machine Learning for Calorimeter and Particle Flow

        The reconstruction of particle signals relies on local reconstruction, which involves clustering of granular hits within detector subsystems, followed by global reconstruction, combining signals across detector subsystems into a high-level particle representation of the event. Calorimeter clustering is a local reconstruction method that aims to segment calorimeter hits according to their particle origin. Recently, in light of future high-granularity detector configurations, considerable progress has been made in disentangling overlapping showers in highly granular detectors using machine learning. Once clusters and tracks are reconstructed, particle-flow algorithms combine the information globally across the detector for an optimized particle-level reconstruction. Machine learning approaches have recently been demonstrated to offer performance comparable to heuristic particle-flow algorithms, while potentially allowing for native deployment on heterogeneous platforms. I will give a summary of the progress towards ML-based calorimeter reconstruction and particle flow.

        Speaker: Joosep Pata (NICPB, Tallinn)
      • 7
        Learning general purpose physical simulators

        Simulations are central to modeling complex physical systems in many disciplines across science and engineering. However, high-dimensional scientific simulations can be very expensive to run, and require specialized solvers. In this talk, we will review some of our recent work on a general purpose framework for learning grid-based, particle-based, and mesh-based simulations using convolutional neural networks and graph neural networks. We will show how learned simulators built with the same design principles can accurately predict the dynamics of a wide range of physical systems including fluids/turbulence, granular materials, aerodynamics, structural mechanics, and cloth, often leading to speed ups of 1-2 orders of magnitude compared to the simulation on which they are trained. Furthermore, we will show how the models are able to generalize to larger and more complex systems than those seen during training, and even can be used for inverse design. Our work broadens the range of problems on which neural network simulators can operate and promises to improve the efficiency of complex, scientific modeling tasks.

        Speaker: Alvaro Sanchez (Deepmind)
      • 4:00 PM
        Coffee break
      • 8
        Break-out sessions
    • 9
      Representation Learning Social Dinner at Brass & co
    • Representation Learning workshop: Wed morning
      Convener: Peter Battaglia (DeepMind)
      • 10
        Enabling Empirically and Theoretically Sound Algorithmic Alignment

        Neural networks that are able to reliably execute algorithmic computation may hold transformative potential for both machine learning and theoretical computer science. On the one hand, they could enable the kind of extrapolative generalisation scarcely seen with deep learning models. On the other, they may allow running classical algorithms on inputs previously considered inaccessible to them.

        Both of these promises are shepherded by the neural algorithmic reasoning blueprint, which I have recently proposed in a position paper alongside Charles Blundell. On paper, this is a remarkably elegant pipeline for reasoning on natural inputs which carefully leverages the tried-and-tested power of deep neural networks as feature extractors. However, in practice, its success rests on successful realisation of algorithmic alignment -- neural networks that are capable of executing algorithmic computations in a high-dimensional latent space, ideally in a manner that extrapolates outside of the training set.

        In this talk, I will outline some of the recent work we have done on strengthening algorithmic alignment from both an empirical and a theoretical point of view. These efforts include a dataset of algorithmic reasoning tasks, to be used as a bootstrapping basis, as well as formalising the theoretical connection between (graph) neural networks and dynamic programming using the tools of category theory and abstract algebra.

        Speaker: Petar Velickovic (Deepmind / University of Cambridge)
      • 10:30 AM
        Coffee break
      • 11
        Breakout sessions
    • 12
      Lunch at cafeteria

      Lunch at university cafeteria "Restaurant CESFO Plateau" https://goo.gl/maps/HcQN4yHtwHauVFfo8 across the street.
      Open 11:30-13:45; rush hour 12:15-13:00 to be avoided. Free tickets available in a bowl next to Sabrina's office in the cathedral.

    • Representation Learning workshop: Wed afternoon
      Convener: Andreas Salzburger (CERN)
      • 13
        Rediscovering orbital mechanics with Machine Learning

        We present an approach for using machine learning to automatically discover the governing equations and hidden properties of real physical systems from observations. We train a "graph neural network" to simulate the dynamics of our solar system's Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to discover an analytical expression for the force law implicitly learned by the neural network, which our results show is equivalent to Newton's law of gravitation. The only key assumptions required were translational and rotational equivariance, and Newton's second and third laws of motion. Our approach correctly discovered the form of the symbolic force law. Furthermore, it did not require any assumptions about the masses of planets and moons or physical constants; they, too, were accurately inferred. Though, of course, the classical law of gravitation has been known since Isaac Newton, our result serves as a validation that our method can discover unknown laws and hidden properties from observed data. More broadly, this work represents a key step toward realizing the potential of machine learning for accelerating scientific discovery.

        Finally, I will introduce “Learning the Universe”, a project aiming to use Machine Learning, and simulations of the Universe, to further our knowledge of dark matter, dark energy, and fundamental physics.

        Speaker: Pablo Lemos (University of Sussex)
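The core idea of the symbolic-regression step, recovering an analytical law from observed forces, can be illustrated with a much simpler toy: sample noisy forces from an inverse-square law and recover the exponent with a log-log linear fit. The constants and noise level below are invented; this is not the authors' actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "observations": force magnitudes F = G*m1*m2 / r^p with p = 2,
# at random separations, with small multiplicative noise.
G, m1, m2, p_true = 6.674e-11, 5.0e24, 7.0e22, 2.0
r = rng.uniform(1e8, 1e9, size=1000)
F = G * m1 * m2 / r**p_true * np.exp(0.01 * rng.normal(size=r.size))

# In log space the law is linear: log F = log(G*m1*m2) - p*log r,
# so a least-squares line recovers the exponent.
slope, intercept = np.polyfit(np.log(r), np.log(F), 1)
print(round(-slope, 2))  # ~2.0: the inverse-square exponent
```

Real symbolic regression searches over a space of analytic expressions rather than fitting a fixed form, but the validation logic is the same: the recovered expression must match the known law.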
      • 3:30 PM
        Coffee break
      • 14
        Break-out session
      • 15
        Conclusions from breakout sessions
    • Dealing with Uncertainties workshop: Thursday morning
      Convener: Yann COADOU (CPPM, Aix-Marseille Université, CNRS/IN2P3)
    • 19
      Lunch at the cafeteria

      Lunch at university cafeteria "Restaurant CESFO Plateau" https://goo.gl/maps/HcQN4yHtwHauVFfo8 across the street.
      Open 11:30-13:45; rush hour 12:15-13:00 to be avoided. Free tickets available in a bowl next to Sabrina's office in the cathedral.

    • Dealing with Uncertainties workshop: Thursday afternoon
      • 20
        Breakout session : MAPIE on HEP data
      • 3:30 PM
        Coffee break
      • 21
        Breakout sessions : public dataset for evaluation of systematics uncertainties strategies
    • 22
      Dealing with uncertainties Social Dinner at the Gramophone, Orsay

      Very close to Orsay-Ville station.

    • Dealing with Uncertainties workshop: Friday morning
      Convener: Anja Butter (ITP Heidelberg)
      • 23
        Uncertainties in Deep Learning

        Deep learning algorithms, based on deep neural networks, have been deployed for numerous applications over the past few years, especially in image processing and natural language processing. Their relevance is now being studied for scientific applications, for instance as new methods to solve inverse problems or as surrogate models to accelerate computations in complex simulations. In this framework, it is necessary to be able to quantify the uncertainty on the outputs of these methods. However, in their conventional form, deep neural networks are deterministic algorithms and do not provide uncertainty estimates on their predictions. Recent work addresses this problem, and the scientific literature provides techniques to model and estimate these uncertainties.

        In this talk, we present a general overview of the state of the art on this topic. We first introduce the notions and definitions of uncertainties and their origins in the case of machine learning predictions. Then, we present the main techniques in the literature for quantifying these uncertainties for deep neural networks, and their possible limitations. Finally, we provide some elements for establishing a methodology to validate and calibrate the uncertainties given by these methods.

        Speaker: Geoffrey Daniel (CEA)
      • 10:30 AM
        Coffee break
      • 24
        Breakout sessions
    • 25
      Lunch at cafeteria
    • Dealing with Uncertainties workshop: Friday afternoon
      • 26
        Simulation-based inference: a cautionary tale

        Simulation-based inference (SBI) enables approximate Bayesian inference when high-fidelity computer simulators are used to describe phenomena of interest. SBI algorithms have quickly developed and matured over the past years and are already in use across many domains of science such as particle physics, astronomy, cosmology, neuroscience or epidemiology.
        Inference, however, remains only approximate, and any downstream analysis depends on the trustworthiness of the approximate posterior. In this talk, we will review the Bayesian inference methodology in the likelihood-free setting and discuss good practices to follow, including the choice of the prior, the choice of the SBI algorithm, and diagnostics that we can use to validate inference results.

        Speaker: Gilles Louppe (University of Liège)
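A useful mental model for SBI, and its simplest instance, is ABC rejection sampling: draw parameters from the prior, run the simulator, and keep only the parameters whose simulated summary statistic lands close to the observed one. A toy sketch with an invented Gaussian simulator:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulator(mu, n=50):
    """Toy stochastic simulator: n Gaussian draws around mu."""
    return rng.normal(mu, 1.0, size=n)

# "Observed" data generated at the (unknown to us) value mu = 1.5.
x_obs = simulator(1.5)
s_obs = x_obs.mean()  # summary statistic

# ABC rejection: sample mu from the prior, keep it if the simulated
# summary is within epsilon of the observed one.
prior_draws = rng.uniform(-5, 5, size=20000)
eps = 0.1
accepted = [mu for mu in prior_draws
            if abs(simulator(mu).mean() - s_obs) < eps]

posterior = np.array(accepted)
print(len(posterior), posterior.mean())  # posterior concentrates near 1.5
```

Neural SBI methods replace the hard accept/reject step with learned density estimators, which is exactly why the validation diagnostics discussed in the talk become essential.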
      • 3:30 PM
        Coffee break
      • 27
        Break-out sessions conclusion
    • Generative Models workshop: Monday morning
      • 28
        Generator Workshop introduction
      • 29
        Overview of generator model for detector simulation

        Detailed simulations of physics processes are a cornerstone of most physics measurements and searches at experiments such as those at the Large Hadron Collider (LHC), as well as in other scientific domains. However, the large volumes of data collected at the LHC call for even larger simulated samples in order to test various hypotheses to high precision. As the LHC moves to the era of high luminosity, the large-scale production of simulated particle physics collisions becomes even more important.
        One of the bottlenecks in the generation of these data is the time required to simulate the response of the detector to particles created in collisions. This is in particular the case for calorimeter systems, which capture the energy of showers of particles.
        The current state-of-the-art simulations are performed with the Geant4 toolkit, which provides a detailed simulation of the interaction of each individual particle with the detector material. This is a very time- and CPU-intensive process that requires a detailed description of the detector. For example, simulations of particle physics collision events in the ATLAS detector can take several minutes per event.
        Fast simulation methods have been used by experiments for many years to reduce the required CPU time, with several non-machine-learning approaches actively used by the experiments at the LHC. Although these fast simulation methods often perform well, they still fall short of the detailed simulation, and efforts to improve them are ongoing. With the advent of deep generative modelling, interest in bringing modern techniques to this challenge has grown, with the aim of improving upon current fast simulation approaches. Deep generative modelling has shown great success in other domains, and the hope is to provide a simulation almost as accurate and varied as Geant4, but without the required CPU time at inference.
        In this talk an outline of the problem and its challenges will be presented, followed by an overview of several recent approaches and an outlook on future developments.

        Speaker: Johnny Raine (Université de Genève)
      • 30
        Generator model for 4-momenta events

        Event generation and Hadronization make up a significant fraction of our high energy physics simulation chain. Therefore there exists a strong interest in using generative models to supplement these simulation tasks. In both applications the data is commonly expressed as a list of 4-momenta. This presents an additional challenge to any generative model, as 4-momenta have inherent complex correlations that need to be correctly reproduced.

        Several generative models for simulating 4-momenta are presented and discussed, with special focus on how the problem of complex correlations is addressed.

        Speaker: Sascha Diefenbacher (Universität Hamburg)
      • 11:00 AM
        Coffee Break
      • 31
        Break out session in the small amphi
    • 32
      Welcome buffet
    • Generative Models workshop: Monday afternoon
      Convener: Cécile Germain (LISN, Université Paris-Saclay)
      • 33
        GNNs for generating molecules and PDE solutions

        Graph Neural Networks (GNNs) have proven to be a versatile tool to predict properties of molecules, generate molecules, and even predict solutions of partial differential equations (PDEs). Many physical application domains also exhibit symmetries, which can be incorporated into GNNs through equivariant convolutions or data augmentation. In this talk I will explain how this tool can be leveraged to generate molecules from their equilibrium distribution, possibly conditioned on some properties, and even to predict solutions of a PDE.

        Speaker: Max Welling (U. Amsterdam / MSR)
      • 3:30 PM
        Coffee break
      • 34
        Breakout sessions
    • 35
      Happy hour Cathedral (IPA)



    • Generative Models workshop: Tuesday morning
      • 36
        The calorimeter challenge and ODD detector

        There have been many recent developments in the application of machine-learning techniques to the fast simulation of cascades in calorimeters. This is usually the most time-consuming part of event simulation in high energy physics experiments. Most current efforts are focused on and fine-tuned to specific detectors, which makes them difficult to compare. We present a first fast calorimeter simulation challenge, with a dataset at three difficulty levels. The purpose of this challenge is to spur the development and benchmarking of fast and high-fidelity calorimeter shower generation using deep learning methods. It will be possible to directly compare new deep learning approaches on common benchmarks. Participants are expected to make use of cutting-edge techniques in generative modeling with deep learning, e.g. GANs, VAEs and normalizing flows.
        As a follow-up to this challenge, we would like to implement a benchmark calorimeter within the Open Data Detector (ODD). The tracking system is already implemented in the ODD and is an evolution of the successful detector from the Tracking Machine Learning Challenges. The calorimeter system will not only allow benchmarking the fast simulation of particle cascades, but also create unique opportunities for the development and comparison of different reconstruction techniques and particle-flow methods.

        Speaker: Anna Zaborowska (CERN)
      • 10:30 AM
        Coffee break
      • 37
        Breakout sessions
    • 38
      Lunch at the cafeteria

      Lunch at university cafeteria "Restaurant CESFO Plateau" https://goo.gl/maps/HcQN4yHtwHauVFfo8 across the street.
      Open 11:30-13:45; rush hour 12:15-13:00 to be avoided. Free tickets available in a bowl next to Sabrina's office in the cathedral.

    • Generative Models workshop: Tuesday afternoon
      • 39
        Meta-learning for fast simulation of multiple calorimeter responses

        In LHC experiments, the calorimeter is a key detector technology to measure the energy of particles. These particles interact electromagnetically and/or hadronically with the material of the calorimeter, creating cascades of secondary particles, or showers. Describing the showering process relies on simulation methods that precisely describe all particle interactions with matter. Constrained by the need for precision, the simulation is inherently slow and constitutes a bottleneck for physics analysis. Furthermore, with the upcoming high-luminosity upgrade of the LHC, with more complex events and a much increased trigger rate, the amount of required simulated events will increase. Several research directions have investigated the use of Machine Learning based models to accelerate the simulation of a particular calorimeter's response. This results in a specifically tuned simulation, and these models generally require a large amount of data for training. Meta-learning has recently emerged as a fast learning approach using small training datasets. In this work, we use a meta-learning model that “learns to learn” to generate showers using a first-order gradient-based algorithm. This model is trained on multiple calorimeter geometries and can rapidly adapt to a new geometry using few training samples.

        Speaker: Dalila Salamani (CERN)
      • 40
        Breakout sessions
      • 3:30 PM
        Coffee break
    • AI and physics conference: Day 1 Morning
      Conveners: Jean-Roch Vlimant (California Institute of Technology (US)), Mr David Rousseau (IJCLab, Orsay, France)
      • 41
        Conference Introduction from organisers
      • 42
        Purely Data-Driven Approaches to Weather Prediction: Promise and Perils

        The use of machine learning (ML) models for weather prediction has emerged as a popular area of research. The promise of these models, whether in conjunction with more traditional Numerical Weather Prediction (NWP) or on their own, is that they allow more accurate predictions of the weather at significantly reduced computational cost. In this talk, I will discuss both the promise and perils of using a purely data-driven (and more specifically deep learning) approach. I will focus on a recent project with the Met Office on precipitation nowcasting as a case study. While we found that we could create a deep learning model that was significantly preferred by Met Office meteorologists and performed well on objective measures of performance, we also discovered many ways in which deep learning systems can perform well on such measures without improving decision-making value. I will discuss some reasons why this failure mode occurs, while also advocating for better verification of purely data-driven models. Although the discussion focuses on very-short-term prediction, we believe that many of these lessons are also applicable to longer-term forecasts.

        Speaker: Suman Ravuri (Deepmind)
      • 43
        How to explain the success in scikit-learn and how to make it last

        The scikit-learn software has been cited more than 50,000 times in 10 years. It is the software most used by machine learning experts on Kaggle, and an estimated 2 million data scientists use it every month. Yet this was made possible with limited resources, mostly by researchers and engineers in academia. In this talk I will first list the reasons that can explain this immense success.
        Then I will review some recent efforts by the team to keep scikit-learn a leading software in the field.

        Speaker: Alexandre Gramfort (INRIA)
      • 10:30 AM
        Coffee break
      • 44
        Origins of AI in High Energy Physics from Orsay to Chicago – and back!

        The world’s first paper on AI in High Energy Physics was written right here in Orsay, 35 years ago. In this presentation, its author outlines the scientific contributions of the article and their impact; gives a historical account of how it came to be written; and presents an overview of what it was like working on AI in HEP in the early days of the field.

        Speaker: Bruce Denby (ESPCI)
      • 45
        Summary of Dealing with Uncertainties workshop
        Speaker: Michael Aaron Kagan (SLAC)
    • 46
      Lunch : Buffet in Institut Pascal hall
    • AI and physics conference: Day 1 Afternoon
      Conveners: Cécile Germain (LISN Université Paris-Saclay), Andreas Salzburger (CERN)
      • 47
        Data-centric AI

        Data-Centric Artificial Intelligence (DCAI) was introduced by Andrew Ng in 2021. The call was launched in response to the fact that research contributions almost always focus on models while treating the data as fixed and well engineered, whereas in practice data scientists spend most of their time on data engineering tasks.
        In this presentation, we will start by giving a broad overview of the challenges covered by DCAI, such as the creation, transformation and evaluation of data. In a second part, we will focus on the creation aspect and present the opportunity of coupling Active Learning with physics simulations to automate the creation of datasets.

        Speaker: Romain Egele (Université Paris Saclay / Argonne National Laboratory)
      • 48
        Application of Self-organizing maps in high energy physics

        Self-Organizing Maps (SOMs) are widely used neural networks for unsupervised learning and dimensionality reduction. They have not yet been widely applied in high energy physics. We discuss two applications of SOMs in particle physics. First, the separation of physics processes in regions of the dimensionally reduced representation. Second, obtaining Monte Carlo scale factors by fitting templates to the dimensionally reduced representation. We use the ATLAS Open Data dataset as an example.

        Speaker: Kai Habermann (University of Bonn)
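For readers unfamiliar with SOMs, the core training loop, find the best-matching unit for each sample and pull it and its grid neighbours towards the sample, fits in a few lines of numpy. The data, grid size, and schedule below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data: two 2-d clusters standing in for two "physics processes".
data = np.vstack([rng.normal(-2, 0.5, size=(200, 2)),
                  rng.normal(+2, 0.5, size=(200, 2))])

# 5x5 map of 2-d weight vectors, plus each cell's grid coordinates.
grid = 5
weights = rng.normal(size=(grid, grid, 2))
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                              indexing="ij"), axis=-1)

lr, sigma = 0.5, 1.5
for epoch in range(20):
    for x in rng.permutation(data):
        # Best-matching unit (BMU): cell whose weight is closest to x.
        d = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(d.argmin(), d.shape)
        # Gaussian neighbourhood around the BMU on the grid.
        g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                   / (2 * sigma ** 2))
        # Pull the BMU and its neighbours towards the sample.
        weights += lr * g[..., None] * (x - weights)
    lr *= 0.9
    sigma *= 0.9

# Nearby map cells now respond to similar inputs: a 2-d "map" of the data.
print(weights.reshape(-1, 2).round(1))
```

After training, each event is assigned to its best-matching cell, which gives the dimensionally reduced representation in which processes can be separated or templates fitted.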
      • 49
        Super-resolution for calorimetry

        Super-resolution algorithms are commonly used to enhance the granularity of an imaging system beyond what can be achieved with the measuring device.
        We show a first application of deep-learning-based super-resolution algorithms to calorimeter reconstruction, using a simplified geometry consisting of overlapping showers originating from charged and neutral pion events.
        The task of the presented ML algorithms is to estimate the fraction of charged and neutral energy components in each cell of the super-calorimeter, the reconstructed calorimeter system whose granularity is up-scaled by up to a factor of 4 compared to the original one. We show how the finer granularity can be used to unveil effects that would otherwise remain elusive, such as the reconstructed mass of the pi0, which is closely connected to an unbiased estimation of the opening angle between the two photons. The performance is evaluated using several ML algorithms, including graph and convolutional neural networks.

        Speaker: Francesco Di Bello (INFN and U. of Rome)
      • 50
        Inducing selective posterior-collapse in variational autoencoders by modelling the latent space as a mixture of normal-gamma distributions

        Variational autoencoders are deep generative networks widely used for a broad range of tasks, such as image or text generation. They are composed of two sub-models. On the one hand, the encoder infers the parameters of the approximate posterior distribution $\mathcal{N}(z;\mu_\phi(x),\sigma_\phi(x))$ of a low-dimensional latent vector $z$ that represents the generative factors of the input data $x$. On the other hand, the decoder models the likelihood of the input data, $\mathcal{N}(x;\mu_\theta(z),\sigma_\theta(z))$. The parameters of the model, $\phi, \theta$, are jointly optimized by maximizing a lower bound on the evidence called the ELBO. A major challenge is to obtain a disentangled and interpretable latent space, with the aim of improving representation learning. However, the vanilla variational autoencoder suffers from many issues, such as an entangled latent space and posterior collapse. These problems are all the more pronounced when the dimension of the latent space is poorly chosen. The goal of our work is to propose a model able to infer a disentangled latent space by taking advantage of a selective posterior-collapse process. Indeed, the variances inferred by the encoder for each latent variable take very different values depending on the information the variable carries: variables that contain a lot of information about the data distribution tend to have a low inferred variance, contrary to the others. To leverage this property, we propose a variational autoencoder that is encouraged to concentrate the information in a reduced number of latent variables and to leave the others unused. In this way, the dimension of the latent space is automatically adjusted to the complexity of the data.
        To do so, the latent variables of the autoencoder are augmented with their inverse variances, which are also assumed unknown. Their joint posterior distribution is defined as a mixture of normal-gamma probability density functions,
        $p_i\,\mathrm{NormalGamma}(z_i,\lambda_i;\mu_i,\alpha_1,\beta_1)+(1-p_i)\,\mathrm{NormalGamma}(z_i,\lambda_i;\mu_i,\alpha_2,\beta_2)$, where for the $i^\text{th}$ latent variable $z_i$, $\lambda_i$ stands for its inverse variance, and $\mu_i$ is directly inferred by the encoder, as is $p_i$. The other hyperparameters are chosen so that the inverse variances take high values when the encoded variable carries information and are close to 1 otherwise. In this way, we add prior information that fits our assumptions and force the model to encode information in a subset of the latent space, performing a “selective posterior-collapse”. To optimize the parameters $\phi, \theta$, the objective function has to be modified to account for the mixture distribution: $\mathrm{ELBO}_{NG} = \mathbb{E}_{q_\phi(z, \lambda | x)}[\log p_\theta(x|z, \lambda)] - \mathrm{KL}[q_\phi(z,\lambda|x) \,\|\, p(z)p(\lambda)]$, where $\lambda$ is a vector gathering all the inverse variances. A reparametrization trick is also proposed for the stochastic vectors $\lambda$ and $z$, so that stochastic gradient descent can be used for the optimization. Our model, the Normal-Gamma VAE (NG-VAE), was tested on datasets with known factors of generation. We set the latent space dimension well above the number of these factors and validated the selective posterior-collapse process and the disentanglement of the latent variables.
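        For concreteness, the two-component normal-gamma mixture prior can be evaluated with a short stdlib-only sketch; the parametrization below (rate-parametrized gamma, unit precision scaling on $z$) is an assumption for illustration, not taken from the paper:

```python
import math

def normal_gamma_pdf(z, lam, mu, alpha, beta):
    """Joint density of (z, lam) under NormalGamma(mu, alpha, beta),
    assuming lam ~ Gamma(alpha, rate=beta) and z | lam ~ Normal(mu, 1/lam)."""
    if lam <= 0:
        return 0.0
    gamma_part = beta**alpha / math.gamma(alpha) * lam**(alpha - 1) * math.exp(-beta * lam)
    normal_part = math.sqrt(lam / (2 * math.pi)) * math.exp(-lam * (z - mu)**2 / 2)
    return gamma_part * normal_part

def mixture_prior_pdf(z, lam, p, mu, a1, b1, a2, b2):
    """Two-component normal-gamma mixture over a latent variable z_i and
    its inverse variance lambda_i, with mixing weight p inferred per variable."""
    return (p * normal_gamma_pdf(z, lam, mu, a1, b1)
            + (1 - p) * normal_gamma_pdf(z, lam, mu, a2, b2))
```

        Choosing the two $(\alpha, \beta)$ pairs so that one component favors large $\lambda_i$ (informative variables) and the other keeps $\lambda_i$ near 1 is what drives the selective posterior collapse.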

        Speaker: Emma JOUFFROY (CEA & IMS)
      • 51
        Deep Multi-task Mining Calabi-Yau Manifolds

        Computing topological properties of Calabi-Yau manifolds is a challenging mathematical task. Recent years have witnessed the rising use of deep learning as a method for exploration of large sets of data, to learn their patterns and properties. This is specifically interesting when it comes to unravel complicated geometrical structures, as well as in the development of trustworthy AI methods. Motivated by their distinguished role in string theory for the study of compactifications, we compute the Hodge numbers of Complete Intersection Calabi-Yau manifolds using deep neural networks. Specifically, we introduce a regression architecture, based on GoogleNet and multi-task learning, capable of mining information to produce highly accurate simultaneous predictions. This shows the potential of deep learning to learn from geometrical data, and it proves the versatility of architectures developed in different contexts.

        Speaker: Riccardo Finotello (CEA LIST, CEA ISAS)
      • 3:30 PM
        Coffee break
      • 52
        Scientific inference with imperfect theories: examples with machine learning and neurosciences

        Science has progressed by reasoning on what models could not predict because they were missing important ingredients. And yet, without correct models, standard statistical methods for scientific evidence are not sound. I will argue that machine-learning methodology provides ways to ground reasoning about empirical evidence more on models’ predictions and less on their ingredients. I will draw examples from the history of physics and ongoing work in neuroscience, highlighting patterns in the back and forth between data and theory.

        Speaker: Gael Varoquaux (INRIA)
      • 53
        Study of model construction and the learning for hierarchical models

        To efficiently solve a large problem with deep learning, it is sometimes useful to decompose it into smaller blocks, which lets us introduce our knowledge into the model through an appropriate loss function for each block.
        A simple model decomposition, however, causes a performance decrease due to information bottlenecks induced by the loss definition.
        We proposed a method to mitigate such bottlenecks by using hidden features, instead of the outputs on which the loss function is defined, and experimentally demonstrated its usefulness on a particle physics dataset.
        We also demonstrated the adaptive tuning of the loss coefficients of each task, based on techniques from multi-task learning.

        Speaker: Masahiko Saito (International Center for Elementary Particle Physics, University of Tokyo)
      • 54
        Uncertainty Aware Learning for High Energy Physics With A Cautionary Tale

        Machine learning tools provide a significant improvement in sensitivity over traditional analyses by exploiting subtle patterns in high-dimensional feature spaces. These subtle patterns may not be well-modeled by the simulations used for training machine learning methods, resulting in an enhanced sensitivity to systematic uncertainties. Contrary to the traditional wisdom of constructing an analysis strategy that is invariant to systematic uncertainties, we study the use of a classifier that is fully aware of uncertainties and their corresponding nuisance parameters. We show on two datasets that this dependence can actually enhance the sensitivity to parameters of interest compared to baseline approaches. Finally, we provide a cautionary example for situations where uncertainty mitigating techniques may serve only to hide the true uncertainties.

        Speaker: Aishik Ghosh (UC Irvine)
      • 55
        Graph Neural Networks for track reconstruction at HL-LHC

        The physics reach of the HL-LHC will be limited by how efficiently the experiments can use the available computing resources, i.e. affordable software and computing are essential. The development of novel methods for charged particle reconstruction at the HL-LHC incorporating machine learning techniques or based entirely on machine learning is a vibrant area of research. In the past years, algorithms for track pattern recognition based on graph neural networks (GNNs) have emerged as a particularly promising approach. We present new algorithms that can handle complex realistic detectors and achieve tracking efficiency and purity similar to production tracking algorithms based on Kalman filters. Crucially for HL-LHC and future collider applications, the pipeline benefits significantly from GPU acceleration, and its computational requirements scale close to linearly with the number of particles in the event.

        Speaker: Alexis VALLIER (Laboratoire des 2 Infinis - Toulouse, CNRS / Univ. Paul Sabatier (FR))
      • 56
        Particle Tracking with Graph Neural Networks

        Each proton-proton collision event at the LHC produces a myriad of particles and interactions that are recorded by specialized detectors. Trackers are designed to sample the trajectories of these particles at multiple space-points; tracking is the connecting-the-dots process of linking these signals (hits) to reconstruct particle trajectories (tracks). Tracker data is naturally represented as a graph by assigning hits to nodes and edges to hypothesized particle trajectories. Several studies show that edge-classifying graph neural networks (GNNs) can be used to reject unphysical edges from the graph, yielding a set of disjoint subgraphs corresponding to individual tracks. In this work, we present an extension of this approach using object condensation, a set of truth definitions and loss functions designed to cluster hits belonging to the same particle and, subsequently, predict the properties of each cluster. Specifically, we apply a message-passing GNN to perform edge-classification, leverage edge-classification results to cluster tracker hits, and predict the kinematic features of the tracks formed by each hit cluster. Key results will be shown at each stage of this pipeline, including graph construction, edge classification performance, clustering performance, noise rejection, and track property prediction.
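        The connecting-the-dots step described above (keep edges the classifier scores above a threshold, then read tracks off as connected components) can be sketched independently of any GNN; the threshold and toy data below are illustrative:

```python
def tracks_from_edge_scores(n_hits, edges, scores, threshold=0.5):
    """Given hypothesized hit-pair edges and per-edge classifier scores,
    keep edges above threshold and return the connected components of the
    surviving graph as track-candidate labels (union-find)."""
    parent = list(range(n_hits))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (i, j), s in zip(edges, scores):
        if s >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj            # union the two components

    labels = [find(i) for i in range(n_hits)]
    remap = {r: k for k, r in enumerate(dict.fromkeys(labels))}
    return [remap[r] for r in labels]      # components relabeled 0..k-1
```

        Object condensation replaces this hard thresholding with learned per-hit clustering coordinates, but the output has the same shape: one label per hit, one cluster per track candidate.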

        Speaker: Gage DeZoort (Princeton University)
      • 57
        Jet Energy Corrections with GNN Regression

        Accurate energy measurements of jets are crucial to many analyses in particle physics. To improve on the current jet energy corrections, machine-learning-based methods are investigated. Following recent developments in jet flavor classification, jets are treated as unordered sets of their constituent particles, referred to as particle clouds. In addition, particular care is taken to ensure similarly distributed data for the different jet flavors in the training sample, enabling generically applicable corrections. The set-based Particle Flow Network and the graph-based ParticleNet are then used to perform regression, mapping the reconstructed $p_T$ to its particle-level counterpart. The two models are compared with each other; relative to the standard corrections, both yield significant improvements in energy resolution and flavor dependence.

        Speaker: Daniel Holmberg (CERN)
    • AI and physics conference: Day 2 morning
      Conveners: François Lanusse (CEA Saclay), Eilam Gross (Weizmann)
      • 58
        Experiment optimisation with differentiable programming

        In 2012, the ImageNet challenge and the discovery of the Higgs boson produced a paradigm shift in particle physics analysis. Today a new revolution awaits. The possibility of continuously mapping the space of design solutions of even the hardest optimization problems, using differentiable programming, promises entirely new, more performant, or cheaper solutions for particle detection. I will review the status of this research development and provide a few examples that demonstrate its potential.

        Speaker: Tommaso Dorigo (INFN Padova)
      • 59
        Astronomical source separation with variational autoencoders

        Upcoming surveys such as the Legacy Survey of Space and Time (LSST) and Euclid will observe the sky with unprecedented depth and area of coverage. As these surveys detect fainter objects, the increased object density will lead to more overlapping sources; in LSST, for example, we expect around 60% of objects to be blended. Mapping the matter content of our Universe with weak gravitational lensing is one of the main probes with which the upcoming large cosmological surveys will constrain Dark Energy parameters, and for these analyses the blending effect is expected to be one of the major systematics. Classical methods for solving the inverse problem of source separation, so-called “deblending”, either fail to capture the diverse morphologies of galaxies or are too slow to analyze billions of galaxies. To overcome these challenges, we propose a deep-learning-based approach to deal with the size and complexity of the data.

        In the context of the Dark Energy Science Collaboration (DESC), we have developed a Python package called DebVader, which uses a modified version of Variational Autoencoders (VAEs) for deblending. First, isolated galaxies are used to train a VAE to learn their latent space representations which are then used as a prior for deblending galaxies from a blended scene.

        We have tested the performance of our algorithm using realistic simulations generated by the blending task force of the LSST-DESC collaboration. With flux recovery as a metric, we observe errors comparable to the state of the art while remaining fast and scalable. I will demonstrate the performance of DebVader for deblending galaxy fields for Rubin, and how we expect to move from simulations to real data, which should arrive within the next few years.

        Speaker: Biswajit Biswas (Laboratoire Astroparticule et Cosmologie (APC))
      • 60
        Machine Learning for Real-Time Processing of ATLAS Liquid Argon Calorimeter Signals with FPGAs

        The Phase-II upgrade of the LHC will increase its instantaneous luminosity by a factor of 7, leading to the High Luminosity LHC (HL-LHC). At the HL-LHC, the number of proton-proton collisions in one bunch crossing (called pileup) increases significantly, putting more stringent requirements on the LHC detectors' electronics and real-time data processing capabilities.

        The ATLAS Liquid Argon (LAr) calorimeter measures the energy of particles produced in LHC collisions. The calorimeter also has trigger capabilities to identify interesting events. To enhance the physics discovery potential of the ATLAS detector in the blurred environment created by pileup, excellent resolution of the deposited energy and accurate detection of the deposition time are crucial.

        The computation of the deposited energy is performed in real time using dedicated data-acquisition electronic boards based on FPGAs, chosen for their capacity to process large amounts of data with very low latency. The deposited energy is currently computed using optimal filtering algorithms that assume a nominal pulse shape of the electronic signal. These filter algorithms are suited to the ideal situation of very limited pileup and no timing overlap of the electronic pulses in the detector. With the increased luminosity and pileup, however, their performance decreases significantly, and no further extension or tuning of these algorithms can recover the lost performance.

        The back-end electronic boards for the Phase-II upgrade of the LAr calorimeter will use the next high-end generation of INTEL FPGAs with increased processing power and memory. This is a unique opportunity to develop the tools needed to run more complex algorithms on these boards. We developed several neural networks (NNs) with significant performance improvements with respect to the optimal filtering algorithms. The main challenge is to implement these NNs efficiently in the dedicated data-acquisition electronics. Special effort was dedicated to minimising the required computational power while optimising the NN architectures.

        Five NN algorithms based on CNN, RNN, and LSTM architectures will be presented. The improvement in energy resolution and deposition-time accuracy compared to the legacy filter algorithms, especially for overlapping pulses, will be discussed. The implementation of these networks in firmware will be shown, considering two implementation categories: VHDL and Quartus HLS code. The implementation results on Stratix 10 INTEL FPGAs, including resource usage, latency, and operating frequency, will be reported. Approximations used in the firmware implementations, including fixed-point arithmetic and lookup tables for activation functions, will be discussed, as will implementations using time multiplexing to reduce resource usage. We will show that two of these NN implementations are viable solutions that fit the stringent data-processing requirements on latency (O(100 ns)) and bandwidth (O(1 Tb/s) per FPGA) needed for ATLAS detector operation.

        Speaker: Nairit SUR (CPPM - CNRS/IN2P3)
      • 61
        Looking for rare di-Higgs events at the LHC with Machine (Deep) Learning techniques

        Artificial intelligence (AI) algorithms applied to HEP analyses come to the rescue in scenarios where an efficient discriminant separating a very low-rate signal from a huge background is extremely important. In this context, we investigate the use of several Machine Learning (ML) and Deep Learning (DL) methods, via the TensorFlow open-source platform, to boost the sensitivity to double Higgs boson (HH) events produced via the vector-boson fusion (VBF) mechanism in the 4-charged-lepton + 2-b-jet final state, simulated at generator level.
        This process has not yet been investigated at the Large Hadron Collider (LHC), mainly because of the small value of its cross-section weighted by the branching ratios (BRs): with the Higgs mass set to its best-fit value of 125.09 GeV and at a center-of-mass energy of $\sqrt{s}$ = 13 TeV, the cross-section is ∼ 1.723 fb, and the corresponding BRs are 2.79 $\times$ 10$^{-4}$ for H → ZZ$^{*}$ → 4l, with l = e, μ, τ, and 5.75 $\times$ 10$^{−1}$ for H → $b\overline{b}$. An exclusive event selection is therefore required to reject the background efficiently. This work uses the rare VBF HH process to show the advantages of the highly parallelizable implementation of AI algorithms and to present their discriminant performance in terms of several ML evaluation metrics. We thereby propose a wider application of these multivariate analysis tools to current LHC analyses and, more generally, to datasets potentially enriched with high-purity contributions from rare physics processes.

        Speaker: Brunella D'Anzi (Universita e INFN, Bari (IT))
      • 10:30 AM
        Coffee break
      • 62
        An Imperfect Machine to search for New Physics: dealing with uncertainties in a machine-learning based signal extraction

        New Physics Learning Machine (NPLM) is a novel machine-learning based strategy to detect data departures from the Standard Model predictions, with no prior bias on the nature of the new physics responsible for the discrepancy. The main idea behind the method is to build the log-likelihood-ratio hypothesis test by translating the problem of maximizing the log-likelihood-ratio into the minimization of a loss function [1, 2].
        NPLM has recently been extended to deal with uncertainties on the Standard Model predictions. The new formulation builds directly on the maximum-likelihood-ratio treatment of uncertainties as nuisance parameters that is routinely employed in high-energy physics for hypothesis testing [3].
        In this talk, after outlining the conceptual foundations of the algorithm, I will describe the procedure to account for systematic uncertainties and show how to implement it in a multivariate setup, studying the impact of two typical sources of experimental uncertainty in two-body final states at the LHC.
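        The core idea of trading likelihood-ratio maximization for loss minimization can be sketched as follows. This is a simplified rendering of the NPLM-style extended-likelihood loss over a weighted reference sample and the observed data, not the authors' code, and in practice the function $f$ is a neural network trained to minimize this loss:

```python
import numpy as np

def nplm_loss(f_data, f_ref, w_ref):
    """Loss whose minimizer over f reproduces the extended log-likelihood
    ratio between data and the (weighted) reference sample:
        L[f] = sum_ref w * (exp(f) - 1) - sum_data f
    f_data, f_ref: values of the candidate function f on each sample;
    w_ref: per-event weights normalizing the reference to the SM expectation."""
    return np.sum(w_ref * (np.exp(f_ref) - 1.0)) - np.sum(f_data)

# The test statistic is then t = -2 * min_f L[f]; a large t flags a
# departure of the data from the reference-model prediction.
```

        Nuisance parameters enter by letting the reference expectation depend on them and profiling, as described in the talk.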

        Speaker: Gaia Grosso (CERN)
      • 63
        Neural Empirical Bayes: Source Distribution Estimation and its Applications to Simulation-Based Inference

        We study the problem of retrieving a truth distribution from noisy observed data, often referred to as unfolding in High Energy Physics, which facilitates comparisons with theoretical predictions, and thus aids the process of scientific discovery. Our simulation-based inference technique, which we call Neural Empirical Bayes (NEB), combines Empirical Bayes, also known as maximum marginal likelihood, with deep generative modeling. This approach allows us to unfold continuous multidimensional distributions, in contrast to traditional approaches that treat unfolding as a discrete linear inverse problem. We show that domain knowledge can be easily introduced into our method, which is highly beneficial for efficiently finding solutions to these ill-posed inverse problems. We demonstrate the applicability of NEB in the absence of a tractable likelihood function, which is typical of scientific fields relying on computer simulations.

        Speaker: Maxime Vandegar (SLAC National Accelerator Laboratory)
      • 64
        Renormalized Mutual Information for Artificial Scientific Discovery

        We derive a well-defined renormalized version of mutual information that allows us to estimate the dependence between continuous random variables in the important case where one is deterministically dependent on the other. This is the situation relevant for feature extraction, where the goal is to produce a low-dimensional effective description of a high-dimensional system. Our approach enables the discovery of collective variables in physical systems, adding to the toolbox of artificial scientific discovery, while also aiding the analysis of information flow in artificial neural networks.

        Speaker: Leopoldo Sarra (Max Planck Institute Science of Light)
      • 65
        Summary of Representation workshop
        Speaker: Savannah Thais
    • 66
      Lunch : buffet in Institut Pascal hall
    • AI and physics conference: Day 2 afternoon
      Convener: Savannah Thais (Princeton)
      • 67
        The five walls of Artificial Intelligence

        Artificial intelligence is advancing at a very rapid pace in both research and applications, and is raising societal questions that are far from being answered. But as it moves forward rapidly, it runs into what I call the five walls of AI. Any one of these five walls is capable of halting its progress, which is why it is essential to know what they are and to seek answers in order to avoid the so-called third winter of AI, a winter that would follow the first two, in the 1970s and 1990s, during which AI research and development came to a virtual standstill for lack of budget and community interest. The five walls are those of trust, energy, safety, human interaction, and inhumanity. They have a number of ramifications and obviously interact. I will present them in sequence and discuss some avenues for avoiding a fatal outcome for AI.

        Speaker: Bertrand Braunschweig (BiLaB)
      • 68
        AI in science: the need to increase research productivity

        In recent years a number of scholars – mainly economists – have argued that the productivity of science may be stagnating, or even in decline. The claim is not that science is failing to advance, but rather that outputs require ever more inputs (to the extent that scientific output occurs in any discrete way). If true, the consequences of any slowdown in the productivity of science could be major. Among other things, governments, already under acute budgetary pressure, might have to spend ever more to achieve existing rates of growth of useful science. For investments in science equivalent to today’s, ever fewer increments of new knowledge will be available with which, over time, to spur currently stagnating economic productivity growth, an increase in which will become more critical as OECD populations age. In addition, timeframes might lengthen for achieving the scientific progress needed to address urgent challenges, from new contagions to novel crop diseases and sources of environmental stress. In this connection, the question arises as to whether AI could help accelerate progress in science, for which reason the OECD has launched a project on this topic. Mr Nolan, an economist specialised in science and technology policy, will present the relevant evidence, consider how AI is being used across science, and outline the possible implications for policy.
        A forthcoming OECD book draws on a week-long workshop, AI and the Productivity of Science (29 October – 5 November 2021). A video of the workshop is viewable by day, session, and individual presentation: https://www.oecd.org/sti/inno/ai-productivity-of-science.htm.

        Speaker: Alistair Nolan (OECD)
      • 69
        ALICE - non parametric and parametric models, dealing with uncertainty

        ALICE, one of the four big experiments at the CERN LHC, is a detector dedicated to heavy-ion physics. A high interaction rate environment causes pile-up which necessitates the use of advanced methods of data analysis.

        In recent years, machine learning (ML) has come to dominate multi-dimensional data analysis. However, ML models are harder to interpret, and their uncertainties harder to evaluate, than those of classical approaches.

        In this presentation, I will show how ML is used in ALICE for reconstruction, calibration, and MC simulations. In more detail, we will demonstrate how we combine ML with a parametric model in order to obtain a compact representation of physics processes. Our main use case is the calibration of space-charge distortions, which requires estimates of reducible and irreducible uncertainties. We will demonstrate how this and other use cases (PID, V0 reconstruction, MC/data remapping) are solved with our approach, and will describe the features of the software we developed.

        Speaker: Marian Ivanov (GSI Darmstadt and CERN)
      • 70
        Efficiency parametrization with Neural Networks

        An overarching issue of LHC experiments is the necessity to produce massive numbers of simulated collision events in very restricted regions of phase space. A commonly used approach to tackle the problem is the use of event weighting techniques where the selection cuts are replaced by event weights constructed from efficiency parametrizations. These techniques are however limited by the underlying dependencies of these parametrizations which are typically not fully known and thus only partially exploited.
        We propose a neural network approach that learns multidimensional ratios of local densities to estimate the efficiency in an optimal fashion. Graph neural network techniques are used to account for the high-dimensional correlations between different physics objects in the event. We show in a specific toy model how this method can produce accurate efficiency maps for heavy-flavor tagging classifiers in HEP experiments, including for processes on which it was not trained.
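        The density-ratio idea can be illustrated with the standard classifier trick (a generic 1-D sketch, not the paper's GNN): train a probabilistic classifier between the two samples, then read the ratio off the calibrated score as $p/(1-p)$.

```python
import numpy as np

def fit_logistic(x, y, lr=0.1, steps=2000):
    """Tiny logistic regression (bias + linear term) by gradient descent."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)   # gradient of the cross-entropy
    return w

def density_ratio(x, w):
    """Likelihood-ratio trick: for a calibrated classifier trained with
    label 1 = sample A and label 0 = sample B (equal sample sizes),
    p/(1-p) estimates the ratio of the two densities at x."""
    p = 1.0 / (1.0 + np.exp(-(w[0] + w[1] * x)))
    return p / (1.0 - p)
```

        Replacing the linear model with a graph neural network over all physics objects in the event gives the high-dimensional, correlation-aware version used in the paper.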

        The work is based on: https://arxiv.org/abs/2004.02665

        Speaker: Nilotpal Kakati (Weizmann Institute of Science)
    • 71
      AI and physics conference social dinner at Musée d'Orsay

      For attendees who have confirmed registration only.
      No later than 6:30 PM at entrance B (river side) of Musée d'Orsay (reserved for groups). Please have your badge, or be with someone who has a badge and can vouch for you.
      Then visit on your own, with an audio guide.

      We reconvene no later than 8:30PM at the restaurant on the second floor.

      From Le Guichet RER B station, allow 50 minutes: take RER B to Paris, change at Saint-Michel, then RER C to Musée d'Orsay (towards Versailles Rive Gauche; other directions also work, but not Versailles-Chantiers!). The Citymapper app gives the clearest directions; Google Maps also works. The RER will still be running after dinner.

    • AI and physics conference: Day 3 Morning
      Conveners: Jean-Roch Vlimant (California Institute of Technology (US)), Pietro Vischia (UC Louvain)
      • 72
        Highly accurate protein structure prediction with AlphaFold

        Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence — the structure prediction component of the ‘protein folding problem’ — has been an important open research problem for more than 50 years. AlphaFold, a novel machine learning approach developed at DeepMind, demonstrated accuracy competitive with experimental structures in a majority of cases and greatly outperformed other methods, and has since been recognized as "Method of The Year 2021" by Nature Methods. In this talk I will outline the problem, describe the AlphaFold method and our engineering approach, discuss applications in biology, and sketch possible future directions and connections.

        Speaker: Tim Green (Deepmind)
      • 73
        Sampling high-dimensional posterior by Neural Score Matching

        We present a novel methodology to address many ill-posed inverse problems, by providing a description of the posterior distribution, which enables us to get point estimate solutions and to quantify their associated uncertainties. Our approach combines Neural Score Matching for learning a prior distribution from physical simulations, and a novel posterior sampling method based on an annealed Hamiltonian Monte Carlo algorithm to sample the full high-dimensional posterior of our problem.

        In the astrophysical problem we address, measuring the lensing effect on a large number of galaxies makes it possible to reconstruct maps of the Dark Matter distribution on the sky. But because of missing data and noise-dominated measurements, the recovery of dark matter maps constitutes a challenging ill-posed inverse problem.

        We propose to reformulate the problem in a Bayesian framework, where the target becomes the posterior distribution of mass given the observed galaxy shapes. The likelihood factor, which describes how light rays are bent by gravity and how measurements are affected by noise, and which accounts for missing observational data, is fully described by a physical model. The prior factor is learned over cosmological simulations using a recent class of deep generative models based on Neural Score Matching, and takes theoretical knowledge into account. We are thus able to obtain samples from the full Bayesian posterior of the problem and can provide Dark Matter map reconstructions alongside uncertainty quantification.

        We present an application of this methodology to the reconstruction of the HST/ACS COSMOS field, yielding the highest-quality convergence map of this field to date. We find the proposed approach superior to previous algorithms: it is scalable, provides uncertainties, uses a fully non-Gaussian prior, and is promising for future weak-lensing surveys (Euclid, LSST, RST). We also apply this framework to Magnetic Resonance Imaging (MRI) reconstruction, illustrating the diversity of inverse problems we can solve.

        - https://arxiv.org/abs/2011.08271
        - https://arxiv.org/abs/2011.08698

        Previous talks:
        ML-IAP Conference: https://b-remy.github.io/talks/ML-IAP2021
        IN2P3 Machine Learning Workshop: https://b-remy.github.io/talks/in2p3_2021

        Speaker: Benjamin Remy (CEA Saclay)
      • 74
        Symbolic expression generation via VAE

        There are many problems in physics, biology, and other natural sciences in which symbolic regression can provide valuable insights and discover new laws of nature. Widespread deep neural networks do not provide interpretable solutions, whereas symbolic expressions give us a clear relation between observations and the target variable. However, at the moment there is no dominant solution for the symbolic regression task, and we aim to narrow this gap with our algorithm. In this work, we propose a novel deep learning framework for symbolic expression generation via a VAE. In a nutshell, we use a variational autoencoder to generate mathematical expressions, and our training strategy forces the generated formulas to fit a given dataset. Our framework allows encoding a priori knowledge of the formulas into fast-check predicates that can speed up the optimization process. We compare our method to modern symbolic regression benchmarks and show that it outperforms the competitors on most of the tasks.

        Speakers: Sergei Popov (Higher School of Economics), Dr Mikhail Lazarev (Higher School of Economics)
      • 75
        Unbinned measurements in global SMEFT fits from machine learning

        Global interpretations of particle physics data in the context of the Standard Model Effective Field Theory (SMEFT) rely on the combination of a wide range of physical observables from many different processes. We present ongoing work towards the integration of unbinned measurements into such global SMEFT interpretations by means of machine learning tools. We use a deep-learning parametrisation of the extended log-likelihood to perform an optimal unbinned multivariate analysis in the EFT parameter space, taking model uncertainties into account via the replica method. We carry out a variant of the SMEFiT global analysis using unbinned particle-level predictions of top-quark pair production and Higgs production in association with vector bosons as a proof of concept. We demonstrate the impact that such measurements would have on the EFT parameter space as compared to traditional unfolded binned measurements.

        Speaker: Jaco ter Hoeve (Nikhef / VU Amsterdam)
      • 10:30 AM
        Coffee break
      • 76
        Merging Physical Models with Deep Learning for Cosmology

        With an upcoming generation of wide-field cosmological surveys, cosmologists are facing new and outstanding challenges at all levels of scientific analysis, from pixel-level data reduction to cosmological inference. As powerful as Deep Learning (DL) has proven to be in recent years, in most cases a DL approach alone proves insufficient to meet these challenges, and is typically plagued by issues including lack of robustness to covariate shifts, limited interpretability, and improper uncertainty quantification, impeding its exploitation in scientific analysis.
        In this talk, I will instead advocate for a unified approach merging the robustness and interpretability of physical models, the proper uncertainty quantification provided by a Bayesian framework, and the inference methodologies and computational frameworks brought about by the Deep Learning revolution.
        In particular, we will see how deep generative models can be embedded within principled physical Bayesian modeling to solve a number of high-dimensional astronomical ill-posed inverse problems. I will also illustrate how the same generative modeling techniques can alleviate the need for analytic likelihoods in cosmological inference, enabling instead Simulation-Based Inference approaches. And finally, I will highlight the power of the computational frameworks initially developed for Deep Learning when applied to physical modeling, making codes from analytical likelihoods to large-scale cosmological simulators automatically differentiable.

        Speaker: François Lanusse (CEA Saclay)
      • 77
        AI and the meaning crisis

        How to bring value into AI: valuelessness is an inherent and proudly upheld feature of the scientific approach. It has made science successful at modelling the material world and at keeping our biases under control. I will argue that, at the same time, it is one of the main obstacles to developing true AI that can be seamlessly integrated into society. I will shed light on this argument through my personal journey of reconciling what I do with who I am, which is the very process we need to understand to make AI truly human-like. I will use John Vervaeke's brilliant 4P metacognition framework to sketch a roadmap and to situate our work on action-oriented AI for optimizing engineering systems.

        Speaker: Balazs Kegl (Huawei France)
      • 78
        Unsupervised Learning Likelihood functions of LHC results

        Due to their undoubted importance, the systematic publication of Full Likelihood Functions (LFs) of LHC results is a very hot topic in the HEP community. Major steps have been taken towards this goal, a notable example being the ATLAS release of full likelihoods with the pyhf framework. However, the publication of LFs remains a difficult challenge since they are generally complex and high-dimensional; this leads, for instance, to time-consuming maximum likelihood estimations. Alternatively, we propose the unsupervised learning of LFs with Normalizing Flows (NFs), a powerful class of generative models that provide density estimation by construction. In this way, we can obtain compact and precise descriptions of complex LFs that can be efficiently sampled from and used for probability estimation. In this talk, we discuss the capability of NFs for modelling the full likelihood functions of LHC New Physics searches and Standard Model measurements.
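        The density-estimation-by-construction property of NFs follows from the change-of-variables formula; the minimal NumPy sketch below uses a single affine flow layer as a stand-in for the deep invertible networks used in practice (the parameters are illustrative assumptions, not fitted to any LHC likelihood):

```python
import numpy as np

# A normalizing flow gives an exact density via change of variables:
#   log p_x(x) = log p_z(f^{-1}(x)) + log |d f^{-1} / dx|.
# One affine layer x = a*z + b with z ~ N(0, 1) already shows the mechanics;
# real NFs stack many learned invertible layers.
a, b = 2.0, 1.0  # illustrative parameters; in practice learned from the LF

def log_prob(x):
    z = (x - b) / a                                # inverse transform
    log_pz = -0.5 * (z**2 + np.log(2 * np.pi))     # standard-normal base density
    log_det = -np.log(abs(a))                      # |dz/dx| = 1/|a|
    return log_pz + log_det

def sample(rng, n):
    # Sampling is just pushing base samples through the forward transform.
    return a * rng.standard_normal(n) + b

rng = np.random.default_rng(1)
xs = sample(rng, 100_000)
print(float(log_prob(np.array([1.0]))[0]))  # exact density, no binning needed
```

        The same two operations, exact density evaluation and cheap sampling, are what make a trained NF a practical surrogate for a published likelihood.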

        Speaker: Humberto Reyes González (University of Genoa)
      • 79
        Online-compatible unsupervised nonresonant anomaly detection

        There is a growing need for anomaly detection methods that can broaden the search for new particles in a model-agnostic manner. Most proposals for new methods focus exclusively on signal sensitivity. However, it is not enough to select anomalous events; there must also be a strategy to provide context to the selected events. We propose the first complete strategy for unsupervised detection of nonresonant anomalies that includes both signal sensitivity and a data-driven method for background estimation. Our technique is built out of two simultaneously trained autoencoders that are forced to be decorrelated from each other. This method can be deployed offline for nonresonant anomaly detection and is also the first complete online-compatible anomaly detection strategy. We show that our method achieves excellent performance on a variety of signals prepared for the ADC2021 data challenge.
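        As a rough illustration of the decorrelation idea, a combined training loss can penalise the correlation between the two autoencoders' per-event reconstruction errors; the NumPy sketch below uses a squared Pearson correlation penalty as a simple stand-in for whatever decorrelation term the authors actually employ:

```python
import numpy as np

def pearson2(u, v):
    """Squared Pearson correlation between two per-event error vectors."""
    u = u - u.mean()
    v = v - v.mean()
    denom = np.sqrt((u**2).sum() * (v**2).sum()) + 1e-12
    return float((u @ v / denom) ** 2)

def combined_loss(err1, err2, lam=10.0):
    # err1, err2: per-event reconstruction errors of the two autoencoders.
    # Minimising this trains both AEs while pushing their errors to be
    # decorrelated, which is what enables a data-driven background estimate
    # (selecting on one error while using the other as a control).
    return err1.mean() + err2.mean() + lam * pearson2(err1, err2)

rng = np.random.default_rng(0)
e = rng.random(1000)
print(pearson2(e, e))   # identical errors: penalty ~ 1
print(pearson2(e, -e))  # anti-correlated errors are penalised just as hard
```

        With decorrelated errors, the background rate in the signal region of one autoencoder can be estimated from the sidebands of the other, which is the data-driven ingredient the abstract highlights.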

        Speaker: Dr Vinicius Mikuni (Lawrence Berkeley National Lab. (US))
      • 80
        Thanks to Sabrina and Aurélie
    • 81
      Lunch : buffet in Institut Pascal hall
    • AI and physics conference: Day 3 afternoon
      Conveners: Felice Pantaleo (CERN), Anja Butter (ITP Heidelberg)
      • 82
        Summary of Generator workshop
        Speaker: Johnny Raine (Université de Genève)
      • 83
        Quantum Generative Models in High Energy Physics

        Theoretical and algorithmic advances, the availability of data, and growing computing power have opened the door to exceptional perspectives for the application of classical Deep Learning in the most diverse fields of science, business, and society at large, and notably in High Energy Physics (HEP). Generative models, in particular, are among the most promising approaches to analyse and understand the vast amount of information that the next generation of HEP detectors will produce.

        Generative modeling is also a promising task for near-term quantum devices, which can leverage compressed high-dimensional representations and use the stochastic nature of quantum measurements as a source of randomness. Several architectures are being investigated. Quantum implementations of Generative Adversarial Networks (GANs) and Auto-Encoders, among the most popular classical approaches, are being proposed for different applications. Born machines are purely quantum models that can generate probability distributions in a unique way, inaccessible to classical computers.

        This talk will give an overview of the current state of the art in terms of generative modeling on quantum computers with focus on their application to HEP. Examples will include the application of Born machines and quantum GAN to the problem of joint and conditional distributions learning.

        Furthermore, experiments on the effect of intrinsic quantum noise on model convergence will be discussed and framed in the broader context of quantum machine learning.

        Indeed, while a number of studies have shown that noise plays a crucial role in the training of classical neural networks, near-term quantum hardware faces the challenge of overcoming noise from gate errors, readout errors, and interactions with the environment. The presence of this intrinsic quantum noise suggests the possibility of replacing the artificial noise used in classical machine learning with the noise of the quantum hardware.

        Speaker: Sofia Vallecorsa (CERN)
      • 84
        Quantum generative models for muonic force carriers events

        Generative models (GMs) are promising applications for near-term quantum computers due to the probabilistic nature of quantum mechanics. In this work, we propose comparing a classical conditional generative adversarial network (C-GAN) approach with a Born machine, while addressing their strengths and limitations, to generate muonic force carrier (MFC) events. The former uses a neural network as a discriminator to train the generator, while the latter takes advantage of the stochastic nature of measurements in quantum mechanics to generate samples. We consider a fixed-target collision between muons produced in the high-energy collisions of the LHC and the detector material of the ForwArd Search ExpeRiment (FASER) or the ATLAS calorimeter. In the ATLAS case, independent muon measurements performed by the inner detector (ID) and the muon system (MS) can help observe new force carriers coupled to muons, which usually escape detection. In the FASER experiment, the high resolution of the tungsten/emulsion detector is used to measure the muon trajectories and energies. MFCs could potentially be part of dark matter (DM) and explain anomalies in the low-energy regime, making them attractive for physics searches beyond the Standard Model.

        Speaker: Oriel Orphee Moira Kiss (CERN, UNIGE)
      • 85
        ML-based Correction to Accelerate Geant4 Calorimeter Simulations

        The Geant4 detector simulation, using full particle tracking (FullSim), is usually the most accurate detector simulation used in HEP, but it is computationally expensive. The cost of FullSim is amplified in highly segmented calorimeters, where a large fraction of the computations is spent tracking the shower's low-energy photons through the complex geometry. Geant4's production cuts offer a way to limit the production of these photons: increasing the values of these cuts reduces the accuracy of shower shapes in the simulation but can greatly increase the computational speed. We propose a post-hoc machine learning (ML) correction method for calorimeter cell energy depositions. The method is based on learning the density ratio between the reduced-accuracy simulation and the nominal one to extract multi-dimensional weights using a binary classifier. We explore the method using an example calorimeter geometry from the International Large Detector project and showcase initial results. The use of ML to correct calorimeter cells allows for more efficient use of heterogeneous computing resources, with FullSim running on the CPU while the ML algorithm applies the correction in an event-parallel fashion on GPUs.
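        The density-ratio step described here is the classic classifier-based likelihood-ratio trick: with balanced samples, a classifier score s(x) yields per-event weights w(x) = s(x)/(1 - s(x)). The NumPy sketch below illustrates it on one-dimensional Gaussian toys standing in for the nominal and reduced-accuracy simulations; the data, the simple logistic classifier, and all parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-ins: "nominal" and "reduced-accuracy" simulations of one cell energy.
nominal = rng.normal(0.5, 1.0, 50_000)
reduced = rng.normal(0.0, 1.0, 50_000)

# Binary classifier (1D logistic regression, fit by full-batch gradient descent;
# the talk's setting would use a deep classifier on multi-dimensional cells).
X = np.concatenate([nominal, reduced])
y = np.concatenate([np.ones_like(nominal), np.zeros_like(reduced)])
w0, b0 = 0.0, 0.0
for _ in range(2000):
    s = 1.0 / (1.0 + np.exp(-(w0 * X + b0)))
    g = s - y                      # gradient of the mean cross-entropy loss
    w0 -= 0.1 * (g * X).mean()
    b0 -= 0.1 * g.mean()

# Likelihood-ratio trick: with balanced classes, p_nominal/p_reduced = s/(1-s).
s_red = 1.0 / (1.0 + np.exp(-(w0 * reduced + b0)))
weights = s_red / (1.0 - s_red)

# Reweighted reduced-accuracy events should reproduce nominal statistics.
print(np.average(reduced, weights=weights))   # ~ 0.5
```

        Applying such per-event weights to the cheap, high-production-cut simulation is what recovers nominal-quality shower shapes at a fraction of the FullSim cost.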

        Speaker: Dr Evangelos Kourlitis (Argonne National Laboratory)
      • 3:30 PM
        Coffee break
      • 86
        Advances in Machine Learning Based Modeling and Control of Particle Accelerators at Scientific User Facilities

        Particle accelerators are used in a wide array of medical, industrial, and scientific applications, ranging from cancer treatment to understanding the fundamental laws of physics. While each of these applications brings different operational requirements, a common challenge concerns how to optimally adjust controllable settings of the accelerator to obtain the desired beam characteristics. For example, at highly flexible user facilities like the Linac Coherent Light Source (LCLS) and FACET-II at the SLAC National Accelerator Laboratory, requests for a wide array of custom beam configurations must be met in a limited window of time to ensure the success of each experiment, a task which can be difficult both in terms of tuning time and final achievable solution quality.
        At present, the operation of most accelerator facilities relies heavily on manual tuning by highly-skilled human operators, sometimes with the aid of simplified physics models and local optimization algorithms. As a complement to these existing tools, approaches based on machine learning are poised to enhance our ability to achieve higher-quality beams, fulfill requests for custom beam parameters more quickly, and aid the development of novel operating schemes.
        I will discuss recent developments in using ML for online optimization, the creation of ML-enhanced virtual diagnostics to aid beam measurements, and the use of ML to create fast-executing online models (or "digital twins") of accelerator systems. These improvements could increase the scientific output of particle accelerator user facilities and enable new capabilities in creating custom charged particle beams. They could also help us to meet the modeling, design, and online optimization challenges that become more acute as we push toward the more difficult-to-achieve beam parameters that are desired for future accelerator applications (e.g. higher beam energies and intensities, higher stability, and extreme adjustments of the beam shape in phase space).

        Speaker: Auralee Edelen (SLAC National Accelerator Lab)
      • 87
        Simulating the LHCb experiment with Generative Models

        During Run 2 of the Large Hadron Collider at CERN, the LHCb experiment spent more than 80% of its pledged CPU time producing simulated data samples. The upcoming upgraded version of the experiment will be able to collect larger data samples, requiring many more simulated events to analyze the data to be collected in Run 3. Simulation is a key requirement for analysis, needed to separate signal from background and to measure efficiencies. The needed simulation will far exceed the pledged resources, requiring an evolution in technologies and techniques to produce these simulated samples.
        In this contribution, we discuss Generative Models powered by several algorithms and strategies to effectively parametrize the high-level response of the individual components of the LHCb detector, encoding within neural networks the experimental errors and uncertainties introduced in the detection and reconstruction process. Where possible, models are trained directly on real data, resulting in a simulation process completely independent of the detailed simulation used to date. The models developed can then be pipelined into a consistent, purely parametric simulation framework, or used individually to complement samples obtained with the detailed simulation.

        Speaker: Matteo Barbetti (University of Florence and INFN-Firenze)
      • 88
        Generative models uncertainty estimation

        In recent years, fully parametric fast-simulation methods based on generative models have been proposed for a variety of high-energy physics detectors. By their nature, the quality of data-driven models degrades in regions of phase space where the data are sparse. Since machine-learning models are hard to analyze from first physical principles, the commonly used testing procedures are performed in a data-driven way and cannot be reliably used in such regions. In our talk we propose three methods to estimate the uncertainty of generative models inside and outside of the training phase-space region, along with data-driven calibration techniques. A test of the proposed methods on the LHCb RICH fast simulation is also presented.

        Speaker: Nikita Kazeev (HSE)
      • 89
        Generative models for scalar field theories: how to deal with poor scaling?

        The basis of lattice QCD is the formulation of the QCD path integral on a Euclidean space-time lattice, allowing expectation values of observables to be computed with Monte Carlo simulations. Despite the success of lattice QCD in determining many parameters of the Standard Model, the current techniques and algorithms still face limitations, such as critical slowing down or the cost of fully taking into account the fermion dynamics. New approaches are required to circumvent these limitations, and machine learning algorithms provide a viable way to address some of these difficulties. Deep generative models such as normalizing flows have been suggested as alternatives to standard methods for generating lattice configurations. Previous studies of normalizing flows demonstrate proof of principle for simple models in two dimensions. However, further studies indicate that the training cost can be, in general, very high for large lattices. The poor scaling traits of current models indicate that moderate-size networks cannot efficiently handle the inherently multi-scale aspects of the problem, especially around critical points. In this talk, we explore why current models lead to poor acceptance rates for large lattices and explain how to use effective field theories as a guide to design models with improved scaling costs. Finally, we discuss alternative ways of handling poor acceptance rates for large lattices.

        Speaker: Javad Komijani (ETH Zurich)
      • 90
        Conclusion from the organisers
        Speaker: David Rousseau (IJCLab, Orsay, France)