A half-day event at which NVIDIA representatives will describe how they can support research teams at UPSaclay that are interested in using GPUs. Participants can also give a 10-15 minute presentation of research projects that could benefit from GPU support.
(Paris-Saclay Center for Data Science)
Scientific projects: Presentations - Session 1
Parallelizing computations on a multi-GPU server for solving inverse problems (10m)
Iterative algorithms used to solve inverse problems on large data volumes require significant acceleration to be usable in practice.
Using example applications in X-ray tomography (reconstruction of a 1024**3-voxel volume) and in the deconvolution of 1D signals (spectral data from Mars recorded over several years) or 2D signals (a webcam video stream), we will present our search for solutions that parallelize the computations as efficiently as possible on "many-core" processors such as GPUs. We will describe the three-way match between the GPU architecture (Nvidia's CUDA), the (re)formulation of the algorithms and the (re)structuring of the data, which we have implemented for the different types of operators used in iterative algorithms (projector, backprojector, nD convolution). We will also show the particular attention that must be paid to the bottleneck caused by transfer time between the PC and the GPU card.
Finally, we will present the data partitioning we carried out to take full advantage of a multi-GPU server, and offer some answers on using GPUs together with Matlab and with existing libraries (CUBLAS, NVPP...).
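As a rough illustration of the data partitioning mentioned in this abstract, the sketch below splits a volume into contiguous slabs of slices, one slab per GPU. It is not the speakers' scheme: the function names `split_slabs` and `slab_bytes`, the slab-wise layout and the float32 storage are all assumptions made for the example.

```python
def split_slabs(n_slices, n_devices):
    """Split n_slices contiguous slices into n_devices nearly equal
    half-open ranges [start, stop). Hypothetical helper for illustration."""
    base, extra = divmod(n_slices, n_devices)
    ranges = []
    start = 0
    for device in range(n_devices):
        # The first `extra` devices each take one additional slice.
        stop = start + base + (1 if device < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def slab_bytes(slab, slice_shape, bytes_per_voxel=4):
    """Memory footprint of one slab, assuming float32 voxels (an assumption)."""
    start, stop = slab
    rows, cols = slice_shape
    return (stop - start) * rows * cols * bytes_per_voxel


# A 1024**3 volume split across 3 GPUs (the GPU count is an example value).
slabs = split_slabs(1024, 3)
sizes = [slab_bytes(s, (1024, 1024)) for s in slabs]
for device, (slab, size) in enumerate(zip(slabs, sizes)):
    print(f"GPU {device}: slices {slab[0]}..{slab[1] - 1}, {size / 2**30:.2f} GiB")
```

In float32 the full 1024**3 volume takes 4 GiB, so each slab also bounds the amount of data that has to cross the PC-to-GPU link for that device.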
Applied Generic Programming to Accelerator Programming (10m)
GPGPUs and other forms of accelerators are becoming a mainstream asset for high-performance computing. Raising the programmability of such hardware is paramount to enable as many users as possible to discover, master and subsequently use accelerators in their day-to-day research activities.
This presentation showcases NT2, the Numerical Template Toolbox, a C++ HPC library built on the premise of simplifying access to complex parallel hardware for a broad audience. We will first describe the positioning of the project, then how it handles accelerator programming, and conclude with some performance results.
Anatomical and Functional Visualization of Brain Connectivity (10m)
A fundamental way to analyze the brain is by studying brain connectivity, i.e. how brain regions are connected to each other. Two types of brain connectivity exist, anatomical connectivity and functional connectivity, each with their advantages and disadvantages. Being able to study both types of connectivity in concert in real time addresses a need of neuroscientists and neurosurgeons. To that end we intend to develop fast machine-learning algorithms to quickly extract the connectivity data from the MRI data as well as real-time visualizations depicting both types of connectivity data. We additionally look into the effectiveness of machine-learning algorithms learning from screenshots of the visualizations instead of the large MRI data.
To attain our goals, we wish to employ a GPU server for GPU-based machine-learning algorithms and GPU-based visualizations, streaming the results to light-weight devices of neuroscientists and neurosurgeons.
(INRIA - team AVIZ)
The Matrix Element Method and GPUs for the Higgs boson studies (10m)
The Matrix Element Method (MEM) is a powerful approach in particle physics to extract maximal information from the events arising from LHC pp collisions, and is currently being deployed in the Higgs->tautau Vector Boson Fusion channel. Compared to other methods that require training, the MEM allows direct comparisons between a theory and the observation. Since this method implies an integration over a large number of variables, the MEM consumes much more CPU time at the analysis level than classic methods. As a consequence, this method is hardly exploitable with a sequential implementation, in particular with large background samples. For the upcoming LHC data-taking, this issue will become even more crucial.
Fortunately, the Monte Carlo integration in the MEM is very well suited to GPU computing, and the expected significant gain in processing time will be an asset for our analysis and will help generalize the use of the MEM in LHC analyses.
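To see why Monte Carlo integration maps well onto GPUs, consider the minimal sketch below. It is a plain CPU illustration, not the MEM code: `mc_integrate` and `toy_integrand` are hypothetical names, and the integrand is a toy function standing in for a matrix-element weight. Each sample is evaluated independently of the others, so on a GPU one thread could handle one sample, with the sums obtained by a parallel reduction.

```python
import math
import random


def mc_integrate(f, dim, n_samples, seed=0):
    """Estimate the integral of f over the unit hypercube [0, 1]^dim.

    Returns (estimate, standard_error). Every sample is independent,
    which is what makes the loop body trivially parallel.
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        value = f(x)
        total += value
        total_sq += value * value
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_samples)


def toy_integrand(x):
    """Toy stand-in for an event weight: sum of squares (exact integral dim / 3)."""
    return sum(xi * xi for xi in x)


estimate, std_error = mc_integrate(toy_integrand, dim=5, n_samples=20000)
print(f"estimate = {estimate:.4f} +/- {std_error:.4f} (exact value 5/3)")
```

The standard error shrinks as one over the square root of the number of samples, so precision is bought with sample count, which is where the throughput of a GPU helps.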
Fast Distributed Total Variation (10m)
The total variation is used in many applications including imaging, signal processing and machine learning. We developed a distributed algorithm to compute the so-called proximal operator of this regularization. In this talk, I will give some insights on the distribution scheme used to implement the underlying CUDA code. Some applications will be covered, such as 2D/3D denoising and 2D inpainting.
This is a joint work with A. Chambolle (CMAP, École Polytechnique) and T. Pock (ICG, TU Graz).
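For readers unfamiliar with the proximal operator of the total variation, here is a minimal 1D sketch. It uses a standard projected-gradient iteration on the dual problem, not the distributed CUDA algorithm of the talk; the function names, the iteration count and the step size are choices made for this example. It computes the minimizer of 0.5 * ||x - y||^2 + lam * sum_i |x[i+1] - x[i]|.

```python
def _primal_from_dual(y, p):
    """Recover x = y - D^T p, where (D x)[i] = x[i+1] - x[i]."""
    x = list(y)
    for i, p_i in enumerate(p):
        x[i] += p_i
        x[i + 1] -= p_i
    return x


def prox_tv_1d(y, lam, n_iter=500, step=0.25):
    """Proximal operator of lam * TV for a 1D signal (illustrative sketch).

    Projected gradient on the dual: one dual variable per pair of
    neighbouring samples, constrained to [-lam, lam]. A step of 0.25
    is safe because the squared norm of D is below 4 in 1D.
    """
    p = [0.0] * (len(y) - 1)
    for _ in range(n_iter):
        x = _primal_from_dual(y, p)
        # Gradient step on each dual variable, then clip to the constraint set.
        p = [
            min(lam, max(-lam, p[i] + step * (x[i + 1] - x[i])))
            for i in range(len(p))
        ]
    return _primal_from_dual(y, p)


# A unit step: the two plateaus are pulled towards each other by lam / 2 each.
denoised = prox_tv_1d([0.0, 0.0, 1.0, 1.0], lam=0.25)
print([round(v, 4) for v in denoised])
```

In this iteration the update of each dual variable only reads its two neighbouring primal samples, which gives a sense of why schemes of this kind can be split across threads or devices.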
Scientific projects: Presentations - Session 2
Recognition and information extraction in multi-lingual documents with Recurrent Neural Networks and Deep Neural Networks (10m)
Handwriting recognition is a classical AI problem, which has been studied for around 50 years; in its most recent variant, it deals with the recognition of handwritten lines of text. Beyond its inherent importance, handwriting recognition has also served as a testbed for the introduction of some widely used machine learning algorithms, such as the convolutional neural network (CNN) and the long short term memory (LSTM) recurrent neural network (RNN).
Our research focuses on combining the use of deep and recurrent neural networks, as well as improving the architectures and learning algorithms. GPUs are very well suited to these models and GPU access would allow us to go much faster from ideas to experiments and conclusions.
Uses of GPUs for challenges in Machine Learning (10m)
In recent years we have been organizing a number of machine learning challenges with increasingly large datasets. Computer vision and medical imaging tasks are particularly demanding. As challenges in machine learning move into the era of big data, it becomes less and less realistic to move data around to let participants enter challenges. Rather, we promote the use of platforms such as Codalab, which offer the possibility of submitting code that is executed directly on the platform where the data reside.
We will present strategies for using GPUs to boost computational performance in this framework.
Locating Influential Nodes in Complex Networks: A Truss Decomposition Approach (10m)
Understanding and controlling spreading dynamics in networks requires identifying the most influential nodes, those that will trigger efficient information diffusion. It has been shown that the best spreaders are the ones located in the k-core of the network rather than those with the highest degree or centrality [Kitsak et al., Nature Physics 6, 888–893 (2010)]. In this paper, we further refine the set of the most influential nodes, showing that the nodes belonging to the best K-truss subgraph, as identified by the K-truss decomposition of the network, perform even better. The K-truss, being a subset of the k-core of the network, helps reduce the set of privileged spreaders for information diffusion. We compare spreaders belonging to the K-truss to those belonging to the rest of the k-core subgraph and to those having the highest degrees in the network. Using the SIR epidemic model, we show that such spreaders influence a greater part of the network during the first steps of the process and also cover a larger portion of it at the end of the epidemic, which on average stops at an earlier time step in our case. Furthermore, we are studying the robustness of those influential nodes under graph perturbations, to examine how they are affected by various graph noise models. Finally, we are investigating the impact on information diffusion of multiple initial spreaders located in different communities of the network.
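The K-truss referred to above is the largest subgraph in which every edge belongs to at least K - 2 triangles. The sketch below extracts it by repeatedly peeling off edges with too few triangles. It is a simple illustration rather than the authors' implementation: the function name `k_truss` and the toy graph are made up for the example, and no attempt is made at the efficiency a large network would need.

```python
def k_truss(edges, k):
    """Return the edges of the k-truss as a set of frozensets.

    The k-truss is the largest subgraph in which every edge lies in
    at least k - 2 triangles of that subgraph.
    """
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        for u in adj:
            for v in list(adj[u]):
                # Support of (u, v): number of common neighbours, i.e. triangles.
                if len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {frozenset((u, v)) for u in adj for v in adj[u]}


# Toy graph: a 4-clique {0,1,2,3}, a triangle {3,4,5} and a pendant edge (5,6).
toy_edges = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (3, 4), (3, 5), (4, 5),
    (5, 6),
]
truss_4 = k_truss(toy_edges, 4)
truss_nodes = sorted({node for edge in truss_4 for node in edge})
print(truss_nodes)
```

On this toy graph only the 4-clique survives for K = 4, which shows how the truss narrows the candidate set of spreaders compared with a degree-based or core-based selection.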