Speaker
Description
To solve a large problem efficiently with deep learning, it is sometimes useful to decompose it into smaller blocks, which lets us introduce domain knowledge into the model through an appropriate loss function for each block.
A naive model decomposition, however, degrades performance because the loss definition creates an information bottleneck in what is passed between blocks.
We proposed a method that mitigates this bottleneck by passing hidden features between blocks, instead of the outputs on which the losses are defined, and experimentally demonstrated its usefulness on a particle physics dataset.
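As a rough illustration of this connection scheme (a minimal sketch with hypothetical layer sizes and losses, not the model from the talk), the downstream block can consume the upstream block's hidden features while each block keeps its own supervised loss:

```python
# Sketch only: two blocks, each with its own loss, where the downstream block
# receives the upstream block's hidden features rather than its low-dimensional
# output (hypothetical shapes and losses, not the authors' actual model).
import torch
import torch.nn as nn

class BlockA(nn.Module):
    """Upstream block: supervised on an intermediate target, but exposes
    its hidden features so the intermediate output is not a bottleneck."""
    def __init__(self, in_dim=16, hidden_dim=64, out_dim=1):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        h = self.body(x)   # hidden features passed downstream
        y = self.head(h)   # output used only for block A's loss
        return h, y

class BlockB(nn.Module):
    """Downstream block: takes block A's hidden features as input."""
    def __init__(self, hidden_dim=64, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hidden_dim, 32), nn.ReLU(),
                                 nn.Linear(32, out_dim))

    def forward(self, h):
        return self.net(h)

block_a, block_b = BlockA(), BlockB()
loss_fn = nn.MSELoss()

def total_loss(x, target_a, target_b, w_a=1.0, w_b=1.0):
    h, y_a = block_a(x)
    y_b = block_b(h)   # hidden features, not y_a, flow to the next block
    return w_a * loss_fn(y_a, target_a) + w_b * loss_fn(y_b, target_b)
```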
We also demonstrated adaptive tuning of the loss coefficient of each task, building on techniques from multi-task learning.
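One widely used multi-task weighting scheme of this kind is homoscedastic-uncertainty weighting (Kendall et al., 2018); the sketch below uses its common simplified form and is only an assumption about the family of techniques meant here, not necessarily the one used in the talk:

```python
# Sketch of adaptive loss-coefficient tuning via learned task uncertainties
# (simplified Kendall et al. 2018 form); assumed here for illustration.
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learns one log-variance per task; each task loss is scaled by
    exp(-log_var), and the additive log_var term prevents the weights
    from collapsing to zero."""
    def __init__(self, num_tasks=2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        weighted = [torch.exp(-lv) * loss + lv
                    for lv, loss in zip(self.log_vars, task_losses)]
        return sum(weighted)

# Usage sketch: include weighting.parameters() in the optimizer so the
# coefficients are tuned jointly with the network weights.
# weighting = UncertaintyWeighting(num_tasks=2)
# loss = weighting([loss_a, loss_b])
```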