
Regularization, Adaptation and Generalization of Neural Networks

VOLPI, RICCARDO
2019-02-25

Abstract

The ability to generalize to unseen data is one of the fundamental desired properties of a learning system. This thesis reports different research efforts aimed at improving the generalization properties of machine learning systems at different levels, focusing on neural networks for computer vision tasks. First, a novel regularization method is presented, Curriculum Dropout. It combines Curriculum Learning and Dropout, and shows better regularization effects than the original algorithm in a variety of tasks, without requiring substantially any additional implementation effort. While regularization methods are extremely powerful for generalizing to unseen data drawn from the same distribution as the training data, they are not very successful in mitigating the dataset bias issue. This problem consists in models learning the peculiarities of the training set and generalizing poorly to unseen domains. Unsupervised domain adaptation has been one of the main solutions to this problem. Two novel adaptation approaches are presented in this thesis. First, we introduce the DIFA algorithm, which combines domain invariance and feature augmentation to better adapt models to new domains by relying on adversarial training. Next, we propose an original procedure that exploits the "mode collapse" behavior of Generative Adversarial Networks. Finally, the general applicability of domain adaptation algorithms is questioned, since they assume that the target distribution is known a priori and can be sampled from. A novel framework is presented to overcome these liabilities, where the goal is to generalize to unseen domains by relying only on data from a single source distribution. We face this problem through the lens of robust statistics, defining a worst-case formulation where the model parameters are optimized with respect to populations that are ρ-distant from the source domain in a semantic space.
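Since the abstract only describes Curriculum Dropout at a high level, the scheduling idea can be illustrated with a minimal, hypothetical sketch: training starts with (almost) no units dropped, and the retain probability decays over time toward the usual fixed value, so that regularization strength increases as training proceeds. The function name, parameter names, and the exact exponential form below are illustrative assumptions, not the thesis' precise formulation.

```python
import math

def retain_probability(step, target_retain=0.5, decay=1e-3):
    """Time-scheduled dropout retain probability (illustrative sketch).

    p(t) = (1 - target_retain) * exp(-decay * t) + target_retain

    At step 0 the retain probability is 1.0 (no units dropped, i.e. a
    standard network); as training proceeds it decays toward the usual
    fixed retain probability `target_retain`.
    """
    return (1.0 - target_retain) * math.exp(-decay * step) + target_retain

# Early in training, nothing is dropped; later, dropout approaches its
# conventional fixed rate.
p_start = retain_probability(0)       # 1.0: no dropout at the beginning
p_late = retain_probability(10_000)   # close to target_retain = 0.5
```

Under this kind of schedule, the network is not over-regularized while its weights are still near their random initialization, and full dropout strength is reached only once training is underway.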
Files in this record:
phd.unige_3552980.pdf — Doctoral thesis, open access, 18.33 MB, Adobe PDF (View/Open)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/940909