
Efficient machine learning with resource constraints

ALFANO, PAOLO DIDIER
2023-05-30

Abstract

As machine learning techniques have spread through the scientific community and into real-world scenarios, their adoption has been justified by the inability of traditional techniques to deal with seemingly simple problems that require the retrieval of specific task-related information. Early neural networks consisted of only a few layers, with a limited capacity to solve complicated problems. In recent years, however, the set of methodologies we usually refer to as \textit{deep learning} has become the de facto standard in a wide variety of fields. These methods have proven remarkably effective on many kinds of problems, from very simple and specific tasks to more general ones, such as image recognition, object detection, video recognition, and natural language processing. In the last two years, a new class of architectures, referred to as transformers, has achieved state-of-the-art performance in contexts similar to those covered by convolutional neural networks. The huge improvement in performance obtained by recent models came at a cost from several points of view: the number of learnable parameters grew from tens of millions to hundreds of billions in less than ten years, coupled with an increase from a few hundred to millions of PFLOPS needed to train better-performing models. Overall, the energy required to train the most recent architectures has increased drastically in the last few years, revealing a problematic situation in terms of the resources needed to reach the next state-of-the-art result. In this thesis, we will see different methodologies to alleviate the computational costs of some typical machine learning problems. First, we will focus on image classification, considering a simple transfer learning approach that exploits pre-trained convolutional features as input for a fast kernel method.
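To give a concrete flavor of this kind of approach, the following toy sketch feeds fixed features to a kernel ridge classifier through a Nyström approximation. The Gaussian-blob features, the RBF bandwidth, and the regularization value are illustrative assumptions, not the thesis configuration; in the thesis setting the features would come from a frozen pre-trained convolutional network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for pre-trained convolutional features:
# two well-separated Gaussian blobs with labels in {-1, +1}.
n_per, d, n_landmarks = 100, 16, 50
X = rng.normal(size=(2 * n_per, d))
X[:n_per, 0] += 3.0
X[n_per:, 0] -= 3.0
y = np.concatenate([np.ones(n_per), -np.ones(n_per)])

def rbf(A, B, gamma=0.05):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

# Nystroem step: approximate the full kernel using a few landmark points,
# so training scales with the number of landmarks rather than samples.
landmarks = X[rng.choice(len(X), n_landmarks, replace=False)]
U, s, _ = np.linalg.svd(rbf(landmarks, landmarks))
Z = rbf(X, landmarks) @ U / np.sqrt(np.maximum(s, 1e-8))  # low-rank feature map

# Fast kernel method: ridge regression on the approximate features.
lam = 1e-3
w = np.linalg.solve(Z.T @ Z + lam * np.eye(n_landmarks), Z.T @ y)
train_accuracy = (np.sign(Z @ w) == y).mean()
```

Because the kernel is never materialized on the full dataset, the cost of the solve is governed by the (small) number of landmarks, which is the source of the speed-up over fine-tuning.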
By performing more than three thousand training runs, we will show that this fast-kernel approach provides accuracy comparable to fine-tuning, with a training time between one and two orders of magnitude smaller. Then we will introduce and discuss an unsupervised pipeline that projects input images into a latent space of reduced dimension, making the clustering operation feasible. We will show the pipeline's effectiveness in a plankton-monitoring context, where operating in an unsupervised manner is crucial: studying plankton populations in situ is paramount to protecting marine ecosystems, as these populations can be regarded as biosensors. Lastly, we will discuss different methodologies to compare two or more image datasets. Each dataset can be seen as a set of points sampled from an unknown distribution that we can estimate and analyze, and we will introduce different methodologies to study such distributions. We will show that, even on simple tasks involving images, the concept of dataset distance is elusive and very difficult to quantify. It is nevertheless possible to obtain information about different image datasets, via good partitioning, as long as we analyze a small subset of the datasets. Overall, in this thesis, we will consider a set of techniques that can alleviate machine learning computational costs, in order to keep these methods computationally accessible to the scientific community.
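The unsupervised pipeline mentioned above can be sketched, in a deliberately simplified form, as a PCA projection to a latent space followed by k-means clustering there. The synthetic two-group data, the latent dimension, and the number of clusters below are illustrative assumptions and do not reproduce the thesis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for image embeddings: two synthetic groups in a
# high-dimensional ambient space (real inputs would be plankton images).
n_per, dim, latent_dim, k = 100, 256, 8, 2
centers_true = rng.normal(size=(2, dim)) * 4.0
X = np.vstack([c + rng.normal(size=(n_per, dim)) for c in centers_true])
labels_true = np.repeat([0, 1], n_per)

# Step 1: project to a low-dimensional latent space with PCA.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:latent_dim].T  # (n, latent_dim) latent codes

# Step 2: cluster in the latent space with Lloyd's k-means iterations.
centers = Z[rng.choice(len(Z), k, replace=False)]
for _ in range(50):
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)
    centers = np.stack([
        Z[assign == j].mean(axis=0) if (assign == j).any() else centers[j]
        for j in range(k)
    ])

# Agreement with the true grouping, up to a label permutation (k = 2).
agreement = max((assign == labels_true).mean(), (assign != labels_true).mean())
```

Clustering in the 8-dimensional latent space rather than the 256-dimensional ambient space is what makes the distance computations cheap and the clusters well separated.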
Files in this record:

phdunige_4788380.pdf
Type: Doctoral thesis (Tesi di dottorato)
Access: open access
Size: 4.68 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1119375