
Probabilistic graphical models and deep learning methods for remote sensing image analysis

PASTORINO, MARTINA
2023-12-11

Abstract

Given the current advances in space missions for Earth observation, very-high-resolution and multimodal satellite imagery is now widely accessible. The acquired data can be optical (e.g., panchromatic, multispectral, and hyperspectral images) or radar, with different bands and various trade-offs between resolution and coverage. This offers great application potential in the field of remote sensing. An important role in this context is played by semantic segmentation, whose purpose is to assign each pixel in an image to a semantic class, typically related to land cover or land use, with prominent applications in areas such as urban planning, precision agriculture, monitoring of forest species, natural disaster management, and climate change monitoring and mitigation. The present thesis focuses on the development of novel methods for the analysis of multimodal data, aimed at fully exploiting all the available multisource, multisensor, and multiresolution information, for the semantic segmentation of remote sensing imagery. These methods combine ideas from stochastic models and deep learning. On the one hand, deep learning is currently the dominant approach to image classification and segmentation. Thanks to the non-parametric formulation and the intrinsically multiscale processing stages that characterize convolutional neural networks, deep learning architectures can be effectively employed for multimodal image fusion and analysis. However, the performance of deep learning methods is strongly influenced by the quantity and quality of the ground truth used for training. On the other hand, probabilistic graphical models have sparked major interest in the past few years because of the ever-growing need for structured predictions. Depending on the underlying graph topology over which they are defined, they can effectively model spatial and multiresolution information.
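As a standard illustration of the structured-prediction formulation mentioned above (a generic pairwise conditional random field, not the specific multiresolution models developed in the thesis), the posterior of the label field x given the observed image y can be written as:

```latex
p(x \mid y) \;=\; \frac{1}{Z(y)} \exp\!\Big( -\sum_{s \in S} \psi_s(x_s, y) \;-\; \sum_{(s,t) \in E} \psi_{st}(x_s, x_t, y) \Big)
```

where S is the set of pixel sites, E the set of neighboring site pairs defined by the graph topology, \psi_s and \psi_{st} are unary and pairwise potentials, and Z(y) is the partition function; the pairwise terms are what encode the spatial structure of the prediction.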
The main goal of the thesis is to develop approaches leveraging the advantages of these two major methodological families -- per se and through their integration -- to exploit multimodal remote sensing data and the complementary information they convey. In this context, first, a novel framework integrating causal hierarchical probabilistic graphical modeling, fully convolutional networks, and, in an earlier formulation, decision tree ensembles is proposed. The extension of the proposed multiresolution framework to burnt forest area mapping from data with a very large resolution ratio is also investigated. Then, a deep learning method is developed to directly learn stochastic models, such as conditional random fields, from the input image data. Finally, the potential of the multimodal fusion of remote sensing imagery with mobility demand data is exploited to propose a stochastic region-based model for land-use mapping in urban areas. The theoretical framework of the developed methods is described in detail. The experimental validations, conducted with multimodal multispectral, panchromatic, and radar satellite images, suggest the effectiveness of the proposed methods. The proposed approaches are also compared to recent state-of-the-art methods developed for similar semantic segmentation applications.
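As a minimal sketch of the pixel-wise classification task underlying semantic segmentation (a generic illustration, not the thesis' own method): given per-pixel class scores such as those produced by a fully convolutional network, each pixel is assigned the class with the highest posterior probability.

```python
import numpy as np

def softmax(logits, axis=0):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def segment(logits):
    """Turn per-pixel class scores of shape (C, H, W) into a label map (H, W)."""
    probs = softmax(logits, axis=0)
    return probs.argmax(axis=0)

# Toy example: 3 classes on a 2x2 image.
logits = np.array([
    [[5.0, 0.1], [0.2, 0.3]],   # class 0 scores
    [[0.1, 6.0], [0.1, 0.2]],   # class 1 scores
    [[0.0, 0.2], [7.0, 8.0]],   # class 2 scores
])
label_map = segment(logits)
# label_map == [[0, 1], [2, 2]]
```

This per-pixel argmax treats each site independently; the graphical models discussed in the abstract refine exactly this step by adding spatial and multiresolution interactions between neighboring labels.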
remote sensing, semantic segmentation, deep learning, stochastic models, convolutional neural network, fully convolutional network, probabilistic graphical models, Markov random fields, conditional random fields
Files in this item:

phdunige_4255417_1.pdf
Description: PhD thesis - part 1 (cover page, abstracts, introduction, chapter 2, chapter 3)
Type: Doctoral thesis
Size: 17.72 MB
Format: Adobe PDF
Under embargo until 11/12/2024

phdunige_4255417_2.pdf
Description: PhD thesis - part 2 (chapter 4, chapter 5, chapter 6)
Type: Doctoral thesis
Size: 19.42 MB
Format: Adobe PDF
Under embargo until 11/12/2024

phdunige_4255417_3.pdf
Description: PhD thesis - part 3 (chapter 7, conclusion, annex, bibliography)
Type: Doctoral thesis
Size: 16.99 MB
Format: Adobe PDF
Under embargo until 11/12/2024

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1155735