Improving maps from CNNs trained with sparse, scribbled ground truths using fully connected CRFs

Maggiolo, Luca; Moser, G.; Tuia, D.
2018-01-01

Abstract

Convolutional Neural Networks (CNNs) have become the new standard for semantic segmentation of very high resolution images. As with other methods, however, map accuracy depends on the quantity and quality of the ground truth used to train them. Densely annotated data, i.e. a detailed, pixel-level ground truth (GT), yields effective models but demands considerable annotation effort. For this reason, it is more common and efficient to work with point or scribbled annotations than with dense ones. A CNN trained with such incomplete ground truths tends to mischaracterize object shapes and to be inaccurate near object boundaries. To address these issues, we propose an approximation of a fully connected Conditional Random Field (CRF) in which long-range connections are accounted for through auxiliary nodes based on clustering of CNN activation features. Experiments on the ISPRS Vaihingen benchmark, where a CNN is trained only with a non-dense, scribbled ground truth, show that the proposed method can close part of the performance gap with respect to models trained on the densely annotated, but unrealistic, ground truth.
ISBN: 978-1-5386-7150-4

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/957452