Improving maps from CNNs trained with sparse, scribbled ground truths using fully connected CRFs
Maggiolo, Luca; Moser, G.; Tuia, D.
2018-01-01
Abstract
Convolutional Neural Networks (CNNs) have become the standard for semantic segmentation of very high resolution images. As with other methods, however, map accuracy depends on the quantity and quality of the ground truth used to train them. Densely annotated data, i.e. a detailed, pixel-level ground truth (GT), yields effective models but requires a large annotation effort. For this reason, it is more common and efficient to work with point or scribbled annotations than with dense ones. A CNN trained with such incomplete ground truth tends to mischaracterize object shapes and to be inaccurate near object boundaries. We propose to address these issues with an approximation of a fully connected Conditional Random Field (CRF), in which long-range connections are accounted for through auxiliary nodes based on a clustering of CNN activation features. Experiments on the ISPRS Vaihingen benchmark, where a CNN is trained only with a non-dense, scribbled ground truth, show that the proposed method closes part of the performance gap with respect to models trained on the densely annotated, but unrealistic, ground truth.

File | Size | Format
---|---|---
18.igarss.luca.pdf (closed access; type: publisher's version) | 4.01 MB | Adobe PDF
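The abstract describes the approach only at a high level, so the following is a loose illustration rather than the authors' model. It assumes per-pixel CNN class probabilities and activation features are already available, clusters the features with a plain k-means, and treats each cluster as an auxiliary node whose averaged belief is fed back to its member pixels in a mean-field-style update. The function names `kmeans_labels` and `refine_with_cluster_nodes`, the `weight` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def kmeans_labels(features, n_clusters, n_iter=20, seed=0):
    """Plain k-means on (N, D) features; returns one cluster index per pixel."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from distinct, randomly chosen pixels.
    centers = features[rng.choice(len(features), n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                centers[k] = features[members].mean(axis=0)
    return labels

def refine_with_cluster_nodes(unary_prob, labels, weight=2.0, n_iter=5, eps=1e-8):
    """Assumed update rule: each pixel is pulled toward the mean belief of its cluster."""
    unary_prob = np.asarray(unary_prob, dtype=float)
    n_clusters = labels.max() + 1
    q = unary_prob.copy()
    for _ in range(n_iter):
        # Auxiliary node belief = average belief of the pixels in the cluster.
        aux = np.zeros((n_clusters, unary_prob.shape[1]))
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                aux[k] = q[members].mean(axis=0)
        # Combine the CNN evidence with the message from the auxiliary node.
        logits = np.log(unary_prob + eps) + weight * aux[labels]
        logits -= logits.max(axis=1, keepdims=True)
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
    return q

# Toy data: five "pixels", two feature clusters, two classes.
feats = np.array([[0.0], [0.1], [0.2], [5.0], [5.1]])
unary = np.array([[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.1, 0.9], [0.2, 0.8]])
refined = refine_with_cluster_nodes(unary, kmeans_labels(feats, 2))
print(refined.argmax(axis=1))  # [0 0 0 1 1]
```

In the toy data the third pixel weakly prefers class 1 on its own, but it shares a feature cluster with two pixels that are confidently class 0, so the cluster-level message flips it. Because every pixel talks only to its cluster node, the cost per iteration grows linearly with the number of pixels instead of quadratically as in an exact fully connected model.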
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.