Soft Clustering: Why and How-To

Rovetta S.; Masulli F.
2019-01-01

Abstract

Despite the huge success of machine learning methods in the last decade, a crucial issue is controlling the support of the data used in inference, so that inputs too far from the training set are assigned low confidence by default. The most important class of methods with this ability is that of prototype-based methods, which rely on clustering or vector quantization as a representation learning model. This paper surveys a family of popular soft clustering methods, framing them in a unified formalism, and discusses the peculiarities of each. A large fraction of the paper is devoted to clarifying the role of the model parameters and to providing guidelines on how to set them.
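
To make the idea of soft memberships and of a tunable model parameter concrete, the following is a minimal illustrative sketch of fuzzy c-means, one popular prototype-based soft clustering method of the kind the survey covers. It is not taken from the paper; the function name, the fuzzifier value m = 2, and the stopping rule are assumptions made here for illustration only.

# Hypothetical sketch (not the paper's formalism): fuzzy c-means.
# The fuzzifier m is an example of the model parameters whose role such
# methods expose: m close to 1 approaches hard (crisp) assignments, while
# larger m yields softer, more graded memberships.
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, eps=1e-9, seed=0):
    """Return (prototypes, memberships) for data X of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership matrix U (n_samples x c), rows sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Prototype update: membership-weighted means of the data.
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared distances from every point to every prototype.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + eps
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)), normalized per point.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return V, U

Note that in this probabilistic variant every point's memberships sum to 1, so even outliers are fully assigned; possibilistic and graded formulations, also covered by soft clustering surveys, relax this constraint so that points far from all prototypes receive uniformly low confidence, in the spirit of the abstract above.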
Year: 2019
ISBN: 978-3-030-12543-1, 978-3-030-12544-8
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1058408
Citations
  • PMC: n/a
  • Scopus: 1
  • Web of Science (ISI): 0