Model-based clustering of high dimensional data using copulas

Nai Ruscone, Marta
2018-01-01

Abstract

Finite mixtures are used for model-based clustering of multivariate data. Existing models offer limited flexibility for modelling dependence, since they rely on potentially undesirable correlation restrictions and strict assumptions on the marginal distributions. We recently proposed a model-based clustering method via R-vine copulas that overcomes these restrictions by building flexible dependence models for an arbitrary number of variables from bivariate building blocks. This method performs poorly in high-dimensional spaces, since it leads to over-parametrized models. We propose a more parsimonious version of the R-vine copula clustering method to alleviate the computational burden and the risk of overfitting. The model is based on selecting the hyper-parameters of sparse model classes built from truncated and thresholded R-vine copulas. We use simulated and real datasets to illustrate the proposed procedure.
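To give a sense of why the full model becomes over-parametrized, the sketch below counts pair-copulas. It relies on the standard fact that an R-vine on d variables has d - 1 trees, with d - i edges in tree i, so d(d - 1)/2 pair-copulas in total. Truncation at level t is taken here to mean that only trees 1 to t are kept and all later pair-copulas are set to independence. Thresholding is assumed to mean that pair-copulas whose absolute Kendall's tau falls below a threshold are set to independence; the abstract does not spell out its exact rule. All function names are hypothetical, and the assumption that each mixture component carries its own vine is ours.

```python
from typing import Iterable, Optional


def n_pair_copulas(d: int, trunc_level: Optional[int] = None) -> int:
    """Count pair-copulas in an R-vine on d variables.

    Tree i (i = 1, ..., d - 1) has d - i edges. With truncation at
    level t, only trees 1..t contribute (assumed convention).
    """
    if d < 2:
        raise ValueError("an R-vine needs at least two variables")
    t = d - 1 if trunc_level is None else trunc_level
    if not 0 <= t <= d - 1:
        raise ValueError("truncation level must lie in [0, d - 1]")
    return sum(d - i for i in range(1, t + 1))


def n_mixture_pair_copulas(k: int, d: int,
                           trunc_level: Optional[int] = None) -> int:
    """Pair-copulas in a k-component mixture, assuming one vine per component."""
    return k * n_pair_copulas(d, trunc_level)


def n_after_threshold(abs_taus: Iterable[float], threshold: float) -> int:
    """Count pair-copulas kept when those with |tau| < threshold are dropped.

    Assumed thresholding rule; the source may use a different criterion.
    """
    return sum(1 for tau in abs_taus if abs(tau) >= threshold)


if __name__ == "__main__":
    d, k = 50, 3
    print("full vine:", n_pair_copulas(d))
    print("truncated at level 2:", n_pair_copulas(d, 2))
    print("mixture, full:", n_mixture_pair_copulas(k, d))
    print("mixture, truncated at level 2:", n_mixture_pair_copulas(k, d, 2))
```

The count grows quadratically in d for the full vine but only linearly for a fixed truncation level, which is the kind of saving the sparse model classes aim at.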
2018
978-9963-2227-5-9
Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1013532