A Comparative Study of Various Probability Density Estimation Methods for Data Analysis

VALLE, MAURIZIO
2008-01-01

Abstract

Estimating a probability density function (PDF) is a task of primary importance in many contexts, including Bayesian learning and novelty detection. Despite the wide variety of methods available for estimating PDFs, only a few are widely used in practice by data analysts. Among the most commonly used are histograms, Parzen windows, vector-quantization-based Parzen estimators, and finite Gaussian mixtures. This paper compares these estimation methods from a practical point of view, i.e. when the user is faced with various requirements from the application. In particular, it addresses which method to use when the learning sample is large or small, the computational complexity resulting from the choice (by cross-validation) of external parameters such as the number of kernels and their widths in kernel mixture models, and issues such as the robustness to initial conditions. The expected behaviour of the estimation algorithms is derived from an algorithmic perspective; numerical experiments illustrate these results.
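To make the role of cross-validated kernel widths concrete, the following is a minimal sketch of a one-dimensional Parzen window estimator whose width is chosen by leave-one-out log-likelihood. It is an illustration under stated assumptions, not the procedure used in the paper: the Gaussian kernel, the candidate width grid, the synthetic sample, and all function names (`parzen_density`, `loo_log_likelihood`, `select_width`) are hypothetical choices made here.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density evaluated at u (assumed kernel choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, sample, width):
    """Parzen window estimate at x: average of kernels centred on each sample point."""
    total = sum(gaussian_kernel((x - xi) / width) for xi in sample)
    return total / (len(sample) * width)


def loo_log_likelihood(sample, width):
    """Leave-one-out log-likelihood of the sample for a given kernel width."""
    n = len(sample)
    log_lik = 0.0
    for i, xi in enumerate(sample):
        # Density at xi estimated from all points except xi itself.
        others = sum(gaussian_kernel((xi - xj) / width)
                     for j, xj in enumerate(sample) if j != i)
        density = others / ((n - 1) * width)
        log_lik += math.log(max(density, 1e-300))  # guard against log(0)
    return log_lik


def select_width(sample, candidates):
    """Return the candidate width with the highest leave-one-out log-likelihood."""
    return max(candidates, key=lambda h: loo_log_likelihood(sample, h))


# Synthetic data and width grid: illustrative assumptions only.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
best_width = select_width(sample, candidates)
print(best_width, parzen_density(0.0, sample, best_width))
```

In this sketch each candidate width costs a number of kernel evaluations quadratic in the sample size, which is one simple way to see why the choice of external parameters by cross-validation weighs on the overall computational cost when the learning sample is large.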

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/220773