Biclustering by resampling

MASULLI, FRANCESCO; ROVETTA, STEFANO
2010-01-01

Abstract

The search for similarities in large data sets plays an important role in many scientific fields, because it makes it possible to classify data without explicit prior information about them. Researchers often use clustering to group data with respect to all patterns and conditions together. In recent years, however, biclustering has been proposed as an analysis tool and applied to many specific problems. Biclustering algorithms not only classify data with respect to selected conditions, but also identify the conditions under which the data can be classified more precisely. We recently proposed a biclustering technique based on the Possibilistic Clustering paradigm (the PBC algorithm) [1], which finds one bicluster at a time. In this paper we propose an improvement to the Possibilistic Biclustering algorithm (PBC Bagging) that finds several biclusters by using the statistical method of bootstrap aggregation. We applied the algorithm to a synthetic data set and to the Yeast dataset, obtaining fast convergence and good-quality solutions. A comparison with the original PBC method is also presented.
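
To make the idea of bootstrap aggregation around a single-bicluster finder more concrete, here is a minimal Python sketch. It is not the PBC or PBC Bagging algorithm, whose details the abstract does not give. The names `find_one_bicluster`, `bagged_bicluster`, and `make_toy_data`, the median-based selection rule, the tolerance, the vote threshold, and the toy matrix are all assumptions made for illustration. The sketch only shows the resample-and-vote mechanic and returns one consensus bicluster; how the paper obtains several biclusters is not covered here.

```python
import random
from statistics import median


def find_one_bicluster(data, sample_rows, n_cols=2, tol=1.0):
    """Hypothetical stand-in for a single-bicluster finder (NOT the PBC algorithm).

    Picks the n_cols columns with the smallest spread over the sampled rows,
    then keeps every row of the full data set that lies within tol of the
    sample median on all of those columns.
    """
    centre, spread = [], []
    for c in range(len(data[0])):
        values = [data[r][c] for r in sample_rows]
        m = median(values)
        centre.append(m)
        # Mean absolute deviation from the median, used as the spread measure.
        spread.append(sum(abs(v - m) for v in values) / len(values))
    cols = sorted(range(len(spread)), key=spread.__getitem__)[:n_cols]
    rows = {r for r in range(len(data))
            if all(abs(data[r][c] - centre[c]) <= tol for c in cols)}
    return rows, set(cols)


def bagged_bicluster(data, n_replicates=50, threshold=0.5, seed=0):
    """Bootstrap aggregation: resample rows with replacement, rerun the finder,
    and keep rows and columns selected in more than `threshold` of the runs."""
    rng = random.Random(seed)
    n_rows, n_features = len(data), len(data[0])
    row_votes = [0] * n_rows
    col_votes = [0] * n_features
    for _ in range(n_replicates):
        # One bootstrap sample: n_rows row indices drawn with replacement.
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]
        rows, cols = find_one_bicluster(data, sample)
        for r in rows:
            row_votes[r] += 1
        for c in cols:
            col_votes[c] += 1
    cutoff = threshold * n_replicates
    return ({r for r, v in enumerate(row_votes) if v > cutoff},
            {c for c, v in enumerate(col_votes) if v > cutoff})


def make_toy_data():
    """Assumed toy matrix: rows 0-19 are nearly constant on columns 0 and 1."""
    data = []
    for r in range(30):
        noisy = [float((7 * r) % 20), float((11 * r + 3) % 20)]
        if r < 20:
            planted = [5.0 + 0.1 * (r % 3 - 1)] * 2
        else:
            planted = [0.0, 10.0] if r % 2 == 0 else [10.0, 0.0]
        data.append(planted + noisy)
    return data


if __name__ == "__main__":
    rows, cols = bagged_bicluster(make_toy_data())
    print(sorted(rows), sorted(cols))
```

On this toy matrix the consensus should recover the planted block, that is rows 0 to 19 on columns 0 and 1. Voting across replicates is what makes the result insensitive to any single unlucky resample, which is the general motivation for bagging.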
Year: 2010
ISBN: 9788895272870
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/259633