Complexity of Data Subsets Generated by the Random Subspace Method: an Experimental Investigation

We report the results of an experimental investigation into the complexity of data subsets generated by the Random Subspace method. The main aim of this study is to analyse how much the complexity varies among the generated subsets. Four measures of complexity were used: three from [4], namely the minimal spanning tree (MST), the adherence subsets measure (ADH) and the maximal feature efficiency (MFE), and a cluster label consistency measure (CLC) proposed in [7]. Our results with the UCI “wine” data set relate the variability in data complexity to the number of features used and to the presence of redundant features.
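As a rough illustration of how data subsets of this kind can be generated, the sketch below draws random feature subsets and projects a data matrix onto each of them. It is a minimal sketch under stated assumptions, not the authors' experimental code: the function names `random_subspaces` and `project`, the toy data, and the chosen subset size and count are all hypothetical, and none of the four complexity measures is implemented here.

```python
import random


def random_subspaces(n_features, subset_size, n_subsets, seed=0):
    """Draw n_subsets feature-index subsets of size subset_size.

    Within one subset, features are sampled without replacement;
    different subsets are drawn independently and may overlap.
    """
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(n_features), subset_size))
        for _ in range(n_subsets)
    ]


def project(rows, feature_idx):
    """Keep only the selected feature columns of every sample."""
    return [[row[j] for j in feature_idx] for row in rows]


# Hypothetical toy data: 4 samples with 5 features each.
toy = [[float(10 * i + j) for j in range(5)] for i in range(4)]

# Assumed settings for illustration: 4 subsets of 3 features each.
subspaces = random_subspaces(n_features=5, subset_size=3, n_subsets=4)
subsets = [project(toy, idx) for idx in subspaces]
```

Each element of `subsets` is one projected copy of the data. A complexity measure could then be evaluated on every copy and its spread across copies compared for different values of `subset_size`.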

Kuncheva, L. I.; Roli, Fabio; Marcialis, Gian Luca; Shipp, C. A.

2001-01-01

Year: 2001
ISBN: 978-3-540-42284-6
Appears in types: 04.01 - Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1086103
