Class modeling techniques, classic and new, for old and new problems

Forina, Michele;Oliveri, Paolo;Lanteri, Silvia;Casale, Monica
2008

Abstract

Class-modeling techniques, both classic and recent, are studied with special reference to new applications involving data sets characterized by many variables, frequently noisy ones of no importance in characterizing the studied class. Four techniques are compared: UNEQ (based on the hypothesis of a multivariate normal distribution and on Hotelling's T² statistic), SIMCA (with a model built on the class principal components), POTFUN (Potential Functions Modeling, where the probability distribution is estimated by means of potential functions), and MRM (Multivariate Range Modeling, where the model is obtained from the ranges of the original variables and of discriminant functions). The comparison uses the sensitivities and specificities of the models, evaluated both by cross-validation and with the model forced to accept all the objects of the modeled category. The parameters used to evaluate the performance of class-modeling techniques are critically reviewed. The performance of the techniques, in both classification and modeling, has been evaluated on real data sets, using the original variables and subsets of variables obtained after elimination of nondiscriminant variables. The effects of noisy variables and of deviations from the underlying hypotheses are discussed.
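
As a rough illustration of the two evaluation modes mentioned in the abstract, the sketch below fits a simplified range model for one class and reports its sensitivity (fraction of class objects accepted) and specificity (fraction of other-class objects rejected). It is not the authors' implementation: it uses only the ranges of the original variables and omits the discriminant functions that MRM also relies on. The function names and the two small data sets are hypothetical, chosen only to keep the example short.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]
RangeModel = List[Tuple[float, float]]


def fit_range_model(objects: List[Row]) -> RangeModel:
    """Return the (min, max) range of every variable over the class objects.

    By construction this model accepts every object it was fitted on.
    """
    columns = list(zip(*objects))
    return [(min(col), max(col)) for col in columns]


def accepts(model: RangeModel, obj: Row) -> bool:
    """An object is accepted when every variable lies inside its class range."""
    return all(lo <= x <= hi for (lo, hi), x in zip(model, obj))


def sensitivity(model: RangeModel, class_objects: List[Row]) -> float:
    """Fraction of objects of the modeled class that the model accepts."""
    return sum(accepts(model, o) for o in class_objects) / len(class_objects)


def specificity(model: RangeModel, other_objects: List[Row]) -> float:
    """Fraction of objects of other classes that the model rejects."""
    return sum(not accepts(model, o) for o in other_objects) / len(other_objects)


def loo_sensitivity(class_objects: List[Row]) -> float:
    """Leave-one-out sensitivity: refit without each object, then test it."""
    accepted = 0
    for i, held_out in enumerate(class_objects):
        training = class_objects[:i] + class_objects[i + 1:]
        accepted += accepts(fit_range_model(training), held_out)
    return accepted / len(class_objects)


# Hypothetical two-variable data: class_a is modeled, class_b is the other class.
class_a = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.2), (1.2, 3.0), (1.8, 2.8)]
class_b = [(3.0, 2.5), (1.6, 4.0), (1.4, 2.6), (2.5, 1.0)]

model = fit_range_model(class_a)
print("sensitivity (model accepts all class objects):", sensitivity(model, class_a))
print("sensitivity (leave-one-out):", loo_sensitivity(class_a))
print("specificity against class_b:", specificity(model, class_b))
```

With so few objects, the leave-one-out sensitivity falls well below the value obtained when the model is fitted on all class objects, because every object lying on the boundary of a range is rejected once it is held out.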

Use this identifier to cite or link to this document: http://hdl.handle.net/11567/246233