Alternative common bases and signal compression for wavelets application in chemometrics

FORINA, MICHELE; OLIVERI, PAOLO; CASALE, MONICA
2011-01-01

Abstract

Data sets are usually represented or compressed in the wavelet space so as to retain the maximum variance of the original or pretreated data, as in compression by principal components. Representing a number of objects together in the wavelet space requires a common basis, which is usually obtained from the variance spectrum or the variance wavelet tree. This study suggests alternative common bases for both classification and regression problems. For classification or class-modeling, the suggested common bases are built on the spectrum of the Fisher weights (a measure of the ratio of between-class to within-class variance) or on the spectrum of the SIMCA discriminant weights. For regression, they are obtained from the correlation spectrum (the correlation coefficients of the predictor variables with a response variable) or from the PLS (Partial Least Squares regression) importance of the predictors (the product of the absolute value of a predictor's regression coefficient in the PLS model and its standard deviation). Other alternative strategies apply Gram-Schmidt supervised orthogonalization to the wavelet coefficients. The results indicate that, in both classification and regression, the information retained after compression in the wavelet space with the alternative bases can be more efficient than that retained with a common basis obtained by variance.
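
The difference between a variance-based common basis and a supervised one can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the four toy signals, the two-class form of the Fisher weight, (m1 - m2)^2 / (v1 + v2), and the function names. Each signal is transformed, every wavelet coefficient is scored across objects, and the best-scoring coefficients are kept as the common basis.

```python
import math
from statistics import mean, pvariance

def haar_transform(signal):
    """Full orthonormal Haar decomposition (length must be a power of two)."""
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Coarser detail coefficients are placed in front of finer ones.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details

def variance_spectrum(rows):
    """Variance of each wavelet coefficient across all objects."""
    return [pvariance(col) for col in zip(*rows)]

def fisher_spectrum(rows, labels):
    """Two-class Fisher weight per coefficient (assumed form, see lead-in)."""
    first, second = sorted(set(labels))
    a = [r for r, lab in zip(rows, labels) if lab == first]
    b = [r for r, lab in zip(rows, labels) if lab == second]
    weights = []
    for col_a, col_b in zip(zip(*a), zip(*b)):
        within = pvariance(col_a) + pvariance(col_b)
        between = (mean(col_a) - mean(col_b)) ** 2
        weights.append(between / within if within > 0 else 0.0)
    return weights

def correlation_spectrum(rows, y):
    """Absolute Pearson correlation of each coefficient with the response y."""
    my = mean(y)
    syy = sum((v - my) ** 2 for v in y)
    scores = []
    for col in zip(*rows):
        mx = mean(col)
        sxx = sum((x - mx) ** 2 for x in col)
        sxy = sum((x - mx) * (v - my) for x, v in zip(col, y))
        scores.append(abs(sxy) / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0)
    return scores

def common_basis(spectrum, k):
    """Indices of the k highest-scoring coefficients, in ascending order."""
    ranked = sorted(range(len(spectrum)), key=lambda i: spectrum[i], reverse=True)
    return sorted(ranked[:k])

# Toy data: a large class-independent oscillation in the first half of each
# signal, and a small class-dependent one in the second half.
signals = [
    [5.0, -5.0, 1.0, -1.0],   # class "A"
    [-5.0, 5.0, 1.2, -1.2],   # class "A"
    [3.0, -3.0, -1.0, 1.0],   # class "B"
    [-3.0, 3.0, -1.2, 1.2],   # class "B"
]
labels = ["A", "A", "B", "B"]
response = [1.0, 1.2, -1.0, -1.2]  # hypothetical response for the regression case

coeffs = [haar_transform(s) for s in signals]
variance_basis = common_basis(variance_spectrum(coeffs), 1)
fisher_basis = common_basis(fisher_spectrum(coeffs, labels), 1)
correlation_basis = common_basis(correlation_spectrum(coeffs, response), 1)

print(variance_basis, fisher_basis, correlation_basis)  # [2] [3] [3]
```

On this toy data the variance spectrum selects the coefficient that carries the large nuisance oscillation, while the Fisher and correlation spectra select the low-variance coefficient that actually separates the classes or tracks the response. The SIMCA, PLS-importance and Gram-Schmidt variants mentioned in the abstract are not sketched here.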

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/590947