Unlabeled Patterns to Tighten Rademacher Complexity Error Bounds for Kernel Classifiers

Oneto, Luca; Ghio, Alessandro; Anguita, Davide; Ridella, Sandro

doi:10.1016/j.patrec.2013.04.027

We derive in this work new upper bounds for estimating the generalization error of kernel classifiers, that is the misclassification rate that the models will perform on new and previously unseen data. Though this paper is more targeted towards the error estimation topic, the generalization error can be obviously exploited, in practice, for model selection purposes as well. The derived bounds are based on Rademacher complexity and result to be particularly useful when a set of unlabeled samples are available, in addition to the (labeled) training examples: we will show that, by exploiting further unlabeled patterns, the confidence term of the conventional Rademacher complexity bound can be reduced by a factor of three. Moreover, the availability of unlabeled examples allows also to obtain further improvements by building localized versions of the hypothesis class containing the optimal classifier.