Paper

The Curse of Highly Variable Functions for Local Kernel Machines

We present a series of theoretical arguments supporting the claim that a large class of modern learning algorithms that rely solely on the smooth-ness prior – with similarity between examples expressed with a local kernel – are sensitive to the curse of dimensionality, or more precisely to the variability of the target. Our discussion covers supervised, semi-supervised and unsupervised learning algorithms. These algorithms are found to be local in the sense that crucial properties of the learned func-tion at x depend mostly on the neighbors of x in the training set. This makes them sensitive to the curse of dimensionality, well studied for classical non-parametric statistical learning. We show in the case of the Gaussian kernel that when the function to be learned has many variations, these algorithms require a number of training examples proportional to the number of variations, which could be large even though there may ex-ist short descriptions of the target function, i.e. their Kolmogorov com-plexity may be low. This suggests that there exist non-local learning algorithms that at least have the potential to learn about such structured but apparently complex functions (because locally they have many vari-ations), while not using very specific prior domain knowledge. 1

http://www.iro.umontreal.ca/~lisa/pointeurs/local_failure-nips-submission.pdfPublished 2005-12-05Paper link

Authors: Yoshua Bengio · Olivier Delalleau · Nicolas Le Roux

Topics

Relevant entities

People

Related coverage

Linked coverage will appear here.

Related events

Linked events will appear here.

Related discussions

Related discussion nodes will appear here.