Deep Convolutional Networks are Hierarchical Kernel Machines

IRIS

In i-theory a typical layer of a hierarchical architecture consists of HW modules pooling the dot products of the inputs to the layer with the transformations of a few templates under a group. Such layers include as special cases the convolutional layers of Deep Convolutional Networks (DCNs) as well as the non-convolutional layers (when the group contains only the identity). Rectifying nonlinearities – which are used by present-day DCNs – are one of the several nonlinearities admitted by i-theory for the HW module. We discuss here the equivalence between group averages of linear combinations of rectifying nonlinearities and an associated kernel. This property implies that present-day DCNs can be exactly equivalent to a hierarchy of kernel machines with pooling and nonpooling layers. Finally, we describe a conjecture for theoretically understanding hierarchies of such modules. A main consequence of the conjecture is that hierarchies of trained HW modules minimize memory requirements while computing a selective and invariant representation.

Deep Convolutional Networks are Hierarchical Kernel Machines

Fabio Anselmi;Lorenzo Rosasco;Cheston Tan;Tomaso A. Poggio

2015-01-01

Abstract

In i-theory a typical layer of a hierarchical architecture consists of HW modules pooling the dot products of the inputs to the layer with the transformations of a few templates under a group. Such layers include as special cases the convolutional layers of Deep Convolutional Networks (DCNs) as well as the non-convolutional layers (when the group contains only the identity). Rectifying nonlinearities – which are used by present-day DCNs – are one of the several nonlinearities admitted by i-theory for the HW module. We discuss here the equivalence between group averages of linear combinations of rectifying nonlinearities and an associated kernel. This property implies that present-day DCNs can be exactly equivalent to a hierarchy of kernel machines with pooling and nonpooling layers. Finally, we describe a conjecture for theoretically understanding hierarchies of such modules. A main consequence of the conjecture is that hierarchies of trained HW modules minimize memory requirements while computing a selective and invariant representation.

Scheda breve

Scheda completa

Scheda completa (DC)

Anno

2015

Appare nelle tipologie:

07.12 - Altro

File in questo prodotto:

File	Dimensione	Formato
11567-888543rev5.pdf accesso aperto Descrizione: Articolo principale (working paper) revision 5 Tipologia: Documento in versione editoriale Dimensione 975.65 kB Formato Adobe PDF Visualizza/Apri	975.65 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11567/888543

Citazioni

ND

ND

ND

social impact