A Note on Lazy Training in Supervised Differentiable Programming

Lenaic Chizat; Francis Bach

Preprint, Working Paper. Year: 2019

Lenaic Chizat

Role: Author
PersonId: 19586
IdHAL: lenaic-chizat
ORCID: 0000-0002-6553-1211
IdRef: 23033802X

Statistical Machine Learning and Parsimony

Francis Bach

Role: Author
PersonId: 863086

Laboratoire d'informatique de l'école normale supérieure

Statistical Machine Learning and Parsimony

Abstract

A series of recent theoretical works has shown that strongly over-parameterized neural networks trained with gradient-based methods can converge linearly to zero training loss while their parameters hardly vary. In this note, our goal is to exhibit the simple structure behind these results. In a simplified setting, we prove that "lazy training" essentially solves a kernel regression. We also show that this behavior is due less to over-parameterization than to a choice of scaling, often implicit, that allows the model to be linearized around its initialization. These theoretical results, complemented by simple numerical experiments, make it seem unlikely that "lazy training" is behind the many successes of neural networks in high-dimensional tasks.
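The role of scaling described in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper: it assumes a scaled objective of the form F_alpha(w) = ||alpha * h(w) - y||^2 / (2 n alpha^2), a toy two-layer tanh network, and a symmetric initialization chosen so that h(w_0) = 0. The function name `relative_displacement`, the synthetic data, and all hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np


def relative_displacement(alpha, steps=300, lr=0.1, half_width=50, n=20, d=3, seed=0):
    """Train alpha * h(w) by gradient descent; return ||w_T - w_0|| / ||w_0||."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    y = np.sin(X[:, 0])  # toy regression target (hypothetical data)

    # Symmetric initialization (an assumption of this sketch): every hidden
    # unit is duplicated with the opposite output weight, so h(w_0) = 0.
    W_half = rng.standard_normal((half_width, d))
    a_half = rng.standard_normal(half_width)
    W = np.vstack([W_half, W_half])
    a = np.concatenate([a_half, -a_half])
    W0, a0 = W.copy(), a.copy()
    scale = 1.0 / np.sqrt(2 * half_width)

    for _ in range(steps):
        act = np.tanh(X @ W.T)             # hidden activations, shape (n, m)
        h = scale * (act @ a)              # unscaled model output h(w), shape (n,)
        g = (alpha * h - y) / (n * alpha)  # dF_alpha/dh for the scaled objective
        grad_a = scale * (act.T @ g)
        grad_W = scale * ((g[:, None] * (1.0 - act**2) * a[None, :]).T @ X)
        a = a - lr * grad_a
        W = W - lr * grad_W

    moved = np.sqrt(np.sum((W - W0) ** 2) + np.sum((a - a0) ** 2))
    start = np.sqrt(np.sum(W0 ** 2) + np.sum(a0 ** 2))
    return float(moved / start)


if __name__ == "__main__":
    for alpha in (1.0, 10.0, 100.0):
        print(alpha, relative_displacement(alpha))
```

Under these assumptions, the gradient with respect to the parameters carries a factor 1/alpha, so a larger alpha should leave the parameters closer to their initial values for the same number of steps, which is the sense in which the scaled model stays near its linearization.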

Keywords

Gradient flow; Kernel regression; Neural network optimization

Domains

Optimization and Control [math.OC]

Main file

chizatbach2018lazy.pdf (760.06 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01945578

Submitted on: Thursday, February 21, 2019, 07:56:20

Last modified on: Tuesday, October 3, 2023, 17:18:04

Long-term archiving on: Wednesday, May 22, 2019, 13:10:46

Dates and versions

hal-01945578, version 1 (05-12-2018)

hal-01945578, version 2 (11-12-2018)

hal-01945578, version 3 (21-02-2019)

hal-01945578, version 4 (08-06-2019)

hal-01945578, version 5 (18-06-2019)

hal-01945578, version 6 (07-01-2020)

Identifiers

HAL Id: hal-01945578, version 3
arXiv: 1812.07956

Cite

Lenaic Chizat, Francis Bach. A Note on Lazy Training in Supervised Differentiable Programming. 2019. ⟨hal-01945578v3⟩

5427 Views

4449 Downloads
