3620 South Vermont Avenue, Los Angeles, CA 90089

Elliot Paquette, McGill University


Title: Random matrix theory for high dimensional optimization, and an application to scaling laws

 
Abstract: We describe a program of analysis for stochastic gradient methods on high-dimensional random objectives. We give assumptions under which the loss curves are universal, in that they can be completely described in terms of an underlying covariance structure of the problem. Furthermore, we give a description of these loss curves that can be analyzed precisely.
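To illustrate the universality claim, here is a minimal sketch (our own construction, not the speakers'): one-pass SGD on a least-squares objective, run with two data distributions that share the same covariance, produces loss curves that nearly coincide in high dimension. The spectrum, step size, and dimensions below are illustrative assumptions.

import numpy as np

# Sketch: SGD loss curves depend (approximately) only on the data covariance.
rng = np.random.default_rng(1)
d, steps, gamma = 1000, 3000, 0.1
lam = np.arange(1, d + 1) ** -1.0     # shared covariance spectrum (assumed)
b = rng.normal(size=d)
b /= np.linalg.norm(b)                # fixed target direction

def run_sgd(sampler):
    theta = np.zeros(d)
    losses = []
    for t in range(steps):
        x = sampler() * np.sqrt(lam)  # fresh sample with covariance diag(lam)
        err = x @ theta - x @ b       # pointwise residual
        theta -= gamma * err * x      # one-pass SGD step
        if t % 50 == 0:
            r = theta - b
            losses.append(0.5 * np.sum(lam * r * r))  # population loss
    return np.array(losses)

gaussian = run_sgd(lambda: rng.normal(size=d))
rademacher = run_sgd(lambda: rng.choice([-1.0, 1.0], size=d))
# For large d, the two loss curves track each other closely.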
 
As a motivating application, we show how this analysis applies to the power-law random features model. This is a simple two-hyperparameter family of optimization problems that displays five distinct phases of SGD loss curves; the phases are determined by the relative complexities of the target and the data distribution, and by whether each is ‘high-dimensional’ (which can be precisely defined in this context). In each phase, we can also give, for a given compute budget, the optimal random-feature dimensionality.
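The following is a minimal sketch of a power-law random features setup under assumed normalizations (data covariance eigenvalues j**(-2*alpha), target coefficients j**(-beta), Gaussian feature matrix); the speakers' exact definitions may differ. One-pass SGD on the streaming least-squares loss produces the loss curves in question, and sweeping the two exponents alpha and beta traces out the phases.

import numpy as np

# Hypothetical power-law random features (PLRF) sketch; all normalizations
# below are illustrative assumptions, not the authors' exact construction.
rng = np.random.default_rng(0)

d, v = 2000, 400          # data dimension, random-feature dimension (model size)
alpha, beta = 1.0, 0.5    # the two hyperparameters: data and target power laws
gamma = 0.5               # SGD step size (assumed small enough for stability)
steps = 5000

j = np.arange(1, d + 1)
lam = j ** (-2 * alpha)                    # data covariance eigenvalues
b = j ** (-beta * 1.0)                     # target coefficients in the eigenbasis
W = rng.normal(size=(v, d)) / np.sqrt(d)   # random feature matrix

def population_loss(theta):
    # P(theta) = 1/2 * (W^T theta - b)^T diag(lam) (W^T theta - b)
    r = W.T @ theta - b
    return 0.5 * np.sum(lam * r * r)

theta = np.zeros(v)
losses = []
for t in range(steps):
    x = rng.normal(size=d) * np.sqrt(lam)  # fresh sample, x ~ N(0, diag(lam))
    feat = W @ x                           # random features of the sample
    err = feat @ theta - b @ x             # pointwise residual
    theta -= gamma * err * feat            # one-pass SGD step
    if t % 100 == 0:
        losses.append(population_loss(theta))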
 
Joint work with Courtney Paquette (McGill & Google DeepMind), Jeffrey Pennington (Google DeepMind), and Lechao Xiao (Google DeepMind).
