About the lab
We are the lab of Michael DeWeese at UC Berkeley. Our interests fall into three rough categories:
Nonequilibrium statistical mechanics. Understanding and designing biomolecules and molecular-scale machines will ultimately require a deep understanding of statistical mechanics far from equilibrium, and recent years have brought major breakthroughs in that direction.
To that end, we're developing work-energy theorems for active matter and optimal control protocols for fluctuating systems, often bringing in techniques from control theory and Riemannian geometry. Lab members in the statistical physics sector include Adam Frim and Adrianne Zhong.
Machine learning theory. Deep neural networks have enabled technological wonders from machine translation to image generation, but, remarkably, nobody has a principled understanding of how they work and what they can do. To fill this gap, we're developing a first-principles theoretical understanding of neural nets and related machine learning methods, often finding use for tools and concepts from statistical physics. We're also interested in sampling algorithms. Lab members in the ML contingent include Jamie Simon and Dhruva Karkada.
Systems neuroscience. Despite the wealth of neural data acquired in recent years, our real understanding of how the brain works remains rudimentary. To make progress towards this understanding, we develop biologically plausible algorithms to model sensory processing and other forms of computation. Our theories typically rely on coding principles, such as maximizing sparseness or information flow, which are similar to familiar concepts from physics like minimizing free energy or maximizing entropy. Our models are designed to clarify the computational roles of different neural populations and to provide specific, falsifiable experimental predictions about the structure and activity patterns of biological neural networks. Lab members in the neuroscience regiment include Andrew Ligeralde and Eduardo Sandoval.
Reverse engineering the neural tangent kernel
A first-principles method for the design of fully-connected architectures
Much of our understanding of artificial neural networks stems from the fact that, in the infinite-width limit, they turn out to be equivalent to kernel regression, a class of simple models. Given a wide network architecture, it's well known how to find the equivalent kernel method, allowing us to study popular models in the infinite-width limit. We invert this mapping for fully-connected nets (FCNs), allowing one to start from a desired rotation-invariant kernel and design a network (i.e., choose an activation function) to achieve it. Remarkably, achieving any such kernel requires only one hidden layer, raising questions about conventional wisdom on the benefits of depth. This inverse mapping enables surprising experiments, such as designing a one-hidden-layer (1HL) FCN that trains and generalizes like a deep ReLU FCN. This ability to design nets with desired kernels is a step towards deriving good net architectures from first principles, a longtime dream of the field of machine learning.
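To give a flavor of what "choosing an activation function to achieve a kernel" can look like, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure from our paper: it handles only the simpler random-feature kernel K(c) = E[phi(u) phi(v)] of a wide one-hidden-layer net with unit-norm inputs (u and v standard normal with correlation c), not the full neural tangent kernel, and it assumes the target kernel is a finite power series in c with nonnegative coefficients. The function names `activation_coeffs` and `kernel_from_activation` are hypothetical. The sketch relies on the standard fact that normalized Hermite polynomials h_k = He_k / sqrt(k!) satisfy E[h_m(u) h_n(v)] = c^n when m = n and 0 otherwise.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def activation_coeffs(kernel_coeffs):
    """Given b_k >= 0 with K(c) = sum_k b_k c^k, return the coefficients of
    an activation phi(z) = sum_k a_k He_k(z) whose kernel is K.

    Uses a_k = sqrt(b_k / k!), i.e. weight sqrt(b_k) on the normalized
    Hermite polynomial h_k = He_k / sqrt(k!). (Illustrative assumption:
    all signs are taken positive; other sign choices give the same kernel.)
    """
    b = np.asarray(kernel_coeffs, dtype=float)
    if np.any(b < 0):
        raise ValueError("kernel power-series coefficients must be nonnegative")
    return np.array([math.sqrt(bk / math.factorial(k)) for k, bk in enumerate(b)])


def kernel_from_activation(he_coeffs, c, n_nodes=20):
    """Evaluate E[phi(u) phi(v)] for standard normal u, v with correlation c,
    using a 2D Gauss-Hermite quadrature (exact for low-degree polynomials)."""
    x, w = hermegauss(n_nodes)          # nodes/weights for weight exp(-x^2 / 2)
    w = w / math.sqrt(2.0 * math.pi)    # rescale so the weights sum to 1
    s = math.sqrt(max(0.0, 1.0 - c * c))
    u = x[:, None]                      # u = x
    v = c * x[:, None] + s * x[None, :]  # v = c x + sqrt(1 - c^2) y
    phi_u = hermeval(u, he_coeffs)
    phi_v = hermeval(v, he_coeffs)
    return float(np.sum(w[:, None] * w[None, :] * phi_u * phi_v))


if __name__ == "__main__":
    # Hypothetical target kernel: K(c) = 0.1 + 0.5 c + 0.4 c^3
    target = [0.1, 0.5, 0.0, 0.4]
    a = activation_coeffs(target)
    for c in (-0.7, 0.0, 0.5, 1.0):
        desired = sum(bk * c**k for k, bk in enumerate(target))
        print(c, desired, kernel_from_activation(a, c))
```

The two printed kernel values in each row should agree up to floating-point error, which is the sense in which the designed activation "achieves" the target kernel in this simplified setting. Extending the idea to the neural tangent kernel requires also accounting for the derivative of the activation, which this sketch deliberately leaves out.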