Search Results for author: Clayton Sanford

Found 9 papers, 4 papers with code

Transformers, parallel computation, and logarithmic depth

1 code implementation • 14 Feb 2024 • Clayton Sanford, Daniel Hsu, Matus Telgarsky

We show that a constant number of self-attention layers can efficiently simulate, and be simulated by, a constant number of communication rounds of Massively Parallel Computation.
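
For readers who want to fix ideas, below is a minimal single-head self-attention layer in NumPy. It only illustrates what one such layer computes on a length-n sequence; it is not the simulation construction from the paper, and the function name, dimensions, and random weights are arbitrary choices.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Single-head self-attention on a sequence X of shape (n, d).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # (n, n) pairwise attention scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ V                                # every position attends to all others

rng = np.random.default_rng(0)
n, d = 8, 16
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)                   # shape (8, 16)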

Learning Single-Index Models with Shallow Neural Networks

no code implementations • 27 Oct 2022 • Alberto Bietti, Joan Bruna, Clayton Sanford, Min Jae Song

Single-index models are a class of functions given by an unknown univariate "link" function applied to an unknown one-dimensional projection of the input.
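
For concreteness (a standard definition, not a quote from the paper): a single-index model on $\mathbb{R}^d$ has the form $f(x) = \phi(\langle w, x \rangle)$, where both the direction $w \in \mathbb{R}^d$ and the univariate link $\phi : \mathbb{R} \to \mathbb{R}$ are unknown; for example, $\phi$ could be a ReLU or a low-degree polynomial.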

On Scrambling Phenomena for Randomly Initialized Recurrent Networks

1 code implementation • 11 Oct 2022 • Vaggos Chatziafratis, Ioannis Panageas, Clayton Sanford, Stelios Andrew Stavroulakis

Recurrent Neural Networks (RNNs) frequently exhibit complicated dynamics, and their sensitivity to the initialization process often renders them notoriously hard to train.
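
A small numerical illustration of this sensitivity (NumPy assumed; the recurrent map below is a hand-built tent-like ReLU map chosen for illustration, not a construction taken from the paper):

import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def step(h):
    # One recurrent step: a tiny ReLU network computing a tent-like map on [0, 1].
    # Its slope has magnitude 1.99 everywhere, so nearby states separate exponentially.
    return 1.99 * relu(h) - 3.98 * relu(h - 0.5)

h1, h2 = 0.3141592, 0.3141593          # two initial states differing by 1e-7
for _ in range(40):
    h1, h2 = step(h1), step(h2)
print(abs(h1 - h2))                    # the tiny gap is typically amplified to order one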

Intrinsic dimensionality and generalization properties of the $\mathcal{R}$-norm inductive bias

1 code implementation • 10 Jun 2022 • Navid Ardeshir, Daniel Hsu, Clayton Sanford

We study the structural and statistical properties of $\mathcal{R}$-norm minimizing interpolants of datasets labeled by specific target functions.

Inductive Bias
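
As background (paraphrasing the standard definition rather than quoting the paper): for a two-layer ReLU network $f(x) = \sum_j a_j (\langle w_j, x \rangle - b_j)_+$ with $\|w_j\|_2 = 1$, the $\mathcal{R}$-norm of $f$ is, roughly, the smallest achievable $\sum_j |a_j|$ over all such representations (allowing infinite width), so an $\mathcal{R}$-norm minimizing interpolant is, in this sense, the "smoothest" network that fits the data exactly.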

Near-Optimal Statistical Query Lower Bounds for Agnostically Learning Intersections of Halfspaces with Gaussian Marginals

no code implementations • 10 Feb 2022 • Daniel Hsu, Clayton Sanford, Rocco Servedio, Emmanouil-Vasileios Vlatakis-Gkaragkounis

This lower bound is essentially best possible since an SQ algorithm of Klivans et al. (2008) agnostically learns this class to any constant excess error using $n^{O(\log k)}$ queries of tolerance $n^{-O(\log k)}$.
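
For reference (standard terminology, not quoted from the paper): an intersection of $k$ halfspaces over $\mathbb{R}^n$ is a function of the form $f(x) = \bigwedge_{i=1}^{k} \mathbb{1}[\langle w_i, x \rangle \ge \theta_i]$, i.e. $f(x) = 1$ exactly when $x$ satisfies all $k$ linear threshold constraints; agnostic learning asks for error close to that of the best such $f$ under the Gaussian input distribution.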

Expressivity of Neural Networks via Chaotic Itineraries beyond Sharkovsky's Theorem

no code implementations • 19 Oct 2021 • Clayton Sanford, Vaggos Chatziafratis

Given a target function $f$, how large must a neural network be in order to approximate $f$?

Support vector machines and linear regression coincide with very high-dimensional features

1 code implementation • NeurIPS 2021 • Navid Ardeshir, Clayton Sanford, Daniel Hsu

The support vector machine (SVM) and minimum Euclidean norm least squares regression are two fundamentally different approaches to fitting linear models, but they have recently been connected in models for very high-dimensional data through a phenomenon of support vector proliferation, where every training example used to fit an SVM becomes a support vector.

regression
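
A quick numerical companion to this claim (a sketch assuming NumPy and scikit-learn are available; the dimensions and the large C used to approximate a hard-margin SVM are illustrative choices, not taken from the paper):

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d = 30, 5000                        # far more features than training examples
X = rng.normal(size=(n, d))
y = rng.choice([-1.0, 1.0], size=n)    # arbitrary labels; the data is separable in high dimension

svm = SVC(kernel="linear", C=1e8).fit(X, y)   # large C approximates the hard-margin SVM
beta_svm = svm.coef_.ravel()
beta_ls = np.linalg.pinv(X) @ y               # minimum-norm least-squares interpolant of the labels

print("support vectors:", svm.support_.size, "of", n, "examples")
cos = beta_svm @ beta_ls / (np.linalg.norm(beta_svm) * np.linalg.norm(beta_ls))
print("cosine similarity of the two solutions:", cos)

When d exceeds n by this much, the fitted SVM typically reports all n examples as support vectors and the two coefficient vectors come out nearly parallel, which is the coincidence the paper studies.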

On the Approximation Power of Two-Layer Networks of Random ReLUs

no code implementations • 3 Feb 2021 • Daniel Hsu, Clayton Sanford, Rocco A. Servedio, Emmanouil-Vasileios Vlatakis-Gkaragkounis

This paper considers the following question: how well can depth-two ReLU networks with randomly initialized bottom-level weights represent smooth functions?

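A minimal sketch of this setting (assuming NumPy; the target function, width, and sampling domain are arbitrary choices, not the paper's): freeze random bottom-level ReLU weights and train only the top layer by least squares.

import numpy as np

rng = np.random.default_rng(0)
d, width, n = 2, 500, 1000

# Bottom layer: random, frozen ReLU features.
W = rng.normal(size=(width, d))
b = rng.normal(size=width)
features = lambda X: np.maximum(X @ W.T + b, 0.0)

# A smooth target function to approximate on a bounded domain.
target = lambda X: np.cos(X[:, 0]) * np.sin(X[:, 1])

X_train = rng.uniform(-1, 1, size=(n, d))
# Only the top-layer weights are fit, here by ordinary least squares.
a, *_ = np.linalg.lstsq(features(X_train), target(X_train), rcond=None)

X_test = rng.uniform(-1, 1, size=(n, d))
err = np.sqrt(np.mean((features(X_test) @ a - target(X_test)) ** 2))
print("test RMSE:", err)               # small for a smooth target in low dimension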

Uniform Convergence Bounds for Codec Selection

no code implementations • 18 Dec 2018 • Clayton Sanford, Cyrus Cousins, Eli Upfal

We frame the problem of selecting an optimal audio encoding scheme as a supervised learning task.

Selection bias
