Search Results for author: Max Simchowitz

Found 47 papers, 6 papers with code

Logarithmic Regret for Online Control with Adversarial Noise

no code implementations • ICML 2020 • Dylan J. Foster, Max Simchowitz

We consider the problem of online control in a known linear dynamical system subject to adversarial noise.

LEMMA

Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression

no code implementations • 17 Oct 2023 • Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang

This work studies training instabilities of behavior cloning with deep neural networks.

Continuous Control Text Generation

Robot Fleet Learning via Policy Merging

1 code implementation • 2 Oct 2023 • Lirui Wang, Kaiqing Zhang, Allan Zhou, Max Simchowitz, Russ Tedrake

We show that FLEET-MERGE consolidates the behavior of policies trained on 50 tasks in the Meta-World environment, with good performance on nearly all training tasks at test time.

Robot Manipulation

Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective

no code implementations • 12 Jul 2023 • Max Simchowitz, Abhishek Gupta, Kaiqing Zhang

Focusing on the special case where the labels are given by bilinear embeddings into a Hilbert space $H$: $\mathbb{E}[z \mid x, y] = \langle f_{\star}(x), g_{\star}(y)\rangle_{H}$, we aim to extrapolate to a test distribution domain that is $\textit{not}$ covered in training, i.e., achieving bilinear combinatorial extrapolation.

Matrix Completion

The Power of Learned Locally Linear Models for Nonlinear Policy Optimization

no code implementations • 16 May 2023 • Daniel Pfrommer, Max Simchowitz, Tyler Westenbroek, Nikolai Matni, Stephen Tu

A common pipeline in learning-based control is to iteratively estimate a model of system dynamics, and apply a trajectory optimization algorithm, e.g., $\mathtt{iLQR}$, on the learned model to minimize a target cost.

Learning to Extrapolate: A Transductive Approach

1 code implementation • 27 Apr 2023 • Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal

Machine learning systems, especially with overparameterized deep neural networks, can generalize to novel test instances drawn from the same distribution as the training data.

Imitation Learning

Statistical Learning under Heterogeneous Distribution Shift

no code implementations • 27 Feb 2023 • Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

We show that, when the class $F$ is "simpler" than $G$ (measured, e.g., in terms of its metric entropy), our predictor is more resilient to heterogeneous covariate shifts in which the shift in $\mathbf{x}$ is much greater than that in $\mathbf{y}$.

Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making

no code implementations • 10 Feb 2023 • Adam Block, Alexander Rakhlin, Max Simchowitz

Smoothed online learning has emerged as a popular framework to mitigate the substantial loss in statistical and computational complexity that arises when one moves from classical to adversarial learning.

Decision Making Econometrics

Smoothed Online Learning for Prediction in Piecewise Affine Systems

no code implementations • NeurIPS 2023 • Adam Block, Max Simchowitz, Russ Tedrake

The problem of piecewise affine (PWA) regression and planning is of foundational importance to the study of online learning, control, and robotics, where it provides a theoretically and empirically tractable setting to study systems undergoing sharp changes in the dynamics.

Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions

no code implementations • 25 May 2022 • Adam Block, Max Simchowitz

Due to the drastic gap in complexity between sequential and batch statistical learning, recent work has studied a smoothed sequential learning setting, where Nature is constrained to select contexts with density bounded by $1/\sigma$ with respect to a known measure $\mu$.

Globally Convergent Policy Search over Dynamic Filters for Output Estimation

no code implementations • 23 Feb 2022 • Jack Umenberger, Max Simchowitz, Juan C. Perdomo, Kaiqing Zhang, Russ Tedrake

In this paper, we provide a new perspective on this challenging problem based on the notion of $\textit{informativity}$, which intuitively requires that all components of a filter's internal state are representative of the true state of the underlying dynamical system.

Online Control of Unknown Time-Varying Dynamical Systems

no code implementations • NeurIPS 2021 • Edgar Minasyan, Paula Gradu, Max Simchowitz, Elad Hazan

On the positive side, we give an efficient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term.

Do Differentiable Simulators Give Better Policy Gradients?

no code implementations • 2 Feb 2022 • H. J. Terry Suh, Max Simchowitz, Kaiqing Zhang, Russ Tedrake

Differentiable simulators promise faster computation time for reinforcement learning by replacing zeroth-order gradient estimates of a stochastic objective with an estimate based on first-order gradients.

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

no code implementations • 26 Jan 2022 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$.

Reinforcement Learning (RL)

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

no code implementations • 7 Dec 2021 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making.

Decision Making reinforcement-learning +1

Stabilizing Dynamical Systems via Policy Gradient Methods

no code implementations • NeurIPS 2021 • Juan C. Perdomo, Jack Umenberger, Max Simchowitz

Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering.

Policy Gradient Methods

Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

no code implementations • 5 Aug 2021 • Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

reinforcement-learning Reinforcement Learning (RL)

Bayesian decision-making under misspecified priors with applications to meta-learning

no code implementations • NeurIPS 2021 • Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

Decision Making Meta-Learning +2

On the Stability of Nonlinear Receding Horizon Control: A Geometric Perspective

no code implementations • 27 Mar 2021 • Tyler Westenbroek, Max Simchowitz, Michael I. Jordan, S. Shankar Sastry

Crucially, this guarantee requires that state costs applied to the planning problems are in a certain sense 'compatible' with the global geometry of the system, and a simple counter-example demonstrates the necessity of this condition.

Towards a Dimension-Free Understanding of Adaptive Linear Control

no code implementations • 19 Mar 2021 • Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension.

Exploration and Incentives in Reinforcement Learning

no code implementations • 28 Feb 2021 • Max Simchowitz, Aleksandrs Slivkins

How do you incentivize self-interested agents to $\textit{explore}$ when they prefer to $\textit{exploit}$?

reinforcement-learning Reinforcement Learning (RL)

Task-Optimal Exploration in Linear Dynamical Systems

no code implementations • 10 Feb 2021 • Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

Decision Making

Learning the Linear Quadratic Regulator from Nonlinear Observations

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

Continuous Control Decoder

Making Non-Stochastic Control (Almost) as Easy as Stochastic

no code implementations • NeurIPS 2020 • Max Simchowitz

Recent literature has made much progress in understanding $\textit{online LQR}$: a modern learning-theoretic take on the classical control problem in which a learner attempts to optimally control an unknown linear dynamical system with fully observed state, perturbed by i.i.d.

Constrained episodic reinforcement learning in concave-convex and knapsack settings

1 code implementation • NeurIPS 2020 • Kianté Brantley, Miroslav Dudík, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We propose an algorithm for tabular episodic reinforcement learning with constraints.

reinforcement-learning Reinforcement Learning (RL)

Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning

1 code implementation • ICML 2020 • Esther Rolf, Max Simchowitz, Sarah Dean, Lydia T. Liu, Daniel Björkegren, Moritz Hardt, Joshua Blumenstock

Our theoretical results characterize the optimal strategies in this class, bound the Pareto errors due to inaccuracies in the scores, and show an equivalence between optimal strategies and a rich class of fairness-constrained profit-maximizing policies.

BIG-bench Machine Learning Fairness

Logarithmic Regret for Adversarial Online Control

no code implementations • 29 Feb 2020 • Dylan J. Foster, Max Simchowitz

We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances.

Reward-Free Exploration for Reinforcement Learning

no code implementations • ICML 2020 • Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

reinforcement-learning Reinforcement Learning (RL)

Naive Exploration is Optimal for Online LQR

no code implementations • ICML 2020 • Max Simchowitz, Dylan J. Foster

Our upper bound is attained by a simple variant of $\textit{certainty equivalent control}$, where the learner selects control inputs according to the optimal controller for their estimate of the system while injecting exploratory random noise.

Improper Learning for Non-Stochastic Control

no code implementations • 25 Jan 2020 • Max Simchowitz, Karan Singh, Elad Hazan

We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states, known as non-stochastic control.

Corruption-robust exploration in episodic reinforcement learning

no code implementations • 20 Nov 2019 • Thodoris Lykouris, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We initiate the study of multi-stage episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system extending recent results for the special case of stochastic bandits.

Multi-Armed Bandits reinforcement-learning +1

The gradient complexity of linear regression

no code implementations • 6 Nov 2019 • Mark Braverman, Elad Hazan, Max Simchowitz, Blake Woodworth

We investigate the computational complexity of several basic linear algebra primitives, including largest eigenvector computation and linear regression, in the computational model that allows access to the data via a matrix-vector product oracle.

regression

Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

no code implementations • NeurIPS 2019 • Max Simchowitz, Kevin Jamieson

This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs.

Learning Linear Dynamical Systems with Semi-Parametric Least Squares

1 code implementation • 2 Feb 2019 • Max Simchowitz, Ross Boczar, Benjamin Recht

We analyze a simple prefiltered variation of the least squares estimator for the problem of estimation with biased, semi-parametric noise, an error model studied more broadly in causal statistics and active learning.

Active Learning

A Successive-Elimination Approach to Adaptive Robotic Sensing

no code implementations • 27 Sep 2018 • Esther Rolf, David Fridovich-Keil, Max Simchowitz, Benjamin Recht, Claire Tomlin

We study an adaptive source seeking problem, in which a mobile robot must identify the strongest emitter(s) of a signal in an environment with background emissions.

Trajectory Planning

The implicit fairness criterion of unconstrained learning

no code implementations • 29 Aug 2018 • Lydia T. Liu, Max Simchowitz, Moritz Hardt

We show that under reasonable conditions, the deviation from satisfying group calibration is upper bounded by the excess risk of the learned score relative to the Bayes optimal score function.

BIG-bench Machine Learning Fairness

Adaptive Sampling for Convex Regression

no code implementations • 14 Aug 2018 • Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences.

regression

On the Randomized Complexity of Minimizing a Convex Quadratic Function

no code implementations • 24 Jul 2018 • Max Simchowitz

Minimizing a convex, quadratic objective of the form $f_{\mathbf{A},\mathbf{b}}(x) := \frac{1}{2}x^\top \mathbf{A} x - \langle \mathbf{b}, x \rangle$ for $\mathbf{A} \succ 0 $ is a fundamental problem in machine learning and optimization.

Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law

no code implementations • 4 Apr 2018 • Max Simchowitz, Ahmed El Alaoui, Benjamin Recht

We show that for every $\mathtt{gap} \in (0, 1/2]$, there exists a distribution over matrices $\mathbf{M}$ for which 1) $\mathrm{gap}_r(\mathbf{M}) = \Omega(\mathtt{gap})$ (where $\mathrm{gap}_r(\mathbf{M})$ is the normalized gap between the $r$-th and $(r+1)$-st largest-magnitude eigenvalues of $\mathbf{M}$), and 2) any algorithm $\mathsf{Alg}$ which takes fewer than $\mathrm{const} \times \frac{r \log d}{\sqrt{\mathtt{gap}}}$ queries fails (with overwhelming probability) to identify a matrix $\widehat{\mathsf{V}} \in \mathbb{R}^{d \times r}$ with orthonormal columns for which $\langle \widehat{\mathsf{V}}, \mathbf{M} \widehat{\mathsf{V}}\rangle \ge (1 - \mathrm{const} \times \mathtt{gap})\sum_{i=1}^r \lambda_i(\mathbf{M})$.

Delayed Impact of Fair Machine Learning

3 code implementations • ICML 2018 • Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time.

BIG-bench Machine Learning Fairness

Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification

no code implementations • 22 Feb 2018 • Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht

We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.

Time Series Time Series Analysis

Approximate Ranking from Pairwise Comparisons

no code implementations • 4 Jan 2018 • Reinhard Heckel, Max Simchowitz, Kannan Ramchandran, Martin J. Wainwright

Accordingly, we study the problem of finding approximate rankings from pairwise comparisons.

First-order Methods Almost Always Avoid Saddle Points

no code implementations • 20 Oct 2017 • Jason D. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael I. Jordan, Benjamin Recht

We establish that first-order methods avoid saddle points for almost all initializations.

On the Gap Between Strict-Saddles and True Convexity: An Omega(log d) Lower Bound for Eigenvector Approximation

no code implementations • 14 Apr 2017 • Max Simchowitz, Ahmed El Alaoui, Benjamin Recht

We prove a $\textit{query complexity}$ lower bound on rank-one principal component analysis (PCA).

The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime

no code implementations • 16 Feb 2017 • Max Simchowitz, Kevin Jamieson, Benjamin Recht

Moreover, our lower bounds zero in on the number of times each $\textit{individual}$ arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity.

Best-of-K Bandits

no code implementations • 9 Mar 2016 • Max Simchowitz, Kevin Jamieson, Benjamin Recht

This paper studies the Best-of-K Bandit game: at each time the player chooses a subset $S$ among all $\binom{N}{K}$ possible options and observes reward $\max\{X(i) : i \in S\}$, where $X$ is a random vector drawn from a joint distribution.

Gradient Descent Converges to Minimizers

no code implementations • 16 Feb 2016 • Jason D. Lee, Max Simchowitz, Michael I. Jordan, Benjamin Recht

We show that gradient descent converges to a local minimizer, almost surely with random initialization.
