no code implementations • 16 Oct 2023 • Zihao Chen, Yeshwanth Cherapanamjeri
We investigate the quantitative performance of affine-equivariant estimators for robust mean estimation.
1 code implementation • 27 May 2023 • Aliyah R. Hsu, Yeshwanth Cherapanamjeri, Briton Park, Tristan Naumann, Anobel Y. Odisho, Bin Yu
These findings showcase the utility of SUFO in enhancing trust and safety when using transformers in medicine, and we believe SUFO can aid practitioners in evaluating fine-tuned language models for other applications in medicine and in more critical domains.
no code implementations • 18 Apr 2023 • Ishaq Aden-Ali, Yeshwanth Cherapanamjeri, Abhishek Shetty, Nikita Zhivotovskiy
In this paper, we address this issue by providing optimal high probability risk bounds through a framework that surpasses the limitations of uniform convergence arguments.
no code implementations • 19 Dec 2022 • Ishaq Aden-Ali, Yeshwanth Cherapanamjeri, Abhishek Shetty, Nikita Zhivotovskiy
In one of the first COLT open problems, Warmuth conjectured that this prediction strategy always implies an optimal high probability bound on the risk, and hence is also an optimal PAC algorithm.
no code implementations • 6 May 2022 • Yeshwanth Cherapanamjeri, Constantinos Daskalakis, Andrew Ilyas, Manolis Zampetakis
In known-index self-selection, the identity of the observed model output is observable; in unknown-index self-selection, it is not.
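As a toy illustration of the two observation models, the sketch below generates self-selected data in both regimes; the maximum-based selection rule over $k$ linear models is an assumption made purely for illustration, and all names and parameters are hypothetical.

```python
# Toy simulation of self-selection among k linear models. The
# maximum-based selection rule is an illustrative assumption, not
# necessarily the paper's exact setting.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10_000, 5, 3
W = rng.normal(size=(k, d))              # unknown model coefficients
X = rng.normal(size=(n, d))              # covariates
Y_all = X @ W.T + rng.normal(scale=0.1, size=(n, k))

j_star = Y_all.argmax(axis=1)            # which model's output is observed
y_obs = Y_all[np.arange(n), j_star]      # only the selected output is seen

# Known-index self-selection: the learner observes (X, y_obs, j_star).
# Unknown-index self-selection: the learner observes only (X, y_obs).
```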
no code implementations • 4 May 2022 • Yeshwanth Cherapanamjeri, Constantinos Daskalakis, Andrew Ilyas, Manolis Zampetakis
We provide efficient estimation methods for first- and second-price auctions under independent (asymmetric) private values and partial observability.
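A minimal simulation of the second-price setting under partial observability; the exponential value distributions and all parameters here are assumptions made for the sketch, not taken from the paper.

```python
# Second-price auctions with asymmetric independent private values.
# Only the winner's identity and the transaction price (the
# second-highest bid) are recorded -- the partial-observability regime.
import numpy as np

rng = np.random.default_rng(1)
n_auctions, n_bidders = 5_000, 4
rates = np.array([1.0, 1.5, 2.0, 2.5])   # illustrative, asymmetric
values = rng.exponential(1.0 / rates, size=(n_auctions, n_bidders))

# In a second-price auction, bidding one's value is weakly dominant.
order = np.argsort(values, axis=1)
winner = order[:, -1]                                  # highest bidder
price = values[np.arange(n_auctions), order[:, -2]]    # second-highest bid

# Estimation task: recover each bidder's value distribution from
# the observed (winner, price) pairs alone.
```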
no code implementations • 3 Mar 2022 • Yeshwanth Cherapanamjeri, Jelani Nelson
We use our inequality to then derive improved guarantees for two applications in the high-dimensional regime: 1) kernel approximation and 2) distance estimation.
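For the distance-estimation application, the sketch below shows the classical Johnson-Lindenstrauss baseline that such moment inequalities sharpen; the constant in the target dimension is an illustrative choice, not a bound from the paper.

```python
# Distance estimation via a Gaussian random projection (the classical
# Johnson-Lindenstrauss baseline). Constants here are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n, d, eps = 500, 5_000, 0.25
m = int(np.ceil(8 * np.log(n) / eps**2))  # target dimension (constant assumed)

X = rng.normal(size=(n, d))
G = rng.normal(size=(d, m)) / np.sqrt(m)  # projection matrix
Y = X @ G

# Pairwise distances are preserved up to a (1 +/- eps) factor w.h.p.
i, j = 0, 1
print(np.linalg.norm(X[i] - X[j]), np.linalg.norm(Y[i] - Y[j]))
```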
no code implementations • 17 Oct 2021 • Yeshwanth Cherapanamjeri, Jelani Nelson
When $X$ and $Y$ are both Euclidean metrics with $Y$ being $m$-dimensional, Narayanan and Nelson (2019), following work of Mahabadi, Makarychev, Makarychev, and Razenshteyn (2018), recently showed that distortion $1+\epsilon$ is achievable via such a terminal embedding with $m = O(\epsilon^{-2}\log n)$ for $n := |T|$.
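For a concrete sense of scale (the constants hidden in the $O(\cdot)$ are unspecified, so the figure is only indicative): with $n = 10^6$ terminals and $\epsilon = 0.1$, the terminal dimension is $m = O(\epsilon^{-2}\log n) = O(100 \cdot \ln 10^6)$, i.e., on the order of a thousand coordinates, independent of the ambient dimension, and the $1+\epsilon$ guarantee holds between every point of the space and every terminal, not just among the terminals themselves.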
no code implementations • NeurIPS 2021 • Peter L. Bartlett, Sébastien Bubeck, Yeshwanth Cherapanamjeri
We consider the phenomenon of adversarial examples in ReLU networks with independent Gaussian parameters.
no code implementations • NeurIPS 2021 • Sébastien Bubeck, Yeshwanth Cherapanamjeri, Gauthier Gidel, Rémi Tachet des Combes
Daniely and Schacham recently showed that gradient descent finds adversarial examples on random undercomplete two-layer ReLU neural networks.
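A minimal numpy sketch of the phenomenon, using a crude subgradient ascent with illustrative widths and step sizes (this demonstrates the effect; it is not Daniely and Schacham's analysis):

```python
# Gradient ascent on a random undercomplete two-layer ReLU network
# typically finds a small perturbation flipping the output's sign.
import numpy as np

rng = np.random.default_rng(3)
d, k = 200, 50                             # input dim > hidden width (undercomplete)
W = rng.normal(size=(k, d)) / np.sqrt(d)
a = rng.normal(size=k) / np.sqrt(k)

f = lambda x: a @ np.maximum(W @ x, 0.0)
grad_f = lambda x: (a * (W @ x > 0)) @ W   # subgradient of the network

x = rng.normal(size=d)
target = -np.sign(f(x))                    # push toward the opposite sign
x_adv = x.copy()
for _ in range(200):
    x_adv += 0.05 * target * grad_f(x_adv)
    if np.sign(f(x_adv)) == target:
        break

print(f(x), f(x_adv), np.linalg.norm(x_adv - x) / np.linalg.norm(x))
```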
no code implementations • 24 Nov 2020 • Yeshwanth Cherapanamjeri, Nilesh Tripuraneni, Peter L. Bartlett, Michael I. Jordan
Concretely, given a sample $\mathbf{X} = \{X_i\}_{i = 1}^n$ from a distribution $\mathcal{D}$ over $\mathbb{R}^d$ with mean $\mu$ which satisfies the following \emph{weak-moment} assumption for some ${\alpha \in [0, 1]}$: \begin{equation*} \forall \|v\| = 1: \mathbb{E}_{X \sim \mathcal{D}}[\lvert \langle X - \mu, v\rangle \rvert^{1 + \alpha}] \leq 1, \end{equation*} and given a target failure probability $\delta$, our goal is to design an estimator which attains the smallest possible confidence interval as a function of $n, d, \delta$.
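For context, here is a minimal sketch of the classical coordinatewise median-of-means baseline for heavy-tailed mean estimation. It is emphatically not the paper's estimator (which attains the optimal confidence interval under the weak-moment assumption above), and the bucket count is an illustrative choice.

```python
# Classical median-of-means baseline for heavy-tailed mean estimation.
import numpy as np

def median_of_means(X, delta, rng):
    """Coordinatewise median of bucket means; X has shape (n, d)."""
    n = len(X)
    k = max(1, int(np.ceil(8 * np.log(1.0 / delta))))  # bucket count (assumed constant)
    buckets = np.array_split(X[rng.permutation(n)], k)
    means = np.stack([b.mean(axis=0) for b in buckets])
    return np.median(means, axis=0)

rng = np.random.default_rng(4)
X = rng.standard_t(df=2.5, size=(10_000, 20))  # heavy-tailed samples, true mean 0
print(np.abs(median_of_means(X, 1e-3, rng)).max())
```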
no code implementations • NeurIPS 2020 • Yeshwanth Cherapanamjeri, Jelani Nelson
Our memory consumption is $\tilde O((n+d)d/\epsilon^2)$, slightly more than the $O(nd)$ required to store $X$ in memory explicitly, but with the benefit that our time to answer queries is only $\tilde O(\epsilon^{-2}(n + d))$, much faster than the naive $\Theta(nd)$ time obtained from a linear scan when $n$ and $d$ are very large.
no code implementations • 16 Jul 2020 • Yeshwanth Cherapanamjeri, Efe Aras, Nilesh Tripuraneni, Michael. I. Jordan, Nicolas Flammarion, Peter L. Bartlett
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$ (with $X \in \mathbb{R}^d$ and $\epsilon$ independent), in which an $\eta$ fraction of the samples have been adversarially corrupted.
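The generative model itself is easy to simulate. In the sketch below the "adversary" is a crude stand-in that plants large responses, whereas the paper permits arbitrary corruptions of an $\eta$ fraction of the samples.

```python
# Robust linear regression data: Y = <X, w*> + eps, with an eta
# fraction of responses adversarially corrupted (crudely, here).
import numpy as np

rng = np.random.default_rng(5)
n, d, eta = 2_000, 50, 0.1
w_star = rng.normal(size=d) / np.sqrt(d)

X = rng.normal(size=(n, d))
y = X @ w_star + rng.normal(scale=0.5, size=n)

corrupt = rng.choice(n, size=int(eta * n), replace=False)
y[corrupt] = 100.0                       # planted outliers

# Ordinary least squares is badly skewed by the corruptions:
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.linalg.norm(w_ols - w_star))
```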
no code implementations • 6 Feb 2019 • Yeshwanth Cherapanamjeri, Peter L. Bartlett
We study the problem of identity testing of Markov chains.
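As a toy version of the setup, one can compare empirical transition frequencies from a single trajectory against a reference transition matrix; this naive plug-in check only illustrates the problem and is not the paper's test.

```python
# Identity testing for Markov chains, toy version: estimate the
# transition matrix from one trajectory and compare to a reference P.
import numpy as np

rng = np.random.default_rng(6)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])               # reference chain

def sample_trajectory(P, T, s0=0):
    traj = [s0]
    for _ in range(T - 1):
        traj.append(rng.choice(len(P), p=P[traj[-1]]))
    return np.array(traj)

traj = sample_trajectory(P, T=20_000)
counts = np.zeros_like(P)
np.add.at(counts, (traj[:-1], traj[1:]), 1.0)
P_hat = counts / counts.sum(axis=1, keepdims=True)
print(np.abs(P_hat - P).max())           # small if the trajectory came from P
```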
1 code implementation • 6 Feb 2019 • Yeshwanth Cherapanamjeri, Nicolas Flammarion, Peter L. Bartlett
We propose an estimator for the mean of a random vector in $\mathbb{R}^d$ that can be computed in time $O(n^4+n^2d)$ for $n$ i.i.d. samples and that has error bounds matching the sub-Gaussian case.
no code implementations • ICML 2017 • Yeshwanth Cherapanamjeri, Kartik Gupta, Prateek Jain
Finally, an application of our result to the robust PCA problem (low-rank + sparse matrix separation) leads to a nearly linear time (in the matrix dimensions) algorithm for that problem; existing state-of-the-art methods require quadratic time.
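A generic alternating-projections sketch of the low-rank + sparse separation task: hard-threshold the residual to estimate the sparse part, then project back onto rank-$r$ matrices via a truncated SVD. The threshold rule is ad hoc, and this is not the paper's nearly-linear-time algorithm.

```python
# Low-rank + sparse separation by alternating hard thresholding and
# truncated SVD (a generic sketch, not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(7)
d, r = 100, 3
L_true = rng.normal(size=(d, r)) @ rng.normal(size=(r, d))   # rank r
S_true = np.zeros((d, d))
mask = rng.random((d, d)) < 0.05
S_true[mask] = rng.normal(scale=10.0, size=mask.sum())       # sparse outliers
M = L_true + S_true

L = np.zeros_like(M)
for _ in range(30):
    R = M - L
    tau = 3.0 * np.median(np.abs(R))           # ad hoc threshold
    S = np.where(np.abs(R) > tau, R, 0.0)      # sparse estimate
    U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
    L = (U[:, :r] * s[:r]) @ Vt[:r]            # rank-r projection

print(np.linalg.norm(L - L_true) / np.linalg.norm(L_true))
```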
no code implementations • 18 Feb 2017 • Yeshwanth Cherapanamjeri, Prateek Jain, Praneeth Netrapalli
That is, given a data matrix $M^*$, where $(1-\alpha)$ fraction of the points are noisy samples from a low-dimensional subspace while $\alpha$ fraction of the points can be arbitrary outliers, the goal is to recover the subspace accurately.