no code implementations • 3 Apr 2024 • Aaron Mishkin, Mert Pilanci, Mark Schmidt
This improvement is comparable to a square root of the condition number in the worst case and addresses the criticism that guarantees for stochastic acceleration could be worse than those for SGD.
no code implementations • 2 Mar 2024 • Emi Zeger, Yifei Wang, Aaron Mishkin, Tolga Ergen, Emmanuel Candès, Mert Pilanci
We prove that training neural networks on 1-D data is equivalent to solving a convex Lasso problem with a fixed, explicitly defined dictionary matrix of features.
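To make the reduction concrete, here is a minimal numerical sketch of the Lasso view on 1-D data, assuming an illustrative dictionary of a constant, a linear term, and ReLU ramp features $\max(x-b, 0)$ with breakpoints at the training points (the paper's exact dictionary may differ), solved with FISTA:

```python
import numpy as np

def relu_dictionary(x, breakpoints):
    # Fixed dictionary: a constant, a linear term, and ReLU ramps
    # max(x - b, 0), one breakpoint per training point (illustrative choice).
    ramps = np.maximum(x[:, None] - breakpoints[None, :], 0.0)
    return np.column_stack([np.ones_like(x), x, ramps])

def lasso_fista(Phi, y, lam, steps=5000):
    # FISTA for the Lasso: min_w 0.5 * ||Phi w - y||^2 + lam * ||w||_1.
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the smooth part
    w = np.zeros(Phi.shape[1]); z = w.copy(); t = 1.0
    for _ in range(steps):
        u = z - Phi.T @ (Phi @ z - y) / L
        w_next = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = w_next + (t - 1) / t_next * (w_next - w)
        w, t = w_next, t_next
    return w

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, 30))
y = np.maximum(x - 0.2, 0.0) - 0.5 * np.maximum(x + 0.4, 0.0)  # a tiny ReLU network

Phi = relu_dictionary(x, x)
w = lasso_fista(Phi, y, lam=1e-4)
fit_err = np.max(np.abs(Phi @ w - y))
print(fit_err, int(np.sum(np.abs(w) > 1e-3)))
```

With a small regularization weight, the Lasso solution nearly interpolates the ReLU-generated target while activating only a few atoms, mirroring the sparse convex reformulation.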
no code implementations • 6 Feb 2024 • Soheil Hor, Ying Qian, Mert Pilanci, Amin Arbabian
This paper introduces the first theoretical framework for quantifying the potential efficiency and performance gains of adaptive inference algorithms.
no code implementations • 6 Feb 2024 • Sungyoon Kim, Mert Pilanci
In this paper, we study the optimality gap between two-layer ReLU networks regularized with weight decay and their convex relaxations.
1 code implementation • 4 Feb 2024 • Fangzhao Zhang, Mert Pilanci
In this work we study the enhancement of Low Rank Adaptation (LoRA) fine-tuning procedure by introducing a Riemannian preconditioner in its optimization step.
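A minimal sketch of the idea on a toy least-squares LoRA objective, assuming the preconditioner scales each factor's gradient by the damped Gram matrix of the other factor, $(A A^\top + \delta I)^{-1}$ and $(B^\top B + \delta I)^{-1}$; the damping term $\delta$, step size, and problem sizes here are illustrative, not the paper's exact recipe:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r, N = 20, 20, 2, 100
# Ground-truth low-rank update and a toy least-squares fine-tuning objective.
W_star = rng.normal(size=(d, r)) @ rng.normal(size=(r, k)) / np.sqrt(d)
X = rng.normal(size=(N, d))
Y = X @ W_star

B = rng.normal(size=(d, r)) * 0.1   # LoRA factors: the update is B @ A
A = rng.normal(size=(r, k)) * 0.1
lr, delta = 0.1, 1e-6               # delta: small damping, added for stability

def loss(B, A):
    return 0.5 * np.linalg.norm(X @ (B @ A) - Y) ** 2 / N

loss0 = loss(B, A)
for _ in range(300):
    Rres = X @ (B @ A) - Y
    gB = X.T @ Rres @ A.T / N
    gA = B.T @ (X.T @ Rres) / N
    # Precondition each factor's gradient with the damped Gram matrix
    # of the other factor.
    B = B - lr * gB @ np.linalg.inv(A @ A.T + delta * np.eye(r))
    A = A - lr * np.linalg.inv(B.T @ B + delta * np.eye(r)) @ gA
loss_final = loss(B, A)
print(loss0, loss_final)
```

The preconditioner removes the scale imbalance between the two factors, so a single step size works regardless of how the product $BA$ is split between them.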
no code implementations • 3 Feb 2024 • Fangzhao Zhang, Mert Pilanci
Diffusion models are becoming widely used in state-of-the-art image, video and audio generation.
1 code implementation • 29 Jan 2024 • Alexandros E. Tzikas, Licio Romao, Mert Pilanci, Alessandro Abate, Mykel J. Kochenderfer
Many machine learning applications require operating on a spatially distributed dataset.
1 code implementation • 19 Dec 2023 • Tolga Ergen, Mert Pilanci
We also show that all the stationary points of the nonconvex training objective can be characterized as the global optimum of a subsampled convex program.
1 code implementation • 22 Nov 2023 • Annesha Ghosh, Gordon Wetzstein, Mert Pilanci, Sara Fridovich-Keil
Off-resonance artifacts in magnetic resonance imaging (MRI) are visual distortions that occur when the actual resonant frequencies of spins within the imaging volume differ from the expected frequencies used to encode spatial information.
no code implementations • 18 Nov 2023 • Yifei Wang, Mert Pilanci
Using this convex formulation, we prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.
1 code implementation • NeurIPS 2023 • Rajarshi Saha, Varun Srivastava, Mert Pilanci
We propose an algorithm that exploits this structure to obtain a low rank decomposition of any matrix $\mathbf{A}$ as $\mathbf{A} \approx \mathbf{L}\mathbf{R}$, where $\mathbf{L}$ and $\mathbf{R}$ are the low rank factors.
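The paper's algorithm additionally exploits low-precision structure; as a plain illustration of obtaining $\mathbf{A} \approx \mathbf{L}\mathbf{R}$, here is a generic randomized rangefinder (Gaussian sketch, QR, project), with the oversampling parameter an illustrative choice:

```python
import numpy as np

def randomized_low_rank(A, r, oversample=5, seed=0):
    # Randomized rangefinder: sketch the column space with a Gaussian test
    # matrix, orthonormalize, then project: A ≈ L @ R with L = Q, R = Q.T A.
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(A.shape[1], r + oversample))
    Q, _ = np.linalg.qr(A @ G)
    return Q, Q.T @ A

rng = np.random.default_rng(1)
A = rng.normal(size=(80, 3)) @ rng.normal(size=(3, 60))   # exactly rank 3
L, R = randomized_low_rank(A, r=3)
err = np.linalg.norm(A - L @ R) / np.linalg.norm(A)
print(L.shape, R.shape, err)
```

For an exactly rank-$r$ matrix the sketch captures the column space almost surely, so the relative error is essentially zero; for approximately low-rank matrices the oversampling controls the excess error.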
no code implementations • 28 Sep 2023 • Mert Pilanci
In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization.
no code implementations • 1 Sep 2023 • Burak Bartan, Mert Pilanci
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations.
no code implementations • 8 Aug 2023 • Neophytos Charalambides, Hessam Mahdavifar, Mert Pilanci, Alfred O. Hero III
Linear regression is a fundamental and primitive problem in supervised machine learning, with applications ranging from epidemiology to finance.
no code implementations • 6 Aug 2023 • Neophytos Charalambides, Mert Pilanci, Alfred Hero
This is then used to derive an approximate coded computing approach for first-order methods, known as gradient coding, to accelerate linear regression in the presence of failures (stragglers) in distributed computational networks.
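The paper develops an approximate gradient coding scheme; as a minimal illustration of the exact variant of the general idea, here is a toy fractional-repetition setup (worker counts and group sizes are illustrative), where each gradient partition is replicated so the full gradient remains decodable despite stragglers:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = rng.normal(size=40)
w = rng.normal(size=5)

def partial_grad(idx):
    # Least-squares gradient contribution of one data partition.
    return X[idx].T @ (X[idx] @ w - y[idx])

# Fractional-repetition gradient coding: 4 workers in 2 groups of 2; both
# workers in a group send the same sum of their group's partitions, so one
# straggler per group can be tolerated.
parts = np.array_split(np.arange(40), 4)
assignment = {0: [0, 1], 1: [0, 1], 2: [2, 3], 3: [2, 3]}
messages = {wk: sum(partial_grad(parts[p]) for p in ps)
            for wk, ps in assignment.items()}

# Suppose workers 1 and 2 straggle: the master decodes from workers 0 and 3.
decoded = messages[0] + messages[3]
full_grad = X.T @ (X @ w - y)
coding_err = np.linalg.norm(decoded - full_grad)
print(coding_err)
```

The redundancy buys straggler tolerance at the cost of duplicated computation; approximate schemes like the paper's reduce that cost by settling for a gradient estimate instead of exact recovery.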
1 code implementation • 31 May 2023 • Aaron Mishkin, Mert Pilanci
We show that the global optima of the convex parameterization are given by a polyhedral set and then extend this characterization to the optimal set of the non-convex training objective.
no code implementations • 10 May 2023 • Julio A. Oscanoa, Frank Ong, Siddharth S. Iyer, Zhitao Li, Christopher M. Sandino, Batu Ozturkler, Daniel B. Ennis, Mert Pilanci, Shreyas S. Vasanawala
Results: First, we performed ablation experiments to validate the sketching matrix design on both Cartesian and non-Cartesian datasets.
no code implementations • 6 Mar 2023 • Tolga Ergen, Halil Ibrahim Gulluk, Jonathan Lacotte, Mert Pilanci
We first show that regularized deep threshold network training problems can be equivalently formulated as a standard convex optimization problem, which parallels the LASSO method, provided that the last hidden layer width exceeds a certain threshold.
no code implementations • 27 Feb 2023 • Les Atlas, Nicholas Rasmussen, Felix Schwock, Mert Pilanci
For many machine learning applications, a common input representation is a spectrogram.
1 code implementation • 30 Sep 2022 • Yifei Wang, Yixuan Hua, Emmanuel Candès, Mert Pilanci
For randomly generated data, we show the existence of a phase transition in recovering planted neural network models, which is easy to describe: whenever the ratio between the number of samples and the dimension exceeds a numerical threshold, the recovery succeeds with high probability; otherwise, it fails with high probability.
1 code implementation • 18 Jul 2022 • Batu Ozturkler, Arda Sahiner, Tolga Ergen, Arjun D Desai, Christopher M Sandino, Shreyas Vasanawala, John M Pauly, Morteza Mardani, Mert Pilanci
However, they require several iterations of a large neural network to handle high-dimensional imaging tasks such as 3D MRI.
1 code implementation • 26 May 2022 • Yifei Wang, Peng Chen, Mert Pilanci, Wuchen Li
We study the variational problem in the family of two-layer networks with squared-ReLU activations, for which we derive a semi-definite programming (SDP) relaxation.
no code implementations • 17 May 2022 • Arda Sahiner, Tolga Ergen, Batu Ozturkler, John Pauly, Morteza Mardani, Mert Pilanci
Vision transformers using self-attention or its proposed alternatives have demonstrated promising results in many image related tasks.
1 code implementation • 21 Apr 2022 • Beliz Gunel, Arda Sahiner, Arjun D. Desai, Akshay S. Chaudhari, Shreyas Vasanawala, Mert Pilanci, John Pauly
Unrolled neural networks have enabled state-of-the-art reconstruction performance and fast inference times for the accelerated magnetic resonance imaging (MRI) reconstruction task.
no code implementations • 18 Mar 2022 • Tavor Z. Baharav, Gary Cheng, Mert Pilanci, David Tse
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
no code implementations • 18 Mar 2022 • Burak Bartan, Mert Pilanci
Furthermore, we develop unbiased parameter averaging methods for randomized second order optimization for regularized problems that employ sketching of the Hessian.
no code implementations • 23 Feb 2022 • Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith
We derive an information-theoretic lower bound for the minimax risk under this setting and propose a matching upper bound using randomized embedding-based algorithms which is tight up to constant factors.
1 code implementation • 2 Feb 2022 • Aaron Mishkin, Arda Sahiner, Mert Pilanci
We develop fast algorithms and robust software for convex optimization of two-layer neural networks with ReLU activation functions.
1 code implementation • 21 Jan 2022 • Richard Swartzbaugh, Amil Khanzada, Praveen Govindan, Mert Pilanci, Ayomide Owoyemi, Les Atlas, Hugo Estrada, Richard Nall, Michael Lotito, Rich Falcone, Jennifer Ranjani J
The COVID-19 pandemic has been a scourge upon humanity, claiming the lives of more than 5.1 million people worldwide; the global economy contracted by 3.5% in 2020.
no code implementations • 5 Jan 2022 • Esin Darici Haritaoglu, Nicholas Rasmussen, Daniel C. H. Tan, Jennifer Ranjani J., Jaclyn Xiao, Gunvant Chaudhari, Akanksha Rajput, Praveen Govindan, Christian Canham, Wei Chen, Minami Yamaura, Laura Gomezjurado, Aaron Broukhim, Amil Khanzada, Mert Pilanci
The Covid-19 pandemic has been one of the most devastating events in recent history, claiming the lives of more than 5 million people worldwide.
no code implementations • NeurIPS Workshop Deep_Invers 2021 • Batu Ozturkler, Arda Sahiner, Tolga Ergen, Arjun D Desai, John M. Pauly, Shreyas Vasanawala, Morteza Mardani, Mert Pilanci
Model-based deep learning approaches have recently shown state-of-the-art performance for accelerated MRI reconstruction.
no code implementations • ICLR 2022 • Yifei Wang, Mert Pilanci
We then show that the limit points of non-convex subgradient flows can be identified via primal-dual correspondence in this convex optimization problem.
no code implementations • 13 Oct 2021 • Yifei Wang, Tolga Ergen, Mert Pilanci
Recent work has proven that strong duality (i.e., a zero duality gap) holds for regularized finite-width two-layer ReLU networks, and consequently provides an equivalent convex training problem.
no code implementations • 11 Oct 2021 • Tolga Ergen, Mert Pilanci
We first show that the training of multiple three-layer ReLU sub-networks with weight decay regularization can be equivalently cast as a convex optimization problem in a higher dimensional space, where sparsity is enforced via a group $\ell_1$-norm regularization.
no code implementations • ICLR 2022 • Yifei Wang, Jonathan Lacotte, Mert Pilanci
As additional consequences of our convex perspective, (i) we establish that Clarke stationary points found by stochastic gradient descent correspond to the global optimum of a subsampled convex problem; (ii) we provide a polynomial-time algorithm for checking whether a neural network is a global minimum of the training loss; (iii) we provide an explicit construction of a continuous path between any neural network and the global minimum of its sublevel set; and (iv) we characterize the minimal size of the hidden layer so that the neural network optimization landscape has no spurious valleys.
1 code implementation • NeurIPS 2021 • Michał Dereziński, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
1 code implementation • ICLR 2022 • Arda Sahiner, Tolga Ergen, Batu Ozturkler, Burak Bartan, John Pauly, Morteza Mardani, Mert Pilanci
In this work, we analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality, and for a variety of generators expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches, or can be represented as convex-concave games.
no code implementations • 15 May 2021 • Jonathan Lacotte, Yifei Wang, Mert Pilanci
Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix.
no code implementations • 4 May 2021 • Burak Bartan, Mert Pilanci
Neural networks (NNs) have been extremely successful across many tasks in machine learning.
1 code implementation • 29 Apr 2021 • Jonathan Lacotte, Mert Pilanci
We propose an adaptive mechanism to control the sketch size according to the progress made in each step of the iterative solver.
1 code implementation • 13 Mar 2021 • Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith
As a consequence, quantizing these embeddings followed by an inverse transform to the original space yields a source coding method with optimal covering efficiency while utilizing just $R$-bits per dimension.
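A simplified sketch of the pipeline described above, assuming a random orthogonal rotation in place of the paper's specific embedding and a plain uniform scalar quantizer at $R$ bits per dimension:

```python
import numpy as np

rng = np.random.default_rng(0)
d, R = 64, 4                              # dimension and bit budget per dimension
x = rng.normal(size=d)
x /= np.linalg.norm(x)                    # unit-norm source vector

# A random rotation spreads the vector's energy evenly over the coordinates,
# after which a plain uniform scalar quantizer with R bits/dim suffices.
Qrot, _ = np.linalg.qr(rng.normal(size=(d, d)))
z = Qrot @ x                              # rotated coordinates, each |z_i| <= 1
levels, lo, hi = 2 ** R, -1.0, 1.0
step = (hi - lo) / levels
idx = np.floor((np.clip(z, lo, hi - 1e-12) - lo) / step)
z_hat = lo + (idx + 0.5) * step           # mid-point of each quantization bin
x_hat = Qrot.T @ z_hat                    # decode via the inverse rotation
err = np.linalg.norm(x - x_hat)
print(err)                                # bounded by sqrt(d) * step / 2
```

The per-coordinate error is at most half a bin width, so the reconstruction error shrinks geometrically in the bit budget $R$.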
no code implementations • ICLR 2022 • Tolga Ergen, Arda Sahiner, Batu Ozturkler, John Pauly, Morteza Mardani, Mert Pilanci
Batch Normalization (BN) is a commonly used technique to accelerate and stabilize training of deep neural networks.
no code implementations • 5 Feb 2021 • Li Liu, Xianghao Zhan, Rumeng Wu, Xiaoqing Guan, Zhan Wang, Wei Zhang, Mert Pilanci, You Wang, Zhiyuan Luo, Guang Li
Furthermore, this study provides a systematic analysis of different augmentation strategies.
no code implementations • 7 Jan 2021 • Burak Bartan, Mert Pilanci
In this paper, we develop exact convex optimization formulations for two-layer neural networks with second degree polynomial activations based on semidefinite programming.
no code implementations • ICLR 2021 • Arda Sahiner, Tolga Ergen, John Pauly, Mert Pilanci
We describe the convex semi-infinite dual of the two-layer vector-output ReLU neural network training problem.
no code implementations • 13 Dec 2020 • Jonathan Lacotte, Mert Pilanci
We propose novel randomized optimization methods for high-dimensional convex problems based on restrictions of variables to random subspaces.
no code implementations • ICLR 2021 • Arda Sahiner, Morteza Mardani, Batu Ozturkler, Mert Pilanci, John Pauly
Neural networks have shown tremendous potential for reconstructing high-resolution images in inverse problems.
no code implementations • NeurIPS 2020 • Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci
These results show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.
no code implementations • 26 Oct 2020 • Lawrence H. Kim, Rahul Goel, Jia Liang, Mert Pilanci, Pablo E. Paredes
This work demonstrates that the damping frequency and damping ratio from LPC are significantly correlated with those from an MSD model, thus confirming the validity of using LPC to infer muscle stiffness and damping.
no code implementations • NeurIPS 2020 • Michał Dereziński, Burak Bartan, Mert Pilanci, Michael W. Mahoney
In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data.
no code implementations • ICLR 2021 • Tolga Ergen, Mert Pilanci
We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with a polynomial complexity with respect to the number of data samples, the number of neurons, and data dimension.
no code implementations • 15 Jun 2020 • Srivatsan Sridhar, Mert Pilanci, Ayfer Özgür
An upper bound on the expected error of this estimator is derived, which is smaller than the error of the classical Gaussian sketch solution for any given data.
no code implementations • 10 Jun 2020 • Yifei Wang, Jonathan Lacotte, Mert Pilanci
As additional consequences of our convex perspective, (i) we establish that Clarke stationary points found by stochastic gradient descent correspond to the global optimum of a subsampled convex problem; (ii) we provide a polynomial-time algorithm for checking whether a neural network is a global minimum of the training loss; (iii) we provide an explicit construction of a continuous path between any neural network and the global minimum of its sublevel set; and (iv) we characterize the minimal size of the hidden layer so that the neural network optimization landscape has no spurious valleys.
no code implementations • NeurIPS 2020 • Jonathan Lacotte, Mert Pilanci
Our method starts with an initial embedding dimension of 1 and increases it over the iterations, up to at most the effective dimension of the problem.
no code implementations • 21 May 2020 • Surin Ahn, Ayfer Ozgur, Mert Pilanci
In the domains of dataset construction and crowdsourcing, a notable challenge is to aggregate labels from a heterogeneous set of labelers, each of whom is potentially an expert in some subset of tasks (and less reliable in others).
no code implementations • 25 Feb 2020 • Tolga Ergen, Mert Pilanci
Our analysis also shows that the optimal network parameters can be characterized by interpretable closed-form formulas in some practically relevant special cases.
no code implementations • 25 Feb 2020 • Elaina Chai, Mert Pilanci, Boris Murmann
Batch Normalization (BatchNorm) is commonly used in Convolutional Neural Networks (CNNs) to improve training speed and stability.
no code implementations • ICML 2020 • Mert Pilanci, Tolga Ergen
We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs) in terms of a single convex program with number of variables polynomial in the number of training samples and the number of hidden neurons.
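The convex program in question enumerates the finitely many ReLU activation patterns $D_i = \mathrm{diag}(\mathbb{1}[Xg \ge 0])$. A minimal sketch, using the unconstrained "gated ReLU" relaxation (plain least squares over the masked feature blocks, omitting the cone constraints and group regularizer of the full formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 2
X = rng.normal(size=(n, d))
# Target generated by a small planted ReLU network.
y = (np.maximum(X @ np.array([1.0, -0.5]), 0)
     - 0.3 * np.maximum(X @ np.array([0.2, 1.0]), 0))

# Enumerate the finitely many ReLU activation patterns 1[X g >= 0] by
# sampling random gate vectors g; in 2-D there are at most 2n of them.
patterns = {tuple((X @ g >= 0).astype(int)) for g in rng.normal(size=(2000, d))}
masks = np.array(sorted(patterns), dtype=float)       # one 0/1 mask per pattern

# Gated-ReLU relaxation of the convex program: least squares over the
# concatenated masked feature blocks [D_i X], one weight vector per pattern.
F = np.concatenate([m[:, None] * X for m in masks], axis=1)
v, *_ = np.linalg.lstsq(F, y, rcond=None)
res = np.linalg.norm(F @ v - y)
print(len(masks), res)
```

Once the planted gates' activation patterns appear among the samples, the planted network is expressible in this basis and the residual drops to numerical zero, illustrating why the number of variables is polynomial in the number of samples for fixed dimension.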
no code implementations • 22 Feb 2020 • Tolga Ergen, Mert Pilanci
We show that a set of optimal hidden layer weights for a norm regularized DNN training problem can be explicitly found as the extreme points of a convex set.
no code implementations • ICML 2020 • Jonathan Lacotte, Mert Pilanci
Then, we propose a new algorithm by optimizing the computational complexity over the choice of the sketching dimension.
no code implementations • 16 Feb 2020 • Burak Bartan, Mert Pilanci
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a significant bottleneck.
no code implementations • 16 Feb 2020 • Burak Bartan, Mert Pilanci
In this work, we study distributed sketching methods for large scale regression problems.
no code implementations • 6 Feb 2020 • Alexandre d'Aspremont, Mert Pilanci
The classical Frank-Wolfe algorithm then converges at rate $O(1/T)$, where $T$ is both the number of neurons and the number of calls to the oracle.
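As a reminder of the rate being invoked, here is classical Frank-Wolfe on a toy simplex-constrained quadratic (problem data is illustrative), tracking the duality gap whose minimum decays at the $O(1/T)$ rate:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
M = rng.normal(size=(n, n))
Q = M.T @ M / n                      # PSD quadratic: f(x) = 0.5 x'Qx - b'x
b = rng.normal(size=n)
grad = lambda x: Q @ x - b

x = np.ones(n) / n                   # start at the simplex barycenter
gaps = []
for t in range(1, 501):
    g = grad(x)
    s = np.zeros(n)
    s[np.argmin(g)] = 1.0            # linear minimization oracle over the simplex
    gaps.append(float(g @ (x - s)))  # duality gap: an upper bound on f(x) - f*
    x = x + 2.0 / (t + 2) * (s - x)  # classical step size 2/(t+2)
print(gaps[-1])
```

Each oracle call returns a single vertex of the feasible set, which is what lets the iterate be written as a sparse combination of atoms, one per iteration.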
no code implementations • 3 Feb 2020 • Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci
These results show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.
no code implementations • 7 Dec 2019 • Ibrahim Kurban Ozaslan, Mert Pilanci, Orhan Arikan
In this article, Momentum Iterative Hessian Sketch (M-IHS) techniques, a group of solvers for large scale linear Least Squares (LS) problems, are proposed and analyzed in detail.
no code implementations • 13 Jul 2019 • Burak Bartan, Mert Pilanci
We introduce a novel distributed derivative-free optimization framework that is resilient to stragglers.
no code implementations • NeurIPS 2019 • Jonathan Lacotte, Mert Pilanci, Marco Pavone
We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces.
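A minimal sketch of the random-subspace idea on a strongly convex quadratic: each iteration draws a random $k$-dimensional subspace and minimizes the objective restricted to it (exact subspace minimization is an illustrative choice; the paper's method and step rules may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 100, 10
M = rng.normal(size=(d, d))
Q = M.T @ M / d + np.eye(d)          # strongly convex quadratic f(x) = 0.5 x'Qx - b'x
b = rng.normal(size=d)
f = lambda x: 0.5 * x @ Q @ x - b @ x
fstar = f(np.linalg.solve(Q, b))     # optimal value, for monitoring only

x = np.zeros(d)
for _ in range(1000):
    P = rng.normal(size=(d, k))      # basis of a random k-dimensional subspace
    g = Q @ x - b
    # Minimize the quadratic exactly over the affine slice x + range(P):
    # a small k-by-k linear system instead of a d-by-d one.
    s = np.linalg.solve(P.T @ Q @ P, -(P.T @ g))
    x = x + P @ s
gap = f(x) - fstar
print(gap)
```

Each step only touches a $k \times k$ system, so the per-iteration cost is small; like coordinate descent, the random directions still drive the full-dimensional iterate to the optimum.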
no code implementations • 21 Jan 2019 • Burak Bartan, Mert Pilanci
We propose a serverless computing mechanism for distributed computation based on polar codes.
no code implementations • 31 Dec 2018 • Burak Bartan, Mert Pilanci
We propose convex relaxations for convolutional neural nets with one hidden layer where the output weights are fixed.
no code implementations • 9 May 2015 • Mert Pilanci, Martin J. Wainwright
We also describe extensions of our methods to programs involving convex constraints that are equipped with self-concordant barriers.
no code implementations • 25 Jan 2015 • Yun Yang, Mert Pilanci, Martin J. Wainwright
Kernel ridge regression (KRR) is a standard method for performing non-parametric regression over reproducing kernel Hilbert spaces.
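For reference, KRR amounts to solving one regularized linear system in the kernel matrix; a few lines of NumPy with an RBF kernel (kernel choice and hyperparameters are illustrative):

```python
import numpy as np

def rbf(A, B, gamma):
    # Gaussian (RBF) kernel matrix between row sets A and B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def krr_fit(Xtr, ytr, lam, gamma):
    # Closed-form KRR coefficients: alpha = (K + n*lam*I)^{-1} y.
    n = len(Xtr)
    K = rbf(Xtr, Xtr, gamma)
    return np.linalg.solve(K + n * lam * np.eye(n), ytr)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 1))
y = np.sin(2 * X[:, 0])

alpha = krr_fit(X, y, lam=1e-6, gamma=1.0)
yhat = rbf(X, X, 1.0) @ alpha            # predictions at the training points
fit_err = np.max(np.abs(yhat - y))
print(fit_err)
```

The $O(n^3)$ solve is exactly the bottleneck that sketching-based approximations of KRR aim to remove.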
no code implementations • 3 Nov 2014 • Mert Pilanci, Martin J. Wainwright
We study randomized sketching methods for approximately solving least-squares problems with a general convex constraint.
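The unconstrained special case of sketch-and-solve is easy to illustrate: compress the rows with a random sketch $S$ and solve the much smaller problem (the Gaussian sketch and problem sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 20, 200
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)     # exact solution

# Sketch-and-solve: compress the n rows down to m << n with a Gaussian
# sketch S, then solve the smaller problem min_x ||S A x - S b||.
S = rng.normal(size=(m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

rel = np.linalg.norm(x_sketch - x_star) / np.linalg.norm(x_star)
print(rel)
```

With a sketch size of a few multiples of $d$, the sketched solution is close to the exact one while the solve touches a $200 \times 20$ system instead of $2000 \times 20$; handling a convex constraint simply means imposing it on the sketched problem as well.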
no code implementations • 29 Apr 2014 • Mert Pilanci, Martin J. Wainwright
We analyze RP-based approximations of convex programs, in which the original optimization problem is approximated by the solution of a lower-dimensional problem.
no code implementations • NeurIPS 2012 • Mert Pilanci, Laurent E. Ghaoui, Venkat Chandrasekaran
We propose a direct relaxation of the minimum cardinality problem and show that it can be efficiently solved using convex programming.