1 code implementation • 16 Oct 2022 • Nilesh Gupta, Patrick H. Chen, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon
A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search.
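The snippet above describes the standard recipe of arranging labels into a shallow tree index and searching it with beam search. Below is a minimal sketch of that search loop; the toy tree, dot-product scorer, and beam width are illustrative assumptions, not this paper's actual model.

```python
import numpy as np

def beam_search_labels(x, tree, node_weights, beam_width=2):
    """Beam search through a shallow label tree.

    x            : feature vector for one instance
    tree         : dict mapping node id -> list of child node ids
                   (leaves map to an empty list and stand for labels)
    node_weights : dict mapping node id -> weight vector; the score of a
                   node is a plain dot product here, standing in for the
                   learned matcher a real model would use
    """
    beam = [(0.0, "root")]                       # (cumulative score, node id)
    while any(tree[node] for _, node in beam):   # stop once the beam is all leaves
        candidates = []
        for score, node in beam:
            children = tree[node]
            if not children:                     # already a leaf: carry it forward
                candidates.append((score, node))
                continue
            for child in children:
                candidates.append((score + float(x @ node_weights[child]), child))
        beam = sorted(candidates, reverse=True)[:beam_width]  # keep best partial paths
    return beam                                  # highest-scoring leaf labels

# Toy example: a two-level tree with four leaf labels.
rng = np.random.default_rng(0)
tree = {"root": ["n0", "n1"], "n0": ["l0", "l1"], "n1": ["l2", "l3"],
        "l0": [], "l1": [], "l2": [], "l3": []}
weights = {n: rng.normal(size=8) for n in tree}
print(beam_search_labels(rng.normal(size=8), tree, weights, beam_width=2))
```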
no code implementations • 22 Jun 2022 • Patrick H. Chen, Wei-Cheng Chang, Hsiang-Fu Yu, Inderjit S. Dhillon, Cho-Jui Hsieh
Approximate K-Nearest Neighbor Search (AKNNS) has now become ubiquitous in modern applications, for example, as a fast search procedure with two-tower deep learning models.
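For context, here is a minimal sketch of the exact inner-product search that AKNNS methods approximate in the two-tower setting; the linear "towers", dimensions, and random data are stand-in assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "two tower" encoders: in practice these are deep networks that
# map queries and items into a shared embedding space.
W_query, W_item = rng.normal(size=(16, 32)), rng.normal(size=(16, 32))
items = rng.normal(size=(10_000, 32))           # raw item features
item_emb = items @ W_item.T                     # precomputed item embeddings

def exact_knn(query, k=5):
    """Brute-force K-nearest-neighbor search by inner product.

    AKNNS methods (graphs, quantization, partitioning) approximate exactly
    this computation so that not every item has to be scored per query.
    """
    scores = item_emb @ (query @ W_query.T)     # inner products against all items
    return np.argpartition(-scores, k)[:k]      # indices of the top-k items

print(exact_knn(rng.normal(size=32)))
```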
no code implementations • 3 Dec 2019 • Patrick H. Chen, Wei Wei, Cho-Jui Hsieh, Bo Dai
In this paper, we propose a new method to overcome catastrophic forgetting by adding generative regularization to the Bayesian inference framework.
no code implementations • IJCNLP 2019 • Yukun Ma, Patrick H. Chen, Cho-Jui Hsieh
For example, input embedding and Softmax matrices in the IWSLT-2014 German-to-English data set account for more than 80% of the total model parameters.
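A rough parameter count illustrates how the embedding and softmax matrices can dominate a small translation model; the vocabulary size, embedding dimension, and recurrent sizes below are illustrative assumptions, not the exact IWSLT-2014 configuration.

```python
# Illustrative (assumed) sizes, not the paper's exact setup.
vocab, d_emb, d_hid = 32_000, 512, 512

embedding = vocab * d_emb                     # input embedding matrix
softmax   = vocab * d_emb                     # output projection (softmax) matrix
recurrent = 2 * 4 * (d_emb + d_hid) * d_hid   # rough LSTM encoder + decoder cost

total = embedding + softmax + recurrent
print(f"embedding + softmax share: {(embedding + softmax) / total:.0%}")  # ~89%
```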
no code implementations • 25 Sep 2019 • Patrick H. Chen, Sashank Reddi, Sanjiv Kumar, Cho-Jui Hsieh
We consider the learning to learn problem, where the goal is to leverage deep learning models to automatically learn (iterative) optimization algorithms for training machine learning models.
no code implementations • TACL 2019 • Liunian Harold Li, Patrick H. Chen, Cho-Jui Hsieh, Kai-Wei Chang
Contextual representation models have achieved great success in improving various downstream natural language processing tasks.
no code implementations • 28 Feb 2019 • Liunian Harold Li, Patrick H. Chen, Cho-Jui Hsieh, Kai-Wei Chang
Our framework reduces the time spent on the output layer to a negligible level, eliminates almost all the trainable parameters of the softmax layer and performs language modeling without truncating the vocabulary.
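One way to eliminate the trainable softmax parameters, consistent with what this snippet describes but offered here only as a hedged sketch rather than the paper's exact formulation, is to regress the model's hidden state onto a frozen matrix of pre-trained word embeddings, so the output layer has nothing to train and never sums over the vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 50_000, 300

# Frozen pre-trained word embeddings (random here just to make the sketch
# self-contained); these replace the trainable softmax matrix.
pretrained_emb = rng.normal(size=(vocab, d))

def output_loss(hidden, target_id):
    """Continuous-output loss: match the target word's fixed embedding."""
    diff = hidden - pretrained_emb[target_id]
    return float(diff @ diff)                 # squared L2 distance, no softmax sum

print(output_loss(rng.normal(size=d), target_id=42))
```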
no code implementations • ICLR 2019 • Patrick H. Chen, Si Si, Sanjiv Kumar, Yang Li, Cho-Jui Hsieh
The algorithm achieves an order of magnitude faster inference than the original softmax layer for predicting the top-$k$ words in various tasks such as beam search in machine translation or next-word prediction.
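As a generic illustration of why restricting the softmax helps (a hedged sketch, not this paper's specific screening model), compare scoring the full vocabulary against scoring only a small precomputed candidate set chosen by a cheap screening step; the clusters and candidate lists below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 50_000, 256
W_out = rng.normal(size=(vocab, d))           # full softmax / output embedding matrix

# Illustrative shortlist setup: each context cluster keeps a small candidate
# word set.  How clusters and candidates are chosen is what a learned
# screening model would decide; random sets are used here only as stand-ins.
centroids  = rng.normal(size=(8, d))
candidates = [rng.choice(vocab, size=500, replace=False) for _ in range(8)]

def topk_full(h, k=5):
    """Baseline: score every word in the vocabulary."""
    return np.argpartition(-(W_out @ h), k)[:k]

def topk_screened(h, k=5):
    """Screen first, then score only the shortlisted candidates."""
    cluster = int(np.argmax(centroids @ h))   # cheap screening step
    cand = candidates[cluster]
    scores = W_out[cand] @ h                  # partial inner products only
    return cand[np.argpartition(-scores, k)[:k]]

h = rng.normal(size=d)
print(topk_full(h), topk_screened(h))
```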
no code implementations • NeurIPS 2018 • Patrick H. Chen, Si Si, Yang Li, Ciprian Chelba, Cho-Jui Hsieh
Model compression is essential for serving large deep neural nets on devices with limited resources or applications that require real-time responses.
no code implementations • ICLR 2018 • Patrick H. Chen, Cho-Jui Hsieh
Although many second-order methods have been proposed for training neural networks, most of the results were obtained on smaller single-layer fully connected networks, so we still cannot conclude whether they are useful for training deep convolutional networks.
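Most matrix-free second-order training methods (e.g., Newton-CG or Hessian-free optimization) build on Hessian-vector products computed by double backprop; a minimal sketch is below, using a tiny least-squares model as a stand-in for a real network.

```python
import torch

# Toy regression problem standing in for a neural network's loss.
x = torch.randn(32, 10)
y = torch.randn(32, 1)
w = torch.randn(10, 1, requires_grad=True)

loss = ((x @ w - y) ** 2).mean()
(grad,) = torch.autograd.grad(loss, w, create_graph=True)   # keep graph for 2nd pass

v = torch.randn_like(w)                                      # direction to multiply by
(hvp,) = torch.autograd.grad((grad * v).sum(), w)            # H @ v without forming H
print(hvp.shape)
```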