Search Results for author: Haoming Jiang

Found 43 papers, 21 papers with code

Deep Reinforcement Learning with Smooth Policy

no code implementations • ICML 2020 • Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao

In contrast to policy parameterized by linear/reproducing kernel functions, where simple regularization techniques suffice to control smoothness, for neural network based reinforcement learning algorithms, there is no readily available solution to learn a smooth policy.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

MEMORYLLM: Towards Self-Updatable Large Language Models

no code implementations • 7 Feb 2024 • Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley

We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently.

Model Editing

Paper
Add Code

SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

1 code implementation • 25 Oct 2023 • Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha

We conduct extensive experiments in both event type prediction and uncertainty quantification of arrival time.

Type prediction Uncertainty Quantification

Paper
Code

Situated Natural Language Explanations

no code implementations • 27 Aug 2023 • Zining Zhu, Haoming Jiang, Jingfeng Yang, Sreyashi Nag, Chao Zhang, Jie Huang, Yifan Gao, Frank Rudzicz, Bing Yin

Situated NLE provides a perspective and facilitates further research on the generation and evaluation of explanations.

Prompt Engineering

Paper
Add Code

Amazon-M2: A Multilingual Multi-locale Shopping Session Dataset for Recommendation and Text Generation

1 code implementation • NeurIPS 2023 • Wei Jin, Haitao Mao, Zheng Li, Haoming Jiang, Chen Luo, Hongzhi Wen, Haoyu Han, Hanqing Lu, Zhengyang Wang, Ruirui Li, Zhen Li, Monica Xiao Cheng, Rahul Goutam, Haiyang Zhang, Karthik Subbian, Suhang Wang, Yizhou Sun, Jiliang Tang, Bing Yin, Xianfeng Tang

To test the potential of the dataset, we introduce three tasks in this work: (1) next-product recommendation, (2) next-product recommendation with domain shifts, and (3) next-product title generation.

Product Recommendation Session-Based Recommendations +1

Paper
Code

Graph Reasoning for Question Answering with Triplet Retrieval

no code implementations • 30 May 2023 • Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang, Bing Yin

State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into KG encoder, e. g. graph neural networks (GNNs), to model their local structures and integrated into language models for question answering.

Knowledge Graphs Question Answering +1

Paper
Add Code

CCGen: Explainable Complementary Concept Generation in E-Commerce

no code implementations • 19 May 2023 • Jie Huang, Yifan Gao, Zheng Li, Jingfeng Yang, Yangqiu Song, Chao Zhang, Zining Zhu, Haoming Jiang, Kevin Chen-Chuan Chang, Bing Yin

We propose and study Complementary Concept Generation (CCGen): given a concept of interest, e. g., "Digital Cameras", generating a list of complementary concepts, e. g., 1) Camera Lenses 2) Batteries 3) Camera Cases 4) Memory Cards 5) Battery Chargers.

Paper
Add Code

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

1 code implementation • 26 Apr 2023 • Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu

This paper presents a comprehensive and practical guide for practitioners and end-users working with Large Language Models (LLMs) in their downstream natural language processing (NLP) tasks.

Language Modelling Natural Language Understanding +1

8,818

Paper
Code

HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers

no code implementations • 19 Feb 2023 • Chen Liang, Haoming Jiang, Zheng Li, Xianfeng Tang, Bin Yin, Tuo Zhao

Since the teacher model has a significantly larger capacity and stronger representation power than the student model, it is very difficult for the student to produce predictions that match the teacher's over a massive amount of open-domain training data.

Knowledge Distillation Model Compression +1

Paper
Add Code

Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding

no code implementations • 8 Oct 2022 • Haoming Jiang, Tianyu Cao, Zheng Li, Chen Luo, Xianfeng Tang, Qingyu Yin, Danqing Zhang, Rahul Goutam, Bing Yin

When applying masking to short search queries, most contextual information is lost and the intent of the search queries may be changed.

token-classification Token Classification

Paper
Add Code

DiP-GNN: Discriminative Pre-Training of Graph Neural Networks

no code implementations • 15 Sep 2022 • Simiao Zuo, Haoming Jiang, Qingyu Yin, Xianfeng Tang, Bing Yin, Tuo Zhao

Specifically, we train a generator to recover identities of the masked edges, and simultaneously, we train a discriminator to distinguish the generated edges from the original graph's edges.

Node Classification

Paper
Add Code

Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites

no code implementations • 15 Sep 2022 • Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao

The model subsequently calculates session representations by combining the contextual information with the instant search query using an aggregation network.

Graph Attention

Paper
Add Code

Condensing Graphs via One-Step Gradient Matching

3 code implementations • 15 Jun 2022 • Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, Bing Yin

However, existing approaches have their inherent limitations: (1) they are not directly applicable to graphs where the data is discrete; and (2) the condensation process is computationally expensive due to the involved nested optimization.

Dataset Condensation

1,213

Paper
Code

SeqZero: Few-shot Compositional Semantic Parsing with Sequential Prompts and Zero-shot Models

1 code implementation • Findings (NAACL) 2022 • Jingfeng Yang, Haoming Jiang, Qingyu Yin, Danqing Zhang, Bing Yin, Diyi Yang

SeqZero achieves SOTA performance of BART-based models on GeoQuery and EcommerceQuery, which are two few-shot datasets with compositional data split.

Out-of-Distribution Generalization Semantic Parsing

Paper
Code

Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment

1 code implementation • ACL 2022 • Zijie Huang, Zheng Li, Haoming Jiang, Tianyu Cao, Hanqing Lu, Bing Yin, Karthik Subbian, Yizhou Sun, Wei Wang

In this paper, we explore multilingual KG completion, which leverages limited seed alignment as a bridge, to embrace the collective knowledge from multiple languages.

Ranked #3 on Knowledge Graph Completion on DPB-5L (French)

Knowledge Graph Completion

Paper
Code

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

1 code implementation • ICLR 2022 • Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Analysis shows that the proposed schedule indeed reduces the redundancy and improves generalization performance.

Image Classification Machine Translation +2

Paper
Code

Self-Training with Differentiable Teacher

no code implementations • Findings (NAACL) 2022 • Simiao Zuo, Yue Yu, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, Tuo Zhao, Hongyuan Zha

In self-training, the student contributes to the prediction performance, and the teacher controls the training process by generating pseudo-labels.

named-entity-recognition Named Entity Recognition +3

Paper
Add Code

ARCH: Efficient Adversarial Regularized Training with Caching

1 code implementation • Findings (EMNLP) 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Adversarial regularization can improve model generalization in many natural language processing tasks.

Machine Translation Natural Language Understanding +1

Paper
Code

Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data

1 code implementation • ACL 2021 • Haoming Jiang, Danqing Zhang, Tianyu Cao, Bing Yin, Tuo Zhao

Unfortunately, we observe that weakly labeled data does not necessarily improve, or even deteriorate the model performance (due to the extensive noise in the weak labels) when we train deep NER models over a simple or weighted combination of the strongly labeled and weakly labeled data.

named-entity-recognition Named Entity Recognition +1

Paper
Code

Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization

1 code implementation • ACL 2021 • Chen Liang, Simiao Zuo, Minshuo Chen, Haoming Jiang, Xiaodong Liu, Pengcheng He, Tuo Zhao, Weizhu Chen

The Lottery Ticket Hypothesis suggests that an over-parametrized network consists of ``lottery tickets'', and training a certain collection of them (i. e., a subnetwork) can match the performance of the full model.

Model Compression Multi-Task Learning

Paper
Code

Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach

1 code implementation • EMNLP 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Adversarial regularization has been shown to improve the generalization performance of deep learning models in various natural language processing tasks.

Machine Translation Natural Language Understanding +1

Paper
Code

Token-wise Curriculum Learning for Neural Machine Translation

no code implementations • Findings (EMNLP) 2021 • Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Tuo Zhao

Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling sufficient amounts of "easy" samples from training data at the early training stage.

Machine Translation NMT +2

Paper
Add Code

Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach

1 code implementation • EMNLP 2021 • Haoming Jiang, Bo Dai, Mengjiao Yang, Tuo Zhao, Wei Wei

An ideal environment for evaluating dialog systems, also known as the Turing test, needs to involve human interaction, which is usually not affordable for large-scale experiments.

Model-based Reinforcement Learning Off-policy evaluation +2

33,128

Paper
Code

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

1 code implementation • EMNLP 2020 • Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Fine-tuned pre-trained language models can suffer from severe miscalibration for both in-distribution and out-of-distribution (OOD) data due to over-parameterization.

Language Modelling Out of Distribution (OOD) Detection +2

Paper
Code

Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach

1 code implementation • NAACL 2021 • Yue Yu, Simiao Zuo, Haoming Jiang, Wendi Ren, Tuo Zhao, Chao Zhang

To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision.

Ranked #1 on Word Sense Disambiguation on Words in Context

Language Modelling Sentence +2

199

Paper
Code

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

1 code implementation • 28 Jun 2020 • Chen Liang, Yue Yu, Haoming Jiang, Siawpeng Er, Ruijia Wang, Tuo Zhao, Chao Zhang

We study the open-domain named entity recognition (NER) problem under distant supervision.

Ranked #1 on Weakly-Supervised Named Entity Recognition on CoNLL03

Language Modelling named-entity-recognition +3

289

Paper
Code

Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python

1 code implementation • 27 Jun 2020 • Jason Ge, Xingguo Li, Haoming Jiang, Han Liu, Tong Zhang, Mengdi Wang, Tuo Zhao

We describe a new library named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (e. g., sparse linear regression, sparse logistic regression, sparse Poisson regression and scaled sparse linear regression) combined with efficient active set selection strategies.

regression Sparse Learning

Paper
Code

Deep Reinforcement Learning with Robust and Smooth Policy

no code implementations • 21 Mar 2020 • Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao

Deep reinforcement learning (RL) has achieved great empirical successes in various domains.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Transformer Hawkes Process

3 code implementations • ICML 2020 • Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha

Modern data acquisition routinely produce massive amounts of event sequence data in various domains, such as social media, healthcare, and financial markets.

Computational Efficiency Point Processes

164

Paper
Code

Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds

no code implementations • NeurIPS 2019 • Minshuo Chen, Haoming Jiang, Wenjing Liao, Tuo Zhao

The network size scales exponentially in the approximation error, with an exponent depending on the intrinsic dimension of the data and the smoothness of the function.

Paper
Add Code

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

6 code implementations • ACL 2020 • Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao

However, due to limited data resources from downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the data of downstream tasks and forget the knowledge of the pre-trained model.

Ranked #1 on Natural Language Inference on QNLI

Linguistic Acceptability Natural Language Inference +4

2,211

Paper
Code

Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing

1 code implementation • ACL 2020 • Haoming Jiang, Chen Liang, Chong Wang, Tuo Zhao

To overcome this limitation, we propose a novel multi-domain NMT model using individual modules for each domain, on which we apply word-level, adaptive and layer-wise domain mixing.

Machine Translation NMT +3

Paper
Code

Contextual Text Denoising with Masked Language Model

no code implementations • WS 2019 • Yifu Sun, Haoming Jiang

Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks.

Denoising Language Modelling

Paper
Add Code

Contextual Text Denoising with Masked Language Models

no code implementations • 30 Oct 2019 • Yifu Sun, Haoming Jiang

Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks.

Denoising Language Modelling

Paper
Add Code

Meta Learning with Relational Information for Short Sequences

1 code implementation • NeurIPS 2019 • Yujia Xie, Haoming Jiang, Feng Liu, Tuo Zhao, Hongyuan Zha

This paper proposes a new meta-learning method -- named HARMLESS (HAwkes Relational Meta LEarning method for Short Sequences) for learning heterogeneous point process models from short event sequence data along with a relational network.

Meta-Learning

Paper
Code

On the Variance of the Adaptive Learning Rate and Beyond

21 code implementations • ICLR 2020 • Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han

The learning rate warmup heuristic achieves remarkable success in stabilizing training, accelerating convergence and improving generalization for adaptive stochastic optimization algorithms like RMSprop and Adam.

Image Classification Language Modelling +3

49,301

Paper
Code

Nonparametric Regression on Low-Dimensional Manifolds using Deep ReLU Networks : Function Approximation and Statistical Recovery

no code implementations • NeurIPS 2019 • Minshuo Chen, Haoming Jiang, Wenjing Liao, Tuo Zhao

It therefore demonstrates the adaptivity of deep ReLU networks to low-dimensional geometric structures of data, and partially explains the power of deep ReLU networks in tackling high-dimensional data with low-dimensional geometric structures.

regression

Paper
Add Code

On Computation and Generalization of Generative Adversarial Networks under Spectrum Control

no code implementations • ICLR 2019 • Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, Tuo Zhao

Generative Adversarial Networks (GANs), though powerful, is hard to train.

Paper
Add Code

On Scalable and Efficient Computation of Large Scale Optimal Transport

no code implementations • ICLR Workshop DeepGenStruct 2019 • Yujia Xie, Minshuo Chen, Haoming Jiang, Tuo Zhao, Hongyuan Zha

Optimal Transport (OT) naturally arises in many machine learning applications, yet the heavy computational burden limits its wide-spread uses.

Domain Adaptation

Paper
Add Code

Learning to Defense by Learning to Attack

no code implementations • ICLR Workshop DeepGenStruct 2019 • Zhehui Chen, Haoming Jiang, Yuyang Shi, Bo Dai, Tuo Zhao

From the perspective of generative learning, our proposed method can be viewed as learning a deep generative model for generating adversarial samples, which is adaptive to the robust classification.

Adversarial Attack Robust classification

Paper
Add Code

On Computation and Generalization of GANs with Spectrum Control

no code implementations • 28 Dec 2018 • Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, Tuo Zhao

Specifically, we propose a new reparameterization approach for the weight matrices of the discriminator in GANs, which allows us to directly manipulate the spectra of the weight matrices through various regularizers and constraints, without intensively computing singular value decompositions.

Paper
Add Code

Learning to Defend by Learning to Attack

no code implementations • 3 Nov 2018 • Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao

Adversarial training provides a principled approach for training robust neural networks.

Adversarial Attack Adversarial Defense +3

Paper
Add Code

On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don't Worry About Its Nonsmooth Loss Function

no code implementations • 25 May 2016 • Xingguo Li, Haoming Jiang, Jarvis Haupt, Raman Arora, Han Liu, Mingyi Hong, Tuo Zhao

Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility.

regression

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.