no code implementations • ICML 2020 • Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao
In contrast to policies parameterized by linear or reproducing-kernel functions, where simple regularization techniques suffice to control smoothness, neural network based reinforcement learning algorithms have no readily available solution for learning a smooth policy.
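As a concrete illustration of what such a solution might look like, here is a minimal smoothness-inducing regularizer sketch in PyTorch: it penalizes the divergence between the policy's action distributions at a state and at a small random perturbation of it. The penalty form and hyperparameters are assumptions for illustration, not the paper's exact objective.

```python
import torch

def smoothness_penalty(policy, states, eps=0.01):
    """Illustrative smoothness regularizer (not the paper's exact objective):
    penalize how much the action distribution changes under small state noise."""
    noise = eps * torch.randn_like(states)
    p = torch.log_softmax(policy(states), dim=-1)
    q = torch.log_softmax(policy(states + noise), dim=-1)
    # Symmetric KL between clean and perturbed action distributions.
    kl_pq = torch.sum(p.exp() * (p - q), dim=-1)
    kl_qp = torch.sum(q.exp() * (q - p), dim=-1)
    return (kl_pq + kl_qp).mean()

# total_loss = rl_loss + lam * smoothness_penalty(policy, states)
```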
no code implementations • 7 Feb 2024 • Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley
We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently.
1 code implementation • 25 Oct 2023 • Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha
We conduct extensive experiments on both event type prediction and uncertainty quantification of arrival time.
no code implementations • 27 Aug 2023 • Zining Zhu, Haoming Jiang, Jingfeng Yang, Sreyashi Nag, Chao Zhang, Jie Huang, Yifan Gao, Frank Rudzicz, Bing Yin
Situated NLE provides a perspective and facilitates further research on the generation and evaluation of explanations.
1 code implementation • NeurIPS 2023 • Wei Jin, Haitao Mao, Zheng Li, Haoming Jiang, Chen Luo, Hongzhi Wen, Haoyu Han, Hanqing Lu, Zhengyang Wang, Ruirui Li, Zhen Li, Monica Xiao Cheng, Rahul Goutam, Haiyang Zhang, Karthik Subbian, Suhang Wang, Yizhou Sun, Jiliang Tang, Bing Yin, Xianfeng Tang
To test the potential of the dataset, we introduce three tasks in this work: (1) next-product recommendation, (2) next-product recommendation with domain shifts, and (3) next-product title generation.
no code implementations • 30 May 2023 • Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang, Bing Yin
State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into a KG encoder, e.g., a graph neural network (GNN), to model their local structures and are integrated into language models for question answering.
no code implementations • 19 May 2023 • Jie Huang, Yifan Gao, Zheng Li, Jingfeng Yang, Yangqiu Song, Chao Zhang, Zining Zhu, Haoming Jiang, Kevin Chen-Chuan Chang, Bing Yin
We propose and study Complementary Concept Generation (CCGen): given a concept of interest, e.g., "Digital Cameras", generating a list of complementary concepts, e.g., 1) Camera Lenses 2) Batteries 3) Camera Cases 4) Memory Cards 5) Battery Chargers.
1 code implementation • 26 Apr 2023 • Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu
This paper presents a comprehensive and practical guide for practitioners and end-users working with Large Language Models (LLMs) in their downstream natural language processing (NLP) tasks.
no code implementations • 19 Feb 2023 • Chen Liang, Haoming Jiang, Zheng Li, Xianfeng Tang, Bing Yin, Tuo Zhao
Since the teacher model has a significantly larger capacity and stronger representation power than the student model, it is very difficult for the student to produce predictions that match the teacher's over a massive amount of open-domain training data.
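For readers unfamiliar with the matching objective, a standard knowledge-distillation loss (temperature-softened KL divergence between teacher and student logits) looks roughly as follows; the paper's actual training recipe may differ.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Standard KD objective: the student matches the teacher's softened
    output distribution (KL divergence, rescaled by T^2)."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```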
no code implementations • 8 Oct 2022 • Haoming Jiang, Tianyu Cao, Zheng Li, Chen Luo, Xianfeng Tang, Qingyu Yin, Danqing Zhang, Rahul Goutam, Bing Yin
When applying masking to short search queries, most contextual information is lost and the intent of the search queries may be changed.
no code implementations • 15 Sep 2022 • Simiao Zuo, Haoming Jiang, Qingyu Yin, Xianfeng Tang, Bing Yin, Tuo Zhao
Specifically, we train a generator to recover identities of the masked edges, and simultaneously, we train a discriminator to distinguish the generated edges from the original graph's edges.
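A toy sketch of this masked-edge objective, assuming node embeddings from some hypothetical encoder and a bilinear discriminator (both placeholders, not the paper's architecture):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d = 100, 16
node_emb = torch.randn(n, d, requires_grad=True)  # placeholder node encoder output
disc_w = torch.randn(d, d, requires_grad=True)    # toy bilinear discriminator weights
edges = torch.randint(0, n, (200, 2))             # toy edge list (src, dst)

# Generator: recover the masked destination node of each masked edge.
mask = torch.rand(edges.size(0)) < 0.15
src, dst = edges[mask, 0], edges[mask, 1]
scores = node_emb[src] @ node_emb.t()             # score every node as the endpoint
gen_loss = F.cross_entropy(scores, dst)

# Discriminator: tell original edges from generator-completed ones.
fake_dst = scores.argmax(dim=-1)                  # generator's guesses
real_logit = (node_emb[src] @ disc_w * node_emb[dst]).sum(-1)
fake_logit = (node_emb[src] @ disc_w * node_emb[fake_dst]).sum(-1)
disc_loss = (F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit))
             + F.binary_cross_entropy_with_logits(fake_logit, torch.zeros_like(fake_logit)))
```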
no code implementations • 15 Sep 2022 • Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao
The model subsequently calculates session representations by combining the contextual information with the instant search query using an aggregation network.
3 code implementations • 15 Jun 2022 • Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, Bing Yin
However, existing approaches have their inherent limitations: (1) they are not directly applicable to graphs where the data is discrete; and (2) the condensation process is computationally expensive due to the involved nested optimization.
1 code implementation • Findings (NAACL) 2022 • Jingfeng Yang, Haoming Jiang, Qingyu Yin, Danqing Zhang, Bing Yin, Diyi Yang
SeqZero achieves SOTA performance among BART-based models on GeoQuery and EcommerceQuery, two few-shot datasets with compositional data splits.
1 code implementation • ACL 2022 • Zijie Huang, Zheng Li, Haoming Jiang, Tianyu Cao, Hanqing Lu, Bing Yin, Karthik Subbian, Yizhou Sun, Wei Wang
In this paper, we explore multilingual KG completion, which leverages limited seed alignment as a bridge, to embrace the collective knowledge from multiple languages.
Ranked #3 on Knowledge Graph Completion on DPB-5L (French)
1 code implementation • ICLR 2022 • Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao
Analysis shows that the proposed schedule indeed reduces the redundancy and improves generalization performance.
no code implementations • Findings (NAACL) 2022 • Simiao Zuo, Yue Yu, Chen Liang, Haoming Jiang, Siawpeng Er, Chao Zhang, Tuo Zhao, Hongyuan Zha
In self-training, the student contributes to the prediction performance, and the teacher controls the training process by generating pseudo-labels.
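A minimal self-training round, with hypothetical model and loader names, to make the teacher/student division of labor concrete:

```python
import torch
import torch.nn.functional as F

def self_training_round(teacher, student, unlabeled_loader, optimizer, threshold=0.9):
    """One round of self-training: the teacher generates pseudo-labels on
    unlabeled data, and the student trains on the confident ones only."""
    teacher.eval()
    student.train()
    for x in unlabeled_loader:
        with torch.no_grad():
            probs = torch.softmax(teacher(x), dim=-1)
        conf, pseudo = probs.max(dim=-1)
        keep = conf > threshold          # filter out low-confidence pseudo-labels
        if keep.any():
            loss = F.cross_entropy(student(x[keep]), pseudo[keep])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```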
1 code implementation • Findings (EMNLP) 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao
Adversarial regularization can improve model generalization in many natural language processing tasks.
1 code implementation • ACL 2021 • Haoming Jiang, Danqing Zhang, Tianyu Cao, Bing Yin, Tuo Zhao
Unfortunately, we observe that weakly labeled data does not necessarily improve, and may even deteriorate, model performance (due to the extensive noise in the weak labels) when we train deep NER models over a simple or weighted combination of the strongly labeled and weakly labeled data.
1 code implementation • ACL 2021 • Chen Liang, Simiao Zuo, Minshuo Chen, Haoming Jiang, Xiaodong Liu, Pengcheng He, Tuo Zhao, Weizhu Chen
The Lottery Ticket Hypothesis suggests that an over-parametrized network consists of "lottery tickets", and training a certain collection of them (i.e., a subnetwork) can match the performance of the full model.
1 code implementation • EMNLP 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao
Adversarial regularization has been shown to improve the generalization performance of deep learning models in various natural language processing tasks.
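The common form of such regularizers perturbs input embeddings adversarially and penalizes the resulting change in the model's output. A one-step sketch, with the inner solver details (step size, norm, initialization) as assumptions:

```python
import torch
import torch.nn.functional as F

def adversarial_regularizer(model, embeddings, eps=1e-3, step=1e-3):
    """One-step embedding perturbation: find a small perturbation that
    changes the output distribution most, then penalize that change."""
    clean = model(embeddings).detach()
    delta = 1e-5 * torch.randn_like(embeddings)   # small random init so the gradient is nonzero
    delta.requires_grad_(True)
    kl = F.kl_div(F.log_softmax(model(embeddings + delta), -1),
                  F.softmax(clean, -1), reduction="batchmean")
    grad, = torch.autograd.grad(kl, delta)
    delta = (step * grad / (grad.norm() + 1e-12)).clamp(-eps, eps)
    return F.kl_div(F.log_softmax(model(embeddings + delta), -1),
                    F.softmax(clean, -1), reduction="batchmean")
```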
no code implementations • Findings (EMNLP) 2021 • Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Tuo Zhao
Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling sufficient amounts of "easy" samples from training data at the early training stage.
1 code implementation • EMNLP 2021 • Haoming Jiang, Bo Dai, Mengjiao Yang, Tuo Zhao, Wei Wei
An ideal evaluation environment for dialog systems, namely the Turing test, requires human interaction, which is usually not affordable for large-scale experiments.
1 code implementation • EMNLP 2020 • Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang
Fine-tuned pre-trained language models can suffer from severe miscalibration for both in-distribution and out-of-distribution (OOD) data due to over-parameterization.
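Miscalibration is typically quantified by the expected calibration error (ECE); a standard implementation, included here only for reference:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| over confidence bins,
    weighted by the fraction of predictions falling in each bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```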
1 code implementation • NAACL 2021 • Yue Yu, Simiao Zuo, Haoming Jiang, Wendi Ren, Tuo Zhao, Chao Zhang
To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision.
Ranked #1 on Word Sense Disambiguation on Words in Context
1 code implementation • 28 Jun 2020 • Chen Liang, Yue Yu, Haoming Jiang, Siawpeng Er, Ruijia Wang, Tuo Zhao, Chao Zhang
We study the open-domain named entity recognition (NER) problem under distant supervision.
1 code implementation • 27 Jun 2020 • Jason Ge, Xingguo Li, Haoming Jiang, Han Liu, Tong Zhang, Mengdi Wang, Tuo Zhao
We describe a new library named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (e.g., sparse linear regression, sparse logistic regression, sparse Poisson regression, and scaled sparse linear regression) combined with efficient active set selection strategies.
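To illustrate the pathwise idea (in plain NumPy, deliberately not the picasso API): solve the lasso along a decreasing sequence of regularization parameters, warm-starting each solve from the previous solution.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_path(X, y, lambdas, n_iter=100):
    """Pathwise coordinate descent for the lasso: sweep a decreasing
    lambda sequence, warm-starting each solution from the last."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    path = []
    for lam in sorted(lambdas, reverse=True):
        for _ in range(n_iter):
            for j in range(p):
                r = y - X @ beta + X[:, j] * beta[j]   # partial residual excluding coordinate j
                beta[j] = soft_threshold(X[:, j] @ r, n * lam) / col_sq[j]
            # (an active-set strategy would restrict these updates to
            #  coordinates that are currently nonzero)
        path.append(beta.copy())
    return path
```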
no code implementations • 21 Mar 2020 • Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao
Deep reinforcement learning (RL) has achieved great empirical successes in various domains.
3 code implementations • ICML 2020 • Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, Hongyuan Zha
Modern data acquisition routinely produces massive amounts of event sequence data in various domains, such as social media, healthcare, and financial markets.
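The classical self-exciting (Hawkes) intensity underlying this line of work can be evaluated in a few lines; the exponential kernel below is the textbook choice, not necessarily the paper's parameterization.

```python
import numpy as np

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Classical Hawkes intensity with an exponential kernel:
    lambda(t) = mu + sum over t_i < t of alpha * exp(-beta * (t - t_i))."""
    past = np.asarray([ti for ti in history if ti < t])
    return mu + alpha * np.exp(-beta * (t - past)).sum()

print(hawkes_intensity(3.0, [0.5, 1.2, 2.8]))  # intensity spikes just after recent events
```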
no code implementations • NeurIPS 2019 • Minshuo Chen, Haoming Jiang, Wenjing Liao, Tuo Zhao
The network size scales exponentially in the approximation error, with an exponent depending on the intrinsic dimension of the data and the smoothness of the function.
6 code implementations • ACL 2020 • Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao
However, due to limited data resources from downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the data of downstream tasks and forget the knowledge of the pre-trained model.
Ranked #1 on Natural Language Inference on QNLI
1 code implementation • ACL 2020 • Haoming Jiang, Chen Liang, Chong Wang, Tuo Zhao
To overcome this limitation, we propose a novel multi-domain NMT model using individual modules for each domain, on which we apply word-level, adaptive and layer-wise domain mixing.
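A toy version of word-level domain mixing, where every word gets a learned distribution over domains and domain-specific modules are combined by those weights; names and shapes here are assumptions for illustration:

```python
import torch
import torch.nn as nn

class DomainMixingLayer(nn.Module):
    """Toy word-level domain mixing: run one module per domain and
    combine their outputs with per-word domain probabilities."""
    def __init__(self, d_model, n_domains):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_domains))
        self.domain_gate = nn.Linear(d_model, n_domains)

    def forward(self, x):  # x: (batch, seq, d_model)
        weights = torch.softmax(self.domain_gate(x), dim=-1)      # per-word domain mixture
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, seq, d_model, n_domains)
        return (outs * weights.unsqueeze(2)).sum(-1)
```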
no code implementations • WS 2019 • Yifu Sun, Haoming Jiang
Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks.
no code implementations • 30 Oct 2019 • Yifu Sun, Haoming Jiang
Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks.
1 code implementation • NeurIPS 2019 • Yujia Xie, Haoming Jiang, Feng Liu, Tuo Zhao, Hongyuan Zha
This paper proposes HARMLESS (HAwkes Relational Meta LEarning method for Short Sequences), a new meta-learning method for learning heterogeneous point process models from short event sequence data along with a relational network.
21 code implementations • ICLR 2020 • Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han
The learning rate warmup heuristic achieves remarkable success in stabilizing training, accelerating convergence and improving generalization for adaptive stochastic optimization algorithms like RMSprop and Adam.
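For reference, the warmup heuristic in question is typically a linear learning-rate ramp like the sketch below; RAdam, proposed in this paper, aims to remove the need for it by rectifying the variance of the adaptive learning rate.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linear learning-rate warmup: ramp from 0 to base_lr over the
    first warmup_steps updates, then hold (decay omitted for brevity)."""
    return base_lr * min(1.0, step / warmup_steps)

# for step in range(total_steps):
#     for group in optimizer.param_groups:
#         group["lr"] = warmup_lr(step)
```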
no code implementations • NeurIPS 2019 • Minshuo Chen, Haoming Jiang, Wenjing Liao, Tuo Zhao
It therefore demonstrates the adaptivity of deep ReLU networks to low-dimensional geometric structures of data, and partially explains the power of deep ReLU networks in tackling high-dimensional data with low-dimensional geometric structures.
no code implementations • ICLR 2019 • Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, Tuo Zhao
Generative Adversarial Networks (GANs), though powerful, are hard to train.
no code implementations • ICLR Workshop DeepGenStruct 2019 • Yujia Xie, Minshuo Chen, Haoming Jiang, Tuo Zhao, Hongyuan Zha
Optimal Transport (OT) naturally arises in many machine learning applications, yet the heavy computational burden limits its widespread use.
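For context, the standard baseline is the Sinkhorn iteration for entropy-regularized OT, sketched below; the paper itself proposes a different, proximal-point-style scheme.

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iter=200):
    """Entropy-regularized OT via Sinkhorn: alternate scaling updates
    until the transport plan's marginals match a and b."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # the transport plan
```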
no code implementations • ICLR Workshop DeepGenStruct 2019 • Zhehui Chen, Haoming Jiang, Yuyang Shi, Bo Dai, Tuo Zhao
From the perspective of generative learning, our proposed method can be viewed as learning a deep generative model for generating adversarial samples, which is adaptive to the robust classification.
no code implementations • 28 Dec 2018 • Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, Tuo Zhao
Specifically, we propose a new reparameterization approach for the weight matrices of the discriminator in GANs, which allows us to directly manipulate the spectra of the weight matrices through various regularizers and constraints, without intensively computing singular value decompositions.
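The basic quantity such reparameterizations manipulate is the spectral norm of a weight matrix, usually estimated by power iteration rather than a full SVD; a minimal sketch, not the paper's exact parameterization:

```python
import torch

def spectral_norm(W, n_iter=20):
    """Estimate the largest singular value of W by power iteration,
    avoiding a full SVD."""
    u = torch.randn(W.size(0))
    for _ in range(n_iter):
        v = torch.nn.functional.normalize(W.t() @ u, dim=0)
        u = torch.nn.functional.normalize(W @ v, dim=0)
    return torch.dot(u, W @ v)

# W_normalized = W / spectral_norm(W)   # constrain the spectrum to <= 1
```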
no code implementations • 3 Nov 2018 • Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao
Adversarial training provides a principled approach for training robust neural networks.
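The canonical instance is projected gradient descent (PGD) adversarial training: perturb each input within a small L-infinity ball to maximize the loss, then train on the perturbed inputs. A sketch assuming image inputs scaled to [0, 1]:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step=2/255, n_steps=10):
    """PGD inner loop: iteratively perturb x within an L-infinity ball
    of radius eps to maximize the classification loss."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(n_steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + step * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # project back into the ball
    return x_adv.detach()

# Robust training step: loss = F.cross_entropy(model(pgd_attack(model, x, y)), y)
```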
no code implementations • 25 May 2016 • Xingguo Li, Haoming Jiang, Jarvis Haupt, Raman Arora, Han Liu, Mingyi Hong, Tuo Zhao
Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility.