no code implementations • 8 May 2024 • Kai Zheng, Haijun Zhao, Rui Huang, Beichuan Zhang, Na Mou, Yanan Niu, Yang Song, Hongning Wang, Kun Gai
To address this issue, we propose an improved ranking principle for multi-stage systems, namely the Generalized Probability Ranking Principle (GPRP), to emphasize both the selection bias in each stage of the system pipeline as well as the underlying interest of users.
no code implementations • 5 May 2024 • Zhendong Chu, Zichao Wang, Ruiyi Zhang, Yangfeng Ji, Hongning Wang, Tong Sun
Large language models (LLMs) have demonstrated impressive zero-shot abilities in solving a wide range of general-purpose tasks.
no code implementations • 28 Apr 2024 • Fan Yao, Yiming Liao, Mingzhe Wu, Chuanhao Li, Yan Zhu, James Yang, Qifan Wang, Haifeng Xu, Hongning Wang
Driven by the new economic opportunities created by the creator economy, an increasing number of content creators rely on and compete for revenue generated from online content recommendation platforms.
no code implementations • 1 Apr 2024 • Zhenyu Hou, Yilin Niu, Zhengxiao Du, Xiaohan Zhang, Xiao Liu, Aohan Zeng, Qinkai Zheng, Minlie Huang, Hongning Wang, Jie Tang, Yuxiao Dong
The work presents our practices of aligning LLMs with human preferences, offering insights into the challenges and solutions in RLHF implementations.
no code implementations • 8 Mar 2024 • Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu
We introduce Adversarial Policy Optimization (AdvPO), a novel solution to the pervasive issue of reward over-optimization in Reinforcement Learning from Human Feedback (RLHF) for Large Language Models (LLMs).
no code implementations • 29 Feb 2024 • Ethan Blaser, Chuanhao Li, Hongning Wang
The demand for collaborative and private bandit learning across multiple agents is surging due to the growing quantity of data generated from distributed systems.
1 code implementation • 26 Feb 2024 • Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang
The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but a comprehensive approach for detecting safety issues within LLMs' responses in an aligned, customizable and explainable manner is still lacking.
no code implementations • 21 Feb 2024 • Zhiwei Wang, Huazheng Wang, Hongning Wang
Our analysis shows that against two popularly employed MAB algorithms, UCB1 and $\epsilon$-greedy, the success of a stealthy attack depends on the environmental conditions and the realized reward of the arm pulled in the first round.
no code implementations • 7 Feb 2024 • Zhepei Wei, Chuanhao Li, Tianze Ren, Haifeng Xu, Hongning Wang
To enhance the efficiency and practicality of federated bandit learning, recent advances have introduced incentives to motivate communication among clients, where a client participates only when the incentive offered by the server outweighs its participation cost.
no code implementations • 2 Feb 2024 • Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang
We present AMOR, an agent framework based on open-source LLMs, which reasons with external knowledge bases and adapts to specific domains through human supervision to the reasoning process.
2 code implementations • 1 Feb 2024 • Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang
We prove that EXO is guaranteed to optimize in the same direction as the RL algorithms asymptotically for arbitrary parametrization of the policy, while enabling efficient optimization by circumventing the complexities associated with RL algorithms.
no code implementations • 28 Jan 2024 • Anat Hashavit, Tamar Stern, Hongning Wang, Sarit Kraus
These results strongly suggest that an information need-focused approach can significantly improve the reliability of extracted snippets in online health search.
2 code implementations • 30 Nov 2023 • Pei Ke, Bosi Wen, Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang
Since the natural language processing (NLP) community started to make large language models (LLMs), such as GPT-4, act as a critic to evaluate the quality of generated texts, most of them only train a critique generation model of a specific scale on specific datasets.
1 code implementation • 30 Nov 2023 • Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang
We will provide public APIs for evaluating AlignBench with CritiqueLLM to facilitate the evaluation of LLMs' Chinese alignment.
1 code implementation • 7 Nov 2023 • Jiale Cheng, Xiao Liu, Kehan Zheng, Pei Ke, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang
However, these models are often not well aligned with human intents, which calls for additional treatments on them, that is, the alignment problem.
no code implementations • 2 Oct 2023 • Haozhe Ji, Pei Ke, Hongning Wang, Minlie Huang
And most importantly, we prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
no code implementations • 15 Feb 2023 • Zhendong Chu, Hongning Wang
In this paper, we explore the structured heterogeneity among tasks via clustering to improve meta-RL.
1 code implementation • 10 Feb 2023 • Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo
Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback.
no code implementations • 3 Feb 2023 • Fan Yao, Chuanhao Li, Denis Nekipelov, Hongning Wang, Haifeng Xu
Content creators compete for exposure on recommendation platforms, and such strategic behavior leads to a dynamic shift over the content distribution.
1 code implementation • 13 Jan 2023 • Xiaoying Zhang, Hongning Wang, Hang Li
This calls for a fine-grained understanding of a user's preferences over items, where one needs to recognize whether the user's choice is driven by the quality of the item itself or by the pre-selected attributes of the item.
no code implementations • 6 Nov 2022 • Ye Gao, Zhendong Chu, Hongning Wang, John Stankovic
We extend the theory of GAN to show that there exist optimal solutions for the parameters of the two discriminators and one generator in MiddleGAN, and empirically show that the samples generated by the MiddleGAN are similar to both samples from the source domain and samples from the target domain.
no code implementations • 14 Oct 2022 • Nan Wang, Qifan Wang, Yi-Chia Wang, Maziar Sanjabi, Jingzhou Liu, Hamed Firooz, Hongning Wang, Shaoliang Nie
However, the bias inherent in user written text, often used for PTG model training, can inadvertently associate different levels of linguistic quality with users' protected attributes.
1 code implementation • 2 Oct 2022 • Lu Lin, Jinghui Chen, Hongning Wang
Graph contrastive learning (GCL), as an emerging self-supervised learning technique on graphs, aims to learn representations via instance discrimination.
1 code implementation • 31 Aug 2022 • A S M Ahsan-Ul Haque, Hongning Wang
Fourthly, when the user rejects a recommendation, we adaptively choose the next decision tree to improve subsequent questions and recommendations.
no code implementations • 30 Aug 2022 • Huazheng Wang, David Zhao, Hongning Wang
We provide a rigorous theoretical analysis over the amount of noise added via dynamic global sensitivity and the corresponding upper regret bound of our proposed algorithm.
no code implementations • 10 Jul 2022 • Anat Hashavit, Hongning Wang, Tamar Stern, Sarit Kraus
We further discover that the contrast between the indirect marketing ads and the viewpoint presented in the organic search results plays an important role in users' decision-making.
no code implementations • 13 Jun 2022 • Yiling Jia, Hongning Wang
Deep neural networks (DNNs) demonstrate significant advantages in improving ranking performance in retrieval tasks.
no code implementations • 10 Jun 2022 • Chuanhao Li, Huazheng Wang, Mengdi Wang, Hongning Wang
We tackle the communication efficiency challenge of learning kernelized contextual bandits in a distributed setting.
1 code implementation • 24 May 2022 • Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
We propose to learn a meta policy and adapt it to new users with only a few trials of conversational recommendations.
no code implementations • 20 Feb 2022 • Peng Wang, Renqin Cai, Hongning Wang
Explanations in a recommender system assist users in making informed decisions among a set of recommended items.
no code implementations • 3 Feb 2022 • Fan Yao, Chuanhao Li, Denis Nekipelov, Hongning Wang, Haifeng Xu
In real-world recommendation problems, especially those with a formidably large item space, users have to gradually learn to estimate the utility of any fresh recommendations from their experience about previously consumed items.
no code implementations • 2 Feb 2022 • Chuanhao Li, Hongning Wang
Contextual bandit algorithms have been recently studied under the federated learning setting to satisfy the demand of keeping data decentralized and pushing the learning of bandit models to the client side.
no code implementations • 24 Jan 2022 • Nan Wang, Hongning Wang, Maryam Karimzadehgan, Branislav Kveton, Craig Boutilier
This problem has been studied extensively in the setting of known objective functions.
no code implementations • ICLR 2022 • Yiling Jia, Weitong Zhang, Dongruo Zhou, Quanquan Gu, Hongning Wang
Thanks to the power of representation learning, neural contextual bandit algorithms demonstrate remarkable performance improvement against their classical counterparts.
no code implementations • 24 Jan 2022 • Ye Gao, Brian Baucom, Karen Rose, Kristina Gordon, Hongning Wang, John Stankovic
In the computer vision modality, the evaluation results suggest that we achieve new state-of-the-art performance on popular UDA benchmarks such as Office-31 and Office-Home, outperforming the second best-performing algorithms by up to 17.9%.
Out-of-Distribution Detection, Unsupervised Domain Adaptation
no code implementations • 17 Jan 2022 • Yiling Jia, Hongning Wang
Existing online learning to rank (OL2R) solutions are limited to linear models, which are incompetent to capture possible non-linear relations between queries and documents.
no code implementations • 1 Nov 2021 • Yiling Jia, Hongning Wang
Online learning to rank (OL2R) has attracted great research interest in recent years, thanks to its advantages in avoiding expensive relevance labeling as required in offline supervised ranking model learning.
no code implementations • 1 Nov 2021 • Aobo Yang, Nan Wang, Renqin Cai, Hongbo Deng, Hongning Wang
As recommendation is essentially a comparative (or ranking) process, a good explanation should illustrate to users why an item is believed to be better than another, i.e., comparative explanations about the recommended items.
1 code implementation • 1 Nov 2021 • Lu Lin, Ethan Blaser, Hongning Wang
Graph Convolutional Networks (GCNs) have fueled a surge of research interest due to their encouraging performance on graph learning tasks, but they have also been shown to be vulnerable to adversarial attacks.
no code implementations • 31 Oct 2021 • Lu Lin, Ethan Blaser, Hongning Wang
The exploitation of graph structures is the key to effectively learning representations of nodes that preserve useful information in graphs.
no code implementations • 26 Oct 2021 • Nan Wang, Lu Lin, Jundong Li, Hongning Wang
In this paper, we propose a principled new way for unbiased graph embedding by learning node embeddings from an underlying bias-free graph, which is not influenced by sensitive node attributes.
no code implementations • 18 Oct 2021 • Huazheng Wang, Haifeng Xu, Hongning Wang
We study adversarial attacks on linear stochastic bandits: by manipulating the rewards, an adversary aims to control the behaviour of the bandit algorithm.
no code implementations • 6 Oct 2021 • Fan Yao, Chuanhao Li, Denis Nekipelov, Hongning Wang, Haifeng Xu
We propose a new problem setting to study the sequential interactions between a recommender system and a user.
no code implementations • 4 Oct 2021 • Chuanhao Li, Hongning Wang
In this paper, we study linear contextual bandit in a federated learning setting.
no code implementations • 22 Jul 2021 • Zhendong Chu, Hongning Wang
This creates a sparsity issue and limits the quality of machine learning models trained on such data.
no code implementations • 14 Apr 2021 • Chuanhao Li, Qingyun Wu, Hongning Wang
However, all existing collaborative bandit learning solutions impose a stationary assumption about the environment, i.e., both user preferences and the dependency among users are assumed static over time.
no code implementations • 8 Apr 2021 • Huazheng Wang, Haifeng Xu, Chuanhao Li, Zhiyuan Liu, Hongning Wang
We study the problem of incentivizing exploration for myopic users in linear bandits, where the users tend to exploit the arm with the highest predicted reward instead of exploring.
1 code implementation • 28 Feb 2021 • Yiling Jia, Huazheng Wang, Stephen Guo, Hongning Wang
Online Learning to Rank (OL2R) eliminates the need for explicit relevance annotation by directly optimizing the rankers from their interactions with users.
no code implementations • 14 Feb 2021 • Fan Yao, Renqin Cai, Hongning Wang
Combinatorial optimization problems (COPs) over graphs are a fundamental challenge in optimization.
no code implementations • 24 Jan 2021 • Aobo Yang, Nan Wang, Hongbo Deng, Hongning Wang
At training time, the two learning tasks are joined by a latent sentiment vector, which is encoded by the recommendation module and used to make word choices for explanation generation.
2 code implementations • 24 Dec 2020 • Zhendong Chu, Jing Ma, Hongning Wang
Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost.
Ranked #1 on Image Classification on LabelMe
no code implementations • 5 Sep 2020 • Chuanhao Li, Qingyun Wu, Hongning Wang
Non-stationary bandits and online clustering of bandits lift the restrictive assumptions in contextual bandits and provide solutions to many important real-world scenarios.
no code implementations • 9 Jun 2020 • Nan Wang, Hongning Wang
In this work, we propose a directional multi-aspect ranking criterion to enable a holistic ranking of items with respect to multiple aspects.
no code implementations • 18 May 2020 • Nan Wang, Zhen Qin, Xuanhui Wang, Hongning Wang
Recent advances in unbiased learning to rank (LTR) count on Inverse Propensity Scoring (IPS) to eliminate bias in implicit feedback.
no code implementations • 29 Jan 2020 • Jibang Wu, Renqin Cai, Hongning Wang
Predicting users' preferences based on their sequential behaviors in history is challenging and crucial for modern recommender systems.
1 code implementation • 1 Dec 2019 • Lin Gong, Lu Lin, Weihao Song, Hongning Wang
Inspired by the concept of user schema in social psychology, we take a new perspective to perform user representation learning by constructing a shared latent space to capture the dependency among different modalities of user-generated data.
1 code implementation • NeurIPS 2019 • Xueying Bai, Jian Guan, Hongning Wang
Reinforcement learning is effective in optimizing policies for recommender systems.
Generative Adversarial Network, Model-based Reinforcement Learning
3 code implementations • NeurIPS 2019 • Xueying Bai, Jian Guan, Hongning Wang
Reinforcement learning is well suited for optimizing policies of recommender systems.
Generative Adversarial Network, Model-based Reinforcement Learning
no code implementations • 18 Sep 2019 • Nan Wang, Hongning Wang
The framework naturally leads to a probabilistic multi-aspect ranking criterion, which generalizes the single-aspect ranking to a multivariate fashion.
1 code implementation • 2 Sep 2019 • Yiling Jia, Nipun Batra, Hongning Wang, Kamin Whitehouse
However, very few homes in the world have installed sub-meters (sensors measuring individual appliance energy); and the cost of retrofitting a home with extensive sub-metering eats into the funds available for energy saving retrofits.
no code implementations • IJCNLP 2019 • Huazheng Wang, Zhe Gan, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Hongning Wang
In this paper, we focus on unsupervised domain adaptation for Machine Reading Comprehension (MRC), where the source domain has a large amount of labeled data, while only unlabeled passages are available in the target domain.
no code implementations • 10 Jun 2019 • Huazheng Wang, Sonwoo Kim, Eric McCord-Snook, Qingyun Wu, Hongning Wang
We prove that the projected gradient is an unbiased estimation of the true gradient, and show that this lower-variance gradient estimation results in significant regret reduction.
1 code implementation • 9 Jun 2019 • Qingyun Wu, Zhige Li, Huazheng Wang, Wei Chen, Hongning Wang
We capitalize on an important property of the influence maximization problem named network assortativity, which is ignored by most existing works in online influence maximization.
5 code implementations • 5 Jun 2019 • Wasi Uddin Ahmad, Kai-Wei Chang, Hongning Wang
We present a context-aware neural ranking model to exploit users' on-task search activities and enhance retrieval performance.
no code implementations • 3 Jun 2019 • Yiyi Tao, Yiling Jia, Nan Wang, Hongning Wang
In this work, we integrate regression trees to guide the learning of latent factor models for recommendation, and use the learnt tree structure to explain the resulting latent factors.
1 code implementation • NeurIPS 2018 • Yi Qi, Qingyun Wu, Hongning Wang, Jie Tang, Maosong Sun
Implicit feedback, such as user clicks, although abundant in online information service systems, does not provide substantial evidence on users' evaluation of the system's output.
1 code implementation • 10 Jun 2018 • Nan Wang, Hongning Wang, Yiling Jia, Yue Yin
Explaining automatically generated recommendations allows users to make more informed and accurate decisions about which results to utilize, and therefore improves their satisfaction.
1 code implementation • 23 May 2018 • Qingyun Wu, Naveen Iyer, Hongning Wang
Multi-armed bandit algorithms have become a reference solution for handling the explore/exploit dilemma in recommender systems, and many other important real-world problems, such as display advertisement.
no code implementations • 18 May 2018 • Huazheng Wang, Ramsey Langley, Sonwoo Kim, Eric McCord-Snook, Hongning Wang
In this paper, we accelerate the online learning process by efficient exploration in the gradient space.
1 code implementation • ICLR 2018 • Wasi Uddin Ahmad, Kai-Wei Chang, Hongning Wang
We propose a multi-task learning framework to jointly learn document ranking and query suggestion for web search.