1 code implementation • 3 Feb 2023 • Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang, Kun Gai
One the one hand, the platforms aims at optimizing the users' cumulative watch time (main goal) in long term, which can be effectively optimized by Reinforcement Learning.
no code implementations • 27 Jan 2023 • Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu
The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques.
1 code implementation • 6 Dec 2022 • Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An
Though promising, the application of RL heavily relies on well-designed rewards, but designing rewards related to long-term user engagement is quite difficult.
1 code implementation • 1 Jun 2022 • Wanqi Xue, Qingpeng Cai, Ruohan Zhan, Dong Zheng, Peng Jiang, Kun Gai, Bo An
Meanwhile, reinforcement learning (RL) is widely regarded as a promising framework for optimizing long-term engagement in sequential recommendation.
no code implementations • 17 Jan 2022 • Wanqi Xue, Bo An, Chai Kiat Yeo
Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources.
no code implementations • 15 Dec 2021 • Shuo Sun, Wanqi Xue, Rundong Wang, Xu He, Junlei Zhu, Jian Li, Bo An
Reinforcement learning (RL) techniques have shown great success in many challenging quantitative trading tasks, such as portfolio management and algorithmic trading.
no code implementations • 9 Aug 2021 • Wanqi Xue, Wei Qiu, Bo An, Zinovi Rabinovich, Svetlana Obraztsova, Chai Kiat Yeo
Empirical results demonstrate that many state-of-the-art MACRL methods are vulnerable to message attacks, and our method can significantly improve their robustness.
Multi-agent Reinforcement Learning reinforcement-learning +1
no code implementations • 2 Jun 2021 • Wanqi Xue, Youzhi Zhang, Shuxin Li, Xinrun Wang, Bo An, Chai Kiat Yeo
Securing networked infrastructures is important in the real world.
no code implementations • 18 May 2021 • Shuxin Li, Youzhi Zhang, Xinrun Wang, Wanqi Xue, Bo An
The challenge of solving this type of game is that the team's joint action space grows exponentially with the number of agents, which results in the inefficiency of the existing algorithms, e. g., Counterfactual Regret Minimization (CFR).
no code implementations • 4 May 2020 • Wanqi Xue, Wei Wang
In this paper, we adopt metric learning for this problem, which has been applied for few- and many-shot image classification by comparing the distance between the test image and the center of each class in the feature space.