no code implementations • 6 Sep 2023 • Tianchi Cai, Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Xierui Song, Li Yu, Lihong Gu, Xiaodong Zeng, Jinjie Gu, Guannan Zhang
We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation.
no code implementations • 25 Aug 2023 • Tianchi Cai, Shenliao Bao, Jiyan Jiang, Shiji Zhou, Wenpeng Zhang, Lihong Gu, Jinjie Gu, Guannan Zhang
Model-free RL-based recommender systems have recently received increasing research attention due to their capability to handle partial feedback and long-term rewards.
1 code implementation • 7 Apr 2023 • Weijie Li, Wei Yang, Wenpeng Zhang, Tianpeng Liu, Yongxiang Liu, Li Liu
However, robustly recognizing vehicle targets is a challenging task in SAR due to the large intraclass variations and small interclass variations.
2 code implementations • 3 Apr 2023 • Weijie Li, Wei Yang, Li Liu, Wenpeng Zhang, Yongxiang Liu
Therefore, the degree of overfitting for clutter reflects the non-causality of deep learning in SAR ATR.
no code implementations • NeurIPS 2021 • Jiyan Jiang, Wenpeng Zhang, Jinjie Gu, Wenwu Zhu
To overcome this problem, we study decentralized online learning in the asynchronous setting, which allows different learners to work at their own pace.
no code implementations • 29 Sep 2021 • Lianzhe Wang, Shiji Zhou, Shanghang Zhang, Wenpeng Zhang, Heng Chang, Wenwu Zhu
Even though meta-learning has attracted wide research attention in recent years, the generalization problem of meta-learning is still not well addressed.
1 code implementation • ICLR 2022 • Pengcheng Yang, XiaoMing Zhang, Wenpeng Zhang, Ming Yang, Hong Wei
The recent trend of using large-scale deep neural networks (DNN) to boost performance has propelled the development of the parallel pipelining technique for efficient DNN training, which has resulted in the development of several prominent pipelines such as GPipe, PipeDream, and PipeDream-2BW.
no code implementations • 29 Sep 2021 • Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Lihong Gu, Xiaodong Zeng, Wenwu Zhu
This paper presents a systematic study of multi-objective online learning.
no code implementations • 29 Aug 2021 • Tianchi Cai, Wenpeng Zhang, Lihong Gu, Xiaodong Zeng, Jinjie Gu
To apply value-based methods to CRL, a recent groundbreaking line of game-theoretic approaches uses the mixed policy that randomizes among a set of carefully generated policies to converge to the desired constraint-satisfying policy.
no code implementations • 5 Feb 2018 • Wenpeng Zhang, Xiao Lin, Peilin Zhao
To address this subsequent challenge, we follow the general projection-free algorithmic framework of Online Conditional Gradient and propose an Online Compact Convex Factorization Machine (OCCFM) algorithm that eschews the projection operation with efficient linear optimization steps.
no code implementations • ICML 2017 • Wenpeng Zhang, Peilin Zhao, Wenwu Zhu, Steven C. H. Hoi, Tong Zhang
The conditional gradient algorithm has regained a surge of research interest in recent years due to its high efficiency in handling large-scale machine learning problems.
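
For readers unfamiliar with the conditional gradient (Frank-Wolfe) idea mentioned in the last two entries, the following is a minimal sketch of the generic textbook method, not the algorithm from either paper. It assumes a probability-simplex constraint set and a toy quadratic objective; the function name `frank_wolfe_simplex` and the vector `b` are hypothetical and chosen only for illustration. The point it shows is that each step solves a linear problem over the feasible set instead of computing a projection.

```python
def frank_wolfe_simplex(grad, dim, steps=200):
    """Minimize a smooth convex function over the probability simplex.

    Illustrative sketch only: `grad` maps a point (list of floats) to its gradient.
    """
    x = [1.0 / dim] * dim  # start at the simplex centre, which is feasible
    for t in range(steps):
        g = grad(x)
        # Linear minimization step: over the simplex, <g, s> is minimized
        # at the vertex whose coordinate has the smallest gradient entry.
        i = min(range(dim), key=lambda k: g[k])
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        # Moving toward a vertex by a convex combination keeps x feasible,
        # so no projection is ever needed.
        x = [(1.0 - gamma) * xk for xk in x]
        x[i] += gamma
    return x


# Hypothetical toy objective: f(x) = 0.5 * ||x - b||^2 with b inside the simplex.
b = [0.7, 0.2, 0.1]
x = frank_wolfe_simplex(lambda p: [pk - bk for pk, bk in zip(p, b)], dim=3)
```

Here the iterate stays on the simplex at every step and approaches `b`, the unconstrained minimizer, which happens to be feasible in this toy setup.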