no code implementations • 23 Feb 2024 • Joyce Zhou, Yijia Dai, Thorsten Joachims
The encoder LLM generates a compact natural-language profile of the user's interests from the user's rating history.
no code implementations • 1 Dec 2023 • Viraj Mehta, Vikramjeet Das, Ojash Neopane, Yijia Dai, Ilija Bogunovic, Jeff Schneider, Willie Neiswanger
Preference-based feedback is important for many applications in reinforcement learning where direct evaluation of a reward function is not feasible.
no code implementations • 10 Sep 2023 • Yijia Dai, Wen Sun
Reinforcement learning (RL) in recommendation systems offers the potential to optimize recommendations for long-term user engagement.