Search Results for author: Gil Shamir

Found 3 papers, 0 papers with code

Offline Regularised Reinforcement Learning for Large Language Models Alignment

no code implementations • 29 May 2024 • Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, Mohammad Gheshlaghi Azar, Rafael Rafailov, Bernardo Avila Pires, Eugene Tarassov, Lucas Spangher, Will Ellsworth, Aliaksei Severyn, Jonathan Mallinson, Lior Shani, Gil Shamir, Rishabh Joshi, Tianqi Liu, Remi Munos, Bilal Piot

The canonical element of such datasets is, for instance, an LLM's response to a user's prompt followed by the user's feedback, such as a thumbs-up/down.

Learning to Rank when Grades Matter

no code implementations • 14 Jun 2023 • Le Yan, Zhen Qin, Gil Shamir, Dong Lin, Xuanhui Wang, Mike Bendersky

In this paper, we conduct a rigorous study of learning to rank with grades, where both ranking performance and grade prediction performance are important.

Learning-To-Rank

Dropout Prediction Uncertainty Estimation Using Neuron Activation Strength

no code implementations • 13 Oct 2021 • Haichao Yu, Zhe Chen, Dong Lin, Gil Shamir, Jie Han

Dropout has been commonly used to quantify prediction uncertainty, i.e., the variations of model predictions on a given input example.
