Search Results for author: Gabriel Mukobi

Found 6 papers, 4 papers with code

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

no code implementations • 6 Jun 2024 • Rylan Schaeffer, Hailey Schoelkopf, Brando Miranda, Gabriel Mukobi, Varun Madan, Adam Ibrahim, Herbie Bradley, Stella Biderman, Sanmi Koyejo

We then reveal the mechanism causing this degradation: downstream metrics require comparing the correct choice against a small number of specific incorrect choices, meaning accurately predicting downstream capabilities requires predicting not just how probability mass concentrates on the correct choice with scale, but also how probability mass fluctuates on specific incorrect choices with scale.

Multiple-choice Question Answering
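
To make the mechanism described above concrete, here is a minimal sketch (illustrative only; the scoring rule and log-probabilities are hypothetical, not taken from the paper) of why a multiple-choice metric depends on probability mass over specific incorrect choices, not just the correct one:

import math

def mc_accuracy(items):
    """Score multiple-choice items by picking the highest-likelihood choice.

    Each item is (correct_logprob, [distractor_logprobs]). The prediction
    counts as correct only if the correct choice outscores every specific
    distractor, so the metric depends on how mass moves on the distractors too.
    """
    correct = 0
    for correct_lp, distractor_lps in items:
        if correct_lp > max(distractor_lps):
            correct += 1
    return correct / len(items)

# Hypothetical log-probabilities at two model scales. The correct choice
# gains mass with scale on both items, but on the second item a distractor
# fluctuates upward even more, so accuracy does not improve.
small_model = [(-2.0, [-2.5, -3.0, -3.5]), (-2.0, [-1.8, -3.0, -3.5])]
large_model = [(-1.5, [-2.5, -3.1, -3.6]), (-1.6, [-1.4, -3.2, -3.7])]

print(mc_accuracy(small_model))  # 0.5
print(mc_accuracy(large_model))  # 0.5: correct-choice gains alone didn't help

In the second item, the correct choice gains probability mass with scale, yet accuracy is unchanged because one specific distractor fluctuates upward even more; this is the mechanism the abstract points to for why downstream capabilities resist prediction from scale alone.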

Societal Adaptation to Advanced AI

no code implementations • 16 May 2024 • Jamie Bernardi, Gabriel Mukobi, Hilary Greaves, Lennart Heim, Markus Anderljung

Existing strategies for managing risks from advanced AI systems often focus on affecting what AI systems are developed and how they diffuse.

Escalation Risks from Language Models in Military and Diplomatic Decision-Making

1 code implementation • 7 Jan 2024 • Juan-Pablo Rivera, Gabriel Mukobi, Anka Reuel, Max Lamparth, Chandler Smith, Jacquelyn Schneider

Governments are increasingly considering integrating autonomous AI agents in high-stakes military and foreign-policy decision-making, especially with the emergence of advanced generative AI models like GPT-4.

Decision Making • Language Modelling

SuperHF: Supervised Iterative Learning from Human Feedback

1 code implementation • 25 Oct 2023 • Gabriel Mukobi, Peter Chatain, Su Fong, Robert Windesheim, Gitta Kutyniok, Kush Bhatia, Silas Alberti

Here, we focus on two prevalent methods used to align these models, Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).

Language Modelling

Welfare Diplomacy: Benchmarking Language Model Cooperation

1 code implementation • 13 Oct 2023 • Gabriel Mukobi, Hannah Erlebach, Niklas Lauffer, Lewis Hammond, Alan Chan, Jesse Clifton

The growing capabilities and increasingly widespread deployment of AI systems necessitate robust benchmarks for measuring their cooperative capabilities.

Benchmarking • Language Modelling
