1 code implementation • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).
1 code implementation • 31 Mar 2024 • Yashar Deldjoo, Zhankui He, Julian McAuley, Anton Korikov, Scott Sanner, Arnau Ramisa, René Vidal, Maheswaran Sathiamoorthy, Atoosa Kasirzadeh, Silvia Milano
Traditional recommender systems (RS) have used user-item rating histories as their primary data source, with collaborative filtering being one of the principal methods.
no code implementations • 9 Feb 2024 • Andrew Smart, Ding Wang, Ellis Monk, Mark Díaz, Atoosa Kasirzadeh, Erin Van Liemt, Sonja Schmer-Galunder
Data annotation remains the sine qua non of machine learning and AI.
no code implementations • 15 Jan 2024 • Atoosa Kasirzadeh
This involves a gradual accumulation of critical AI-induced threats, such as severe vulnerabilities and the systemic erosion of economic and political structures.
no code implementations • 21 Apr 2023 • Atoosa Kasirzadeh
The allure of emerging AI technologies is undoubtedly thrilling.
no code implementations • 1 Sep 2022 • Atoosa Kasirzadeh, Iason Gabriel
Furthermore, we explore how these norms can be used to align conversational agents with human values across a range of different discursive domains.
no code implementations • 8 Dec 2021 • Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel
We discuss the points of origin of different risks and point to potential mitigation approaches.
no code implementations • 9 Sep 2021 • Charles Evans, Atoosa Kasirzadeh
In this paper, we introduce new formal methods and provide empirical evidence to highlight a unique safety concern prevalent in reinforcement learning (RL)-based recommendation algorithms: 'user tampering'.
no code implementations • 1 Mar 2021 • Atoosa Kasirzadeh
The societal and ethical implications of the use of opaque artificial intelligence systems for consequential decisions, such as welfare allocation and criminal justice, have generated a lively debate among multiple stakeholder groups, including computer scientists, ethicists, social scientists, policy makers, and end users.
no code implementations • 30 Oct 2019 • Atoosa Kasirzadeh
In particular, I offer a multi-faceted conceptual framework for the explanations and the interpretations of algorithmic decisions, and I claim that this framework can lay the groundwork for a focused discussion among multiple stakeholders about the social implications of algorithmic decision-making, as well as AI governance and ethics more generally.