Search Results for author: Malek Mechergui

Found 2 papers, 0 papers with code

Handling Reward Misspecification in the Presence of Expectation Mismatch

no code implementations • 12 Apr 2024 • Sarath Sreedharan, Malek Mechergui

Detecting and handling misspecified objectives, such as reward functions, has been widely recognized as one of the central challenges within the domain of Artificial Intelligence (AI) safety research.

AI Agent

Goal Alignment: A Human-Aware Account of Value Alignment Problem

no code implementations • 2 Feb 2023 • Malek Mechergui, Sarath Sreedharan

To address this lacuna, we propose a novel formulation for the value alignment problem, named goal alignment that focuses on a few central challenges related to value alignment.

AI Agent
