21 Apr 2024 • Narek Maloyan, Ekansh Verma, Bulat Nutfullin, Bislan Ashinov
The phenomenon of unintended triggers and the difficulty of distinguishing them from intended triggers highlight the need for further research into the robustness and interpretability of LLMs.