International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management
This is the Second Key Update to the 2025 International AI Safety Report. The First Key Update (1) discussed developments in the capabilities of general-purpose AI models and systems and associated risks. This Key Update covers how various actors, including researchers, companies, and governments, are approaching risk management and technical mitigations for AI. The past year has seen important developments in AI risk management, including better techniques for training safer models and monitoring their outputs. While this represents tangible progress, significant gaps remain. It is often uncertain how effective current measures are at preventing harms, and effectiveness varies across time and applications. There are many opportunities to further strengthen existing safeguard techniques and to develop new ones. This Key Update provides a concise overview of critical developments in risk management practices and technical risk mitigation since the publication of the 2025 AI Safety Report in January. It highlights where progress is being made and where gaps remain. Above all, it aims to support policymakers, researchers, and the public in navigating a rapidly changing environment, helping them to make informed and timely decisions about the governance of general-purpose AI.
Professor Yoshua Bengio, Université de Montréal / LawZero / Mila – Quebec AI Institute & Chair
Authors: Yoshua Bengio · Stephen Clare · Carina Prunkl · Maksym Andriushchenko · Benjamin Bucknall · Philip Fox · Nestor Maslej · Conor McGlynn · Malcolm Murray · Shalaleh Rismani · Stephen Casper · Jessica Newman · Daniel Privitera · Sören Mindermann · Daron Acemoglu · Thomas G. Dietterich · Fredrik Heintz · Geoffrey Hinton · Nick Jennings · Susan Leavy · Teresa Ludermir · Vidushi Marda · Helen Margetts · John McDermid · Jane Munga · Arvind Narayanan · Alondra Nelson · Clara Neppel · Sarvapali D. (Gopal) Ramchurn · Stuart Russell · Marietje Schaake · Bernhard Schölkopf · Alvaro Soto · Lee Tiedrich · Gaël Varoquaux · Andrew Yao · Ya-Qin Zhang