Paper

International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management

This is the Second Key Update to the 2025 International AI Safety Report. The First Key Update (1) discussed developments in the capabilities of general-purpose AI models and systems and associated risks. This Key Update covers how various actors, including researchers, companies, and governments, are approaching risk management and technical mitigations for AI. The past year has seen important developments in AI risk management, including better techniques for training safer models and monitoring their outputs. While this represents tangible progress, significant gaps remain. It is often uncertain how effective current measures are at preventing harms, and effectiveness varies across time and applications. There are many opportunities to further strengthen existing safeguard techniques and to develop new ones. This Key Update provides a concise overview of critical developments in risk management practices and technical risk mitigation since the publication of the 2025 AI Safety Report in January. It highlights where progress is being made and where gaps remain. Above all, it aims to support policymakers, researchers, and the public in navigating a rapidly changing environment, helping them to make informed and timely decisions about the governance of general-purpose AI.

Professor Yoshua Bengio, Université de Montréal / LawZero / Mila – Quebec AI Institute & Chair

SuperIntelligence - Robotics - Safety & Alignment

Published 2025-12-07

Authors: Yoshua Bengio · Stephen Clare · Carina Prunkl · Maksym Andriushchenko · Benjamin Bucknall · Philip Fox · Nestor Maslej · Conor McGlynn · Malcolm Murray · Shalaleh Rismani · Stephen Casper · Jessica Newman · Daniel Privitera · Sören Mindermann · Daron Acemoglu · Thomas G. Dietterich · Fredrik Heintz · Geoffrey Hinton · Nick Jennings · Susan Leavy · Teresa Ludermir · Vidushi Marda · Helen Margetts · John McDermid · Jane Munga · Arvind Narayanan · Alondra Nelson · Clara Neppel · Sarvapali D. (Gopal) Ramchurn · Stuart Russell · Marietje Schaake · Bernhard Schölkopf · Alvaro Soto · Lee Tiedrich · Gaël Varoquaux · Andrew Yao · Ya-Qin Zhang

Topics

Regulation · Safety

Relevant entities

People

Geoffrey E. Hinton

Computer Scientist

Yoshua Bengio

Computer Scientist
