Safeguarded AI

Backed by £59m, this programme aims to develop the safety standards we need for transformational AI

Why this programme

As AI becomes more capable, it has the potential to power scientific breakthroughs, enhance global prosperity, and safeguard us from disasters. But only if it’s deployed wisely.

Current techniques for mitigating the risks of advanced AI systems have serious limitations and can’t be relied upon empirically to ensure safety. To date, very little R&D effort has gone into approaches that could provide quantitative safety guarantees for AI systems, because they’re considered impossible or impractical.

What we’re shooting for

By combining scientific world models and mathematical proofs, we aim to construct a ‘gatekeeper’: an AI system tasked with understanding and reducing the risks posed by other AI agents.

In doing so we’ll develop quantitative safety guarantees for AI in the way we have come to expect for nuclear power and passenger aviation.
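
To make this concrete, here is a minimal, purely illustrative sketch of the pattern described above: an agent’s proposed action is admitted only if the estimated probability of violating a safety specification, evaluated against a world model, stays below an agreed threshold. Every name here (SafetySpec, gatekeeper, toy_world_model) is invented for illustration, and the crude Monte Carlo estimate stands in for the mathematical proofs the programme actually aims at.

```python
from dataclasses import dataclass
from typing import Callable, List
import random

# Hypothetical names for illustration only – not ARIA's actual interfaces.

@dataclass
class SafetySpec:
    """A specification: a predicate over state trajectories plus a maximum
    acceptable probability of violating it."""
    holds_for: Callable[[List[float]], bool]
    max_violation_probability: float

def simulate(world_model: Callable[[float, int], List[float]],
             action: float, horizon: int, samples: int) -> List[List[float]]:
    """Draw sample trajectories of the world model under a proposed action."""
    return [world_model(action, horizon) for _ in range(samples)]

def gatekeeper(action: float,
               world_model: Callable[[float, int], List[float]],
               spec: SafetySpec,
               horizon: int = 20,
               samples: int = 2000) -> bool:
    """Admit the action only if the estimated probability of violating the
    specification stays below the agreed threshold."""
    trajectories = simulate(world_model, action, horizon, samples)
    violations = sum(1 for t in trajectories if not spec.holds_for(t))
    return violations / samples <= spec.max_violation_probability

# Toy world model: a temperature that drifts upward with the chosen power level.
def toy_world_model(power: float, horizon: int) -> List[float]:
    temp, trajectory = 20.0, []
    for _ in range(horizon):
        temp += 0.5 * power + random.gauss(0, 0.3)
        trajectory.append(temp)
    return trajectory

# Specification: temperature must never exceed 30 degrees; tolerate <= 1% risk.
spec = SafetySpec(holds_for=lambda t: max(t) < 30.0,
                  max_violation_probability=0.01)

print(gatekeeper(action=0.5, world_model=toy_world_model, spec=spec))  # admitted
print(gatekeeper(action=2.0, world_model=toy_world_model, spec=spec))  # rejected
```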

Our goal: to usher in a new era for AI safety, allowing us to unlock the full economic and social benefits of advanced AI systems while minimising risks.

This programme is split into three technical areas (TAs), each with its own distinct solicitations.

Apply for funding: TA1.4

Deadline: 2 January 2025 [12:00 GMT]

The third solicitation for this programme is focused on TA1.4 Sociotechnical Integration. Backed by £3.4m, we’re looking to support teams from the economic, social, legal and political sciences to consider the sound socio-technical integration of Safeguarded AI systems.  

This solicitation seeks R&D Creators – individuals and teams that ARIA will fund – to work on problems that are plausibly critical to ensuring that the technologies developed as part of the programme are used in the best interests of humanity at large, and that they are designed in a way that enables their governability through representative processes of collective deliberation and decision-making.

A few examples of the open problems we’re looking for people to work on:

  • Qualitative deliberation facilitation: What tools or processes best enable representative input, collective deliberation and decision-making about safety specifications, acceptable risk thresholds, or success conditions for a given application domain? We hope to integrate these into the Safeguarded AI scaffolding.
  • Quantitative bargaining solutions: What social choice mechanisms or quantitative bargaining solutions could best navigate irreconcilable differences in stakeholders’ goals, risk tolerances, and preferences, so that Safeguarded AI systems serve a multi-stakeholder notion of public good? (A toy illustration follows this list.)
  • Governability tools for society: How can we ensure that Safeguarded AI systems are governed in societally beneficial and legitimate ways?
  • Governability tools for organisations: Organisations developing Safeguarded AI capabilities have the potential to create significant externalities – both risks and benefits. What decision-making and governance mechanisms would best ensure that entities developing or deploying Safeguarded AI capabilities treat these externalities as appropriately major factors in their decision-making?
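
As a toy illustration of the ‘quantitative bargaining solutions’ item above, and not a mechanism the programme endorses, the sketch below applies the classic Nash bargaining solution: choose the option that maximises the product of stakeholders’ utility gains over their disagreement points. All options, utilities, and numbers are invented.

```python
from typing import Dict, List

# Hypothetical stakeholder utilities for three candidate safety thresholds.
# Rows: candidate option; values: utility for each stakeholder. All numbers invented.
options: Dict[str, List[float]] = {
    "strict_threshold":   [0.9, 0.4, 0.5],
    "moderate_threshold": [0.7, 0.7, 0.6],
    "loose_threshold":    [0.3, 0.9, 0.8],
}
disagreement_point = [0.2, 0.3, 0.4]  # utilities if no agreement is reached

def nash_product(utilities: List[float], disagreement: List[float]) -> float:
    """Product of each stakeholder's gain over their disagreement utility.
    Options that leave any stakeholder worse off than disagreement score zero."""
    product = 1.0
    for u, d in zip(utilities, disagreement):
        gain = u - d
        if gain <= 0:
            return 0.0
        product *= gain
    return product

# The Nash bargaining solution: the option with the largest product of gains.
best = max(options, key=lambda name: nash_product(options[name], disagreement_point))
print(best)  # moderate_threshold, for these invented numbers
```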

We are also open to applications proposing other lines of work that illuminate critical socio-technical dimensions of Safeguarded AI systems, provided they offer ways to increase assurance that these systems will reliably be developed and deployed in service of humanity at large.

Previous funding calls within Safeguarded AI

TA1.1: Theory

The first solicitation for this programme focused on TA1.1 Theory, where we sought R&D Creators – individuals and teams that ARIA will fund and support – to research and construct computationally practicable mathematical representations and formal semantics to support world-models, specifications about state-trajectories, neural systems, proofs that neural outputs validate specifications, and “version control” (incremental updates or “patches”) thereof.
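
As a toy illustration of some of that vocabulary (given here for exposition, not part of the solicitation), the sketch below encodes a simple specification over state-trajectories, a bounded invariant, and checks it against the trajectory produced by rolling a mock controller out in a toy world model. Real TA1.1 outputs would take the form of formal semantics and machine-checkable proofs rather than a runtime check like this.

```python
from typing import Callable, List

# Toy illustration only: actual TA1.1 work concerns formal semantics and
# machine-checkable proofs, not runtime checks like this one.

State = float
Trajectory = List[State]

def always_within(lo: float, hi: float) -> Callable[[Trajectory], bool]:
    """A simple specification over state trajectories: every state stays in [lo, hi]."""
    return lambda traj: all(lo <= s <= hi for s in traj)

def rollout(controller: Callable[[State], float],
            dynamics: Callable[[State, float], State],
            initial_state: State, steps: int) -> Trajectory:
    """Roll a controller out in a world model to obtain a state trajectory."""
    state, traj = initial_state, [initial_state]
    for _ in range(steps):
        state = dynamics(state, controller(state))
        traj.append(state)
    return traj

# Mock 'neural' controller and world model (both invented for this sketch).
controller = lambda s: -0.5 * (s - 1.0)   # push the state towards 1.0
dynamics = lambda s, u: s + 0.1 * u       # simple discrete-time integrator

spec = always_within(0.0, 2.0)
trajectory = rollout(controller, dynamics, initial_state=1.8, steps=50)
print(spec(trajectory))  # True: the controlled state never leaves [0.0, 2.0]
```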

Download the funding call for TA1.1 [PDF]

Watch the solicitation presentation

TA3: Applications

The second funding call sought individuals or organisations interested in using our gatekeeper AI to build safeguarded products for domain-specific applications, such as optimising energy networks, clinical trials, or telecommunications networks. Safeguarded AI’s success will depend on showing that our gatekeeper AI actually works in a safety-critical domain. The research teams selected for TA3 will work with other programme teams, global AI experts, academics, and entrepreneurs to lay the groundwork for deploying Safeguarded AI in one or more areas.

In this first phase of TA3 funding, we intend to allocate an initial £5.4m aimed at eliciting requirements, sourcing datasets, and establishing evaluation benchmarks for relevant cyber-physical domains.

Applications are now closed. Applicants will be notified of the outcome on 18 November 2024.

Download the funding call for TA3 [PDF]

Watch the solicitation presentation

TA2: Expression of Interest

We’re looking for expressions of interest from individuals or organisations to get involved in the development of an organisation spearheading the research + engineering comprising Technical Area 2.

Meet davidad

Safeguarded AI has been designed and overseen by Programme Director David ‘davidad’ Dalrymple with feedback from the R&D community, as part of the opportunity space Mathematics for Safe AI.

davidad is a software engineer with a multidisciplinary scientific background. He’s spent five years formulating a vision for how mathematical approaches could guarantee reliable and trustworthy AI. Before joining ARIA, davidad co-invented the top-40 cryptocurrency Filecoin and worked as a Senior Software Engineer at Twitter.


We’re excited to welcome Professor Yoshua Bengio as Scientific Director for Safeguarded AI, supporting the work led by Programme Director, davidad. A world-renowned computer scientist and a pioneer in deep learning, Yoshua was awarded the 2018 Turing Award, often referred to as the “Nobel Prize of Computing,” for his groundbreaking work in AI.
