
Towards Causal Foundations of Safe AI

Description

"Towards Causal Foundations of Safe AI" James Fox, Tom Everitt

With great power comes great responsibility. Artificial intelligence (AI) is rapidly gaining new capabilities, and is increasingly trusted to make decisions impacting humans in significant ways (from self-driving cars to stock-trading to hiring decisions). To ensure that AI behaves in ethical and robustly beneficial ways, we must identify potential pitfalls and develop effective mitigation strategies. In this tutorial, we will explain how (Pearlian) causality offers a useful formal framework for reasoning about AI risk and describe recent work on this topic. In particular, we’ll cover: causal models of agents and how to discover them; causal definitions of fairness, intent, harm, and incentives; and risks from AI such as misgeneralization and preference manipulation, as well as how mitigation techniques including impact measures, interpretability, and path-specific objectives can help address them.
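To give a flavour of why the Pearlian framing matters, the sketch below (not from the tutorial; all names and values are illustrative) builds a toy structural causal model in plain Python, where a hidden common cause confounds an observed variable X and an outcome Y. It contrasts conditioning on X with intervening via do(X), the distinction that underpins the causal definitions of incentives, intent, and harm mentioned above.

```python
# Minimal sketch of a Pearlian structural causal model (SCM), assuming a toy
# setup: exogenous noise U is a common cause of X and Y, so observing X=1 and
# forcing do(X=1) give different expectations for Y.
import random

def sample(do=None):
    """Sample one world from the toy SCM; `do` maps variable name -> forced value."""
    do = do or {}
    u = random.gauss(0, 1)                         # exogenous noise, confounds X and Y
    x = do.get("X", u > 0)                         # mechanism for X, unless intervened on
    y = do.get("Y", (1.0 if x else 0.0) + u)       # mechanism for Y: depends on X and U
    return {"X": x, "Y": y}

n = 100_000
# Observational: average Y among worlds where X happened to be true (confounded by U).
obs = [s["Y"] for s in (sample() for _ in range(n)) if s["X"]]
# Interventional: average Y when X is forced to true, breaking the U -> X link.
intv = [sample(do={"X": True})["Y"] for _ in range(n)]

print(f"E[Y | X=1]     ~ {sum(obs) / len(obs):.2f}")   # roughly 1.8
print(f"E[Y | do(X=1)] ~ {sum(intv) / len(intv):.2f}") # roughly 1.0
```

The gap between the two printed averages is exactly the kind of observational-versus-interventional distinction that causal models of agents make precise.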
