Paul Christiano is head of AI safety at the U.S. Artificial Intelligence Safety Institute. In this role, he will design and conduct tests of frontier AI models, focusing on model evaluations for capabilities of national security concern. He will also contribute guidance on conducting these evaluations and on implementing risk mitigations to enhance frontier model safety and security.
Christiano founded the Alignment Research Center, a nonprofit research organization that works to align future machine learning systems with human interests through theoretical research. He also launched a leading initiative for third-party evaluations of frontier models, now housed at Model Evaluation and Threat Research (METR).
He previously ran the language model alignment team at OpenAI, where he pioneered work on reinforcement learning from human feedback (RLHF), a foundational technical AI safety technique.
He holds a Ph.D. in computer science from the University of California, Berkeley, and a B.S. in mathematics from the Massachusetts Institute of Technology.