Please login or sign up to post and edit reviews.
87. Evan Hubinger - The Inner Alignment Problem
Publisher |
The TDS team
Media Type |
audio
Categories Via RSS |
Technology
Publication Date |
Jun 09, 2021
Episode Duration |
01:09:32

How can you know that a super-intelligent AI is trying to do what you asked it to do?

The answer, it turns out, is: not easily. And unfortunately, an increasing number of AI safety researchers are warning that this is a problem we’re going to have to solve sooner rather than later, if we want to avoid bad outcomes — which may include a species-level catastrophe.

The type of failure mode whereby AIs optimize for things other than those we ask them to is known as an inner alignment failure in the context of AI safety. It’s distinct from outer alignment failure, which is what happens when you ask your AI to do something that turns out to be dangerous, and it was only recognized by AI safety researchers as its own category of risk in 2019. And the researcher who led that effort is my guest for this episode of the podcast, Evan Hubinger.

Evan is an AI safety veteran who’s done research at leading AI labs like OpenAI, and whose experience also includes stints at Google, Ripple and Yelp. He currently works at the Machine Intelligence Research Institute (MIRI) as a Research Fellow, and joined me to talk about his views on AI safety, the alignment problem, and whether humanity is likely to survive the advent of superintelligent AI.

This episode currently has no reviews.

Submit Review
This episode could use a review!

This episode could use a review! Have anything to say about it? Share your thoughts using the button below.

Submit Review