
109. Danijar Hafner - Gaming our way to AGI
Towards Data Science
January 12, 2022
•50m
Until recently, AI systems have been narrow — they’ve only been able to perform the specific tasks they were explicitly trained for. And while narrow systems are clearly useful, the holy grail of AI is to build more flexible, general systems.
But that can’t be done without good performance metrics that we can optimize for — or that we can at least use to measure generalization ability. Somehow, we need to figure out what number needs to go up in order to bring us closer to generally-capable agents. That’s the question we’ll be exploring on this episode of the podcast, with Danijar Hafner. Danijar is a PhD student in artificial intelligence at the University of Toronto, advised by Jimmy Ba and Geoffrey Hinton, and a researcher at Google Brain and the Vector Institute.
Danijar has been studying the problem of performance measurement and benchmarking for RL agents with generalization abilities. As part of that work, he recently released Crafter, a tool that can procedurally generate complex environments that are a lot like Minecraft, featuring resources that need to be collected, tools that can be developed, and enemies who need to be avoided or defeated. In order to succeed in a Crafter environment, agents need to robustly plan, explore and test different strategies, which allow them to unlock certain in-game achievements.
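For a concrete sense of what interacting with the benchmark looks like, here is a minimal random-agent loop written against Crafter's gym-style interface. This is only a sketch based on the project's README: the `crafter.Env` constructor, the `action_space` attribute, and the `achievements` entry in `info` are assumptions to verify against the current release.

```python
# Minimal random-agent loop for Crafter (pip install crafter).
# Sketch only: attribute names follow the project's README and may
# differ across versions.
import crafter

env = crafter.Env(seed=0)   # one procedurally generated world
obs = env.reset()           # 64x64 RGB observation

done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()          # random baseline policy
    obs, reward, done, info = env.step(action)
    total_reward += reward

# Crafter scores agents by the achievements they unlock; the info dict
# is assumed here to expose per-achievement counts at episode end.
print('Episode reward:', total_reward)
print('Achievements:', info.get('achievements'))
```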
Crafter is part of a growing set of strategies that researchers are exploring to figure out how we can benchmark and measure the performance of general-purpose AIs, and it also tells us something interesting about the state of AI: increasingly, our ability to define tasks that require the right kind of generalization abilities is becoming just as important as innovating on AI model architectures. Danijar joined me to talk about Crafter, reinforcement learning, and the big challenges facing AI researchers as they work towards general intelligence on this episode of the TDS podcast.
***
Intro music:
Artist: Ron Gelinas
Track Title: Daybreak Chill Blend (original mix)
Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 0:00 Intro
- 2:25 Measuring generalization
- 5:40 What is Crafter?
- 11:10 Differences between Crafter and Minecraft
- 20:10 Agent behavior
- 25:30 Merging scaled models and reinforcement learning
- 29:30 Data efficiency
- 38:00 Hierarchical learning
- 43:20 Human-level systems
- 48:40 Cultural overlap
- 49:50 Wrap-up
Previous Episode

108. Last Week In AI — 2021: The (full) year in review
January 5, 2022
•50m
2021 has been a wild ride in many ways, but some of its wildest developments were in AI. We’ve seen major advances in everything from language modeling to multi-modal learning, open-ended learning and even AI alignment.
So we thought: what better way to take stock of the big AI milestones we reached in 2021 than a cross-over episode with our friends over at the Last Week In AI podcast?
***
Intro music:
Artist: Ron Gelinas
Track Title: Daybreak Chill Blend (original mix)
Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 0:00 Intro
- 2:15 Rise of multi-modal models
- 7:40 Growth of hardware and compute
- 13:20 Reinforcement learning
- 20:45 Open-ended learning
- 26:15 Power seeking paper
- 32:30 Safety and assumptions
- 35:20 Intrinsic vs. extrinsic motivation
- 42:00 Mapping natural language
- 46:20 Timnit Gebru’s research institute
- 49:20 Wrap-up
Next Episode

110. Alex Turner - Will powerful AIs tend to seek power?
January 19, 2022
•46m
Today’s episode is somewhat special, because we’re going to be talking about what might be the first solid quantitative study of the power-seeking tendencies that we can expect advanced AI systems to have in the future.
For a long time, there’s been a debate in the AI safety world between:
- On the one hand, people who worry that powerful AIs could eventually displace, or even eliminate, humanity altogether as they find ever more clever, creative and dangerous ways to optimize their reward metrics, and
- On the other, people who say that’s Terminator-baiting Hollywood nonsense that anthropomorphizes machines in a way that’s unhelpful and misleading.
Unfortunately, recent work in AI alignment — and in particular, a spotlighted 2021 NeurIPS paper — suggests that the AI takeover argument might be stronger than many had realized. In fact, it’s starting to look like we ought to expect to see power-seeking behaviours from highly capable AI systems by default. These behaviours include things like AI systems preventing us from shutting them down, repurposing resources in pathological ways to serve their objectives, and even in the limit, generating catastrophes that would put humanity at risk.
As concerning as these possibilities might be, it’s exciting that we’re starting to develop a more robust and quantitative language to describe AI failures and power-seeking. That’s why I was so excited to sit down with AI researcher Alex Turner, the author of the spotlighted NeurIPS paper on power-seeking, and discuss his path into AI safety, his research agenda and his perspective on the future of AI on this episode of the TDS podcast.
***
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
- 2:05 Interest in alignment research
- 8:00 Two camps of alignment research
- 13:10 The NeurIPS paper
- 17:10 Optimal policies
- 25:00 Two-piece argument
- 28:30 Relaxing certain assumptions
- 32:45 Objections to the paper
- 39:00 Broader sense of optimization
- 46:35 Wrap-up