David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning | AI Podcast #86 with Lex Fridman

April 3, 2020
Artis Modus

Lex Fridman

David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero and co-lead on AlphaStar and MuZero, and has done a lot of important work in reinforcement learning. This conversation is part of the Artificial Intelligence podcast.

Support this podcast by signing up with these sponsors:
– MasterClass: https://masterclass.com/lex
– Cash App – use code “LexPodcast” and download:
– Cash App (App Store): https://apple.co/2sPrUHe
– Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

INFO:
Podcast website:
https://lexfridman.com/ai
Apple Podcasts:
https://apple.co/2lwqZIr
Spotify:
https://spoti.fi/2nEwCF8
RSS:
https://lexfridman.com/category/ai/feed/
Full episodes playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

OUTLINE:
0:00 – Introduction
4:09 – First program
11:11 – AlphaGo
21:42 – Rules of the game of Go
25:37 – Reinforcement learning: personal journey
30:15 – What is reinforcement learning?
43:51 – AlphaGo (continued)
53:40 – Supervised learning and self play in AlphaGo
1:06:12 – Lee Sedol's retirement from Go
1:08:57 – Garry Kasparov
1:14:10 – AlphaZero and self-play
1:31:29 – Creativity in AlphaZero
1:35:21 – AlphaZero applications
1:37:59 – Reward functions
1:40:51 – Meaning of life

CONNECT:
– Subscribe to this YouTube channel
– Twitter: https://twitter.com/lexfridman
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
– Medium: https://medium.com/@lexfridman
– Support on Patreon: https://www.patreon.com/lexfridman

25 thoughts on “David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning | AI Podcast #86 with Lex Fridman”

Lex Fridman says:

April 3, 2020 at 1:52 pm

I really enjoyed this conversation with David.
opmike343 says:

May 3, 2020 at 5:05 pm

We are seeing the tip of the spear here, folks. Be vigilant.
Iwan Jones says:

May 5, 2020 at 7:28 am

I am struck by how small the audience is for this astonishing talk. It is so important that it should number in the millions, even billions.
The AI Epiphany says:

May 5, 2020 at 11:45 am

I dare say this was THE most interesting episode so far! Deep learning is solving perception big time but it seems to me that (deep) RL will solve the cognition part of the equation.
Perer Addison says:

May 5, 2020 at 4:36 pm

… 11:11 alfa go… THERE IS NO ARTIFICIAL INTELLIGENCE…
It either IS, or IT, is NOT, INTELLIGENT. The pursuit for A.I. is a static-flux, and the ultimate example and final proof of human foolishness and folly. AND, like CERN, the expenditure and squanderance of energy ALONE is exquisite, and will culminate in the creation of a NON- ENTITY that will ONLY prove the existence of GOD …
Maximiliano Contreras says:

May 5, 2020 at 8:58 pm

Miguel Nicolelis: https://www.google.com/search?client=firefox-b-e&q=most+important+neuroscientist
Jonathan Cui says:

May 6, 2020 at 4:06 am

Could quantum computers possibly make a better exploration of the game tree (in superpositions) and hence become a better AI?
Alexander Mackey says:

May 6, 2020 at 5:04 am

Shpongle goes well with these podcasts.
Gloom Berry says:

May 6, 2020 at 6:34 am

Do you sleep in a tuxedo?
Ken Brunet says:

May 6, 2020 at 7:54 am

Super interesting. The concepts he's talking about and tackling are fascinating. However, if you pull back and look at these ideas from a big-picture perspective, the implications and consequences of building systems that mimic human intelligence but have many times the processing power are pretty scary to me… Humans are a scary species as it is. It feels like they're going to build much scarier versions of humans without eliminating the monumental flaws we inherently have.
Isak Rathe Støre says:

May 6, 2020 at 11:27 am

Awesome interview. I started jumping around with excitement. I get so eager to learn more!
StatsGod says:

May 7, 2020 at 5:05 am

How can he be sure AlphaGo was having "moments of delusion" and not using insanely creative strategies that humans couldn't grasp…
Pookah says:

May 7, 2020 at 4:21 pm

So, Lex, can you use AI to apply the concept of sortition with people to reorganize the political landscape of the U.S.? I'm referring to your conversations with Eric Weinstein about running for president… we don't need a president, we need …
Y H says:

May 7, 2020 at 11:36 pm

1:03:50 AlphaGo vs Lee Sedol was just a glorified fuzz testing session.
Internet User says:

May 10, 2020 at 1:20 am

thank you for this interview
Philippe Larcher says:

May 10, 2020 at 4:55 am

1:34:00 Humans learned new, more general maxima by watching the AI play.
Why not try this in martial arts? :p (kind of like the science-based martial art in the movie Equilibrium)

All martial arts were developed incrementally, and it's hard to experiment with new global maxima without getting beaten by fighters well trained in the common way.

I've seen connected ideas (without the AI part of course) in several places:
– Joe Rogan about the way to use taekwondo in MMA
– the mode of practice of Systema
– the mode of practice of Shinbukan
Dizbee FPV Dizbelief Dizzy says:

May 11, 2020 at 12:14 pm

Very enlightening, thanks.
Rahul Sagar says:

May 13, 2020 at 11:11 am

Hey man, awesome interviews! You seem to be a really good person. Thank you for what you are doing.
Eisenwerks says:

May 14, 2020 at 1:33 pm

Harold Finch as a younger man, before building the Machine
paul clayton says:

May 15, 2020 at 11:33 am

An idiot savant. This tech will facilitate the singularity, but before we get there it will be misused by politicians to enable the death of privacy, democracy and human freedom on a scale that Silver is unable to understand.
Mrbigolnuts says:

May 16, 2020 at 1:54 am

and here I was thinking 'self play' is what I do in bed every night, boy was I wrong!!
Sergey Polishchuk says:

May 16, 2020 at 4:01 pm

Can this Alpha Zero figure out how to deal with coronavirus?
Sriram Ramanathan says:

May 17, 2020 at 5:01 am

Answer to 1:40:51 – Meaning of life is deep and profound with many layers!
William Korcari says:

May 23, 2020 at 6:00 am

This talk is so inspiring.
David K. says:

May 23, 2020 at 3:03 pm

I would be interested in how you can use reinforcement learning for real-world problems without "playing through" millions of variations, and where not all significant parameters are known and the outcome is therefore not 100% replicable. In that respect, I think the Atari games are no different from Go.

I also wouldn't say finding a new placement of a Go stone is "creativity" … as the goal was to beat humans, obviously there was a high likelihood it would find new moves that humans hadn't considered much before. In my opinion, creativity is not when a computer finds an optimal solution to a given, specific game. Creativity is when a computer starts to create a new game with new rules without being told to do it.
