Lex Fridman
David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar and MuZero, and has done a lot of important work in reinforcement learning. This conversation is part of the Artificial Intelligence podcast.
Support this podcast by signing up with these sponsors:
– MasterClass: https://masterclass.com/lex
– Cash App – use code “LexPodcast” and download:
– Cash App (App Store): https://apple.co/2sPrUHe
– Cash App (Google Play): https://bit.ly/2MlvP5w
EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG
INFO:
Podcast website:
https://lexfridman.com/ai
Apple Podcasts:
https://apple.co/2lwqZIr
Spotify:
https://spoti.fi/2nEwCF8
RSS:
https://lexfridman.com/category/ai/feed/
Full episodes playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41
OUTLINE:
0:00 – Introduction
4:09 – First program
11:11 – AlphaGo
21:42 – Rule of the game of Go
25:37 – Reinforcement learning: personal journey
30:15 – What is reinforcement learning?
43:51 – AlphaGo (continued)
53:40 – Supervised learning and self play in AlphaGo
1:06:12 – Lee Sedol retirement from Go play
1:08:57 – Garry Kasparov
1:14:10 – Alpha Zero and self play
1:31:29 – Creativity in AlphaZero
1:35:21 – AlphaZero applications
1:37:59 – Reward functions
1:40:51 – Meaning of life
CONNECT:
– Subscribe to this YouTube channel
– Twitter: https://twitter.com/lexfridman
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
– Medium: https://medium.com/@lexfridman
– Support on Patreon: https://www.patreon.com/lexfridman
I really enjoyed this conversation with David.
We are seeing the tip of the spear here, folks. Be vigilant.
I am struck by how small the audience is for this astonishing talk. It is so important that it should number in the millions, even billions.
I dare say this was THE most interesting episode so far! Deep learning is solving perception big time but it seems to me that (deep) RL will solve the cognition part of the equation.
… 11:11 alfa go… THERE IS NO ARTIFICIAL INTELLIGENCE…
It either IS, or IT, is NOT, INTELLIGENT. The pursuit for A.I. is a static-flux, and the ultimate example and final proof of human foolishness and folly. AND, like CERN, the expenditure and squanderance of energy ALONE is exquisite, and will culminate in the creation of a NON- ENTITY that will ONLY prove the existence of GOD …
Miguel Nicolelis: https://www.google.com/search?client=firefox-b-e&q=most+important+neuroscientist
Could quantum computers possibly make a better exploration of the game tree (in superpositions) and hence become a better AI?
Shpongle goes well with these podcasts.
Do you sleep in a tuxedo?
Super interesting. The concepts he's talking about and tackling are fascinating. However, if you pull back and look at these ideas from a big-picture perspective, the implications and consequences of building systems that will mimic human intelligence but have many times the processing power are pretty scary to me… Humans are a scary species as it is. Feels like they're going to build much scarier versions of humans without eliminating the monumental flaws we inherently have.
Awesome interview. I start jumping around with excitement. Get so eager to learn more!
How can he be sure AlphaGo was having "moments of delusion" and not using insanely creative strategies that humans couldn't grasp…
So, Lex, can you in AI apply the concept of sortition with people to reorganize the political landscape of the U.S.? I'm referring to your conversations with Eric Weinstein about running for president…we don't need a president we need …
1:03:50 AlphaGo vs Lee Sedol was just a glorified fuzz testing session.
thank you for this interview
1:34:00 Humans learned new, more general maxima by looking at the AI playing.
Why not try this in martial arts? :p (kind of the science based martial art in the movie Equilibrium)
All martial arts were developed incrementally, and it's hard to experiment with new global maxima without getting beaten by fighters well trained in the common way.
I've seen connected ideas (without the AI part of course) in several places:
– Joe Rogan about the way to use taekwondo in MMA
– the mode of practice of Systema
– the mode of practice of Shinbukan
Very enlightening, thanks.
Hey man, awesome interviews! You seem to be a really good person. Thank you for what you are doing.
Harold Finch as a younger man, before building the Machine
An idiot savant. This tech will facilitate the singularity, but before we get there it will be misused by politicians to enable the death of privacy, democracy and human freedom on a scale that Silver is unable to understand.
and here I was thinking 'self play' is what I do in bed every night, boy was I wrong!!
Can this Alpha Zero figure out how to deal with coronavirus?
Answer to 1:40:51 – Meaning of life is deep and profound with many layers!
This talk is so inspiring.
I would be interested in how you can use reinforcement learning for real-world problems without "playing through" millions of variations, and where not all significant parameters are known and therefore the outcome is not 100% replicable. So the Atari game is no different from the Go game, I think.
I also wouldn't say finding a new placement of a Go stone is "creativity" … as the goal was to beat humans, obviously there was a high likelihood it would find new moves that humans hadn't considered much before. In my opinion, creativity is not when a computer finds an optimal solution to a given, specific game. Creativity is when a computer starts to create a new game with new rules without being told to do it.