David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning | AI Podcast #86 with Lex Fridman



Lex Fridman

David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar, and behind MuZero and a lot of other important work in reinforcement learning. This conversation is part of the Artificial Intelligence podcast.

Support this podcast by signing up with these sponsors:
– MasterClass: https://masterclass.com/lex
– Cash App – use code “LexPodcast” and download:
– Cash App (App Store): https://apple.co/2sPrUHe
– Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

INFO:
Podcast website:
https://lexfridman.com/ai
Apple Podcasts:
https://apple.co/2lwqZIr
Spotify:
https://spoti.fi/2nEwCF8
RSS:
https://lexfridman.com/category/ai/feed/
Full episodes playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
Clips playlist:
https://www.youtube.com/playlist?list=PLrAXtmErZgOeciFP3CBCIEElOJeitOr41

OUTLINE:
0:00 – Introduction
4:09 – First program
11:11 – AlphaGo
21:42 – Rules of the game of Go
25:37 – Reinforcement learning: personal journey
30:15 – What is reinforcement learning?
43:51 – AlphaGo (continued)
53:40 – Supervised learning and self play in AlphaGo
1:06:12 – Lee Sedol's retirement from Go
1:08:57 – Garry Kasparov
1:14:10 – AlphaZero and self play
1:31:29 – Creativity in AlphaZero
1:35:21 – AlphaZero applications
1:37:59 – Reward functions
1:40:51 – Meaning of life

CONNECT:
– Subscribe to this YouTube channel
– Twitter: https://twitter.com/lexfridman
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
– Medium: https://medium.com/@lexfridman
– Support on Patreon: https://www.patreon.com/lexfridman


25 thoughts on “David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning | AI Podcast #86 with Lex Fridman”
  1. I really enjoyed this conversation with David.

  2. I am struck by how small the audience is for this astonishing talk. It is so important that it should number in the millions, even billions.

  3. I dare say this was THE most interesting episode so far! Deep learning is solving perception big time but it seems to me that (deep) RL will solve the cognition part of the equation.

  4. 11:11 AlphaGo… THERE IS NO ARTIFICIAL INTELLIGENCE…
    It either IS, or it is NOT, INTELLIGENT. The pursuit of A.I. is a static-flux, and the ultimate example and final proof of human foolishness and folly. AND, like CERN, the expenditure and squandering of energy ALONE is exquisite, and will culminate in the creation of a NON-ENTITY that will ONLY prove the existence of GOD …

  5. Super interesting. The concepts he's talking about and tackling are fascinating. However, if you pull back and look at these ideas from a big-picture perspective, all the implications and consequences of building systems that will mimic human intelligence but will have many times the processing power are pretty scary to me… Humans are a scary species as it is. It feels like they're going to build much scarier versions of humans without eliminating the monumental flaws we inherently have.

  6. How can he be sure AlphaGo was having "moments of delusion" and not using insanely creative strategies that humans couldn't grasp…

  7. So, Lex, can you use AI to apply the concept of sortition to people, to reorganize the political landscape of the U.S.? I'm referring to your conversations with Eric Weinstein about running for president… we don't need a president, we need …

  8. 1:34:00 humans learned new, more general maxima by looking at the AI playing.
    Why not try this in martial arts? :p (kind of like the science-based martial art in the movie Equilibrium)

    All martial arts were developed incrementally, and it's hard to experiment with new global maxima without getting beaten by fighters well trained in the common way.

    I've seen connected ideas (without the AI part of course) in several places:
    – Joe Rogan about the way to use taekwondo in MMA
    – the mode of practice of Systema
    – the mode of practice of Shinbukan

  9. An idiot savant. This tech will facilitate the singularity, but before we get there it will be misused by politicians to enable the death of privacy, democracy and human freedom on a scale that Silver is unable to understand.

  10. I would be interested in how you can use reinforcement learning for real-world problems without "playing through" millions of variations, where not all significant parameters are known and the outcome is therefore not 100% replicable. So I think the Atari game is no different from the Go game.

    I also wouldn't say finding a new placement of a Go stone is "creativity" … as the goal was to beat humans, there was obviously a high likelihood it would find new moves that humans hadn't considered much before. In my opinion, creativity is not when a computer finds an optimal solution to a given, specific game. Creativity is when a computer starts to create a new game with new rules without being told to do it.
