Videos

MIT 6.S091: Introduction to Deep Reinforcement Learning (Deep RL)



Lex Fridman

First lecture of MIT course 6.S091: Deep Reinforcement Learning, introducing the fascinating field of Deep RL. For more lecture videos on deep learning, reinforcement learning (RL), artificial intelligence (AI & AGI), and podcast conversations, visit our website or follow TensorFlow code tutorials on our GitHub repo.

INFO:
Website: https://deeplearning.mit.edu
GitHub: https://github.com/lexfridman/mit-deep-learning
Slides: http://bit.ly/2HtcoHV
Playlist: http://bit.ly/deep-learning-playlist

OUTLINE:
0:00 – Introduction
2:14 – Types of learning
6:35 – Reinforcement learning in humans
8:22 – What can be learned from data?
12:15 – Reinforcement learning framework
14:06 – Challenge for RL in real-world applications
15:40 – Component of an RL agent
17:42 – Example: robot in a room
23:05 – AI safety and unintended consequences
26:21 – Examples of RL systems
29:52 – Takeaways for real-world impact
31:25 – 3 types of RL: model-based, value-based, policy-based
35:28 – Q-learning
38:40 – Deep Q-Networks (DQN)
48:00 – Policy Gradient (PG)
50:36 – Advantage Actor-Critic (A2C & A3C)
52:52 – Deep Deterministic Policy Gradient (DDPG)
54:12 – Policy Optimization (TRPO and PPO)
56:03 – AlphaZero
1:00:50 – Deep RL in real-world applications
1:03:09 – Closing the RL simulation gap
1:04:44 – Next step in Deep RL
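The outline above covers Q-learning (35:28) before moving to Deep Q-Networks. As a concrete companion to that segment, below is a minimal tabular Q-learning sketch on a toy 5-state corridor. The environment, state count, and hyperparameters here are invented for illustration and are not taken from the lecture itself.

```python
import random

def q_learning(env_step, n_states, n_actions, episodes=300,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning with the standard update:
    Q[s][a] += alpha * (r + gamma * max_a' Q[s'][a'] - Q[s][a])."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda act: Q[s][act])
            s2, r, done = env_step(s, a)
            # bootstrap from the next state's best action (0 at terminal)
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

def corridor_step(s, a):
    """Toy 5-state corridor: action 0 moves left, 1 moves right;
    reward 1.0 on reaching the rightmost state, which is terminal."""
    s2 = max(0, s - 1) if a == 0 else min(4, s + 1)
    reward = 1.0 if s2 == 4 else 0.0
    return s2, reward, s2 == 4

random.seed(0)
Q = q_learning(corridor_step, n_states=5, n_actions=2)
# the greedy policy should learn to move right in every non-terminal state
policy = [max(range(2), key=lambda act: Q[s][act]) for s in range(4)]
```

The same value table is what a DQN (38:40 in the outline) replaces with a neural network when the state space is too large to enumerate.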

CONNECT:
– If you enjoyed this video, please subscribe to this channel.
– Twitter: https://twitter.com/lexfridman
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman

23 thoughts on “MIT 6.S091: Introduction to Deep Reinforcement Learning (Deep RL)”
  1. Another detail I have noticed in many presentations: those agents are not trying to model the environment, which is semantically impossible. What they are trying to do instead, I believe, is model an instance of a dual space associated with the environmental space. It is very common to use linear regressions, for instance.

  2. Much better than the Stanford university lecture, where the lecturer basically only reads the equations without giving any real intuition about what's going on.

  3. Hmm, is it easy to summarise, or to point to, what you find valuable about Nietzsche's writings? My impression is that, while I might agree with him about some things, he doesn't contribute much that is both novel and helpful. And quotes like the following make him seem kind of… psychopathic?:

    > I abhor the man’s vulgarity when he says “What is right for one man is right for another”; “Do not to others that which you would not that they should do unto you.”. . . . The hypothesis here is ignoble to the last degree: it is taken for granted that there is some sort of equivalence in value between my actions and thine.

    > I do not point to the evil and pain of existence with the finger of reproach, but rather entertain the hope that life may one day become more evil and more full of suffering than it has ever been.

    > Man shall be trained for war and woman for the recreation of the warrior. All else is folly

    But I haven't read any of his books, only heard summaries and quotes, so perhaps I'm missing something or misunderstanding him somewhat.

    Below are some examples of texts that I personally would recommend:

    * https://nickbostrom.com/utopia.html
    * https://reducing-suffering.org/on-the-seriousness-of-suffering/
    * https://wiki.lesswrong.com/wiki/Coherent_Extrapolated_Volition

  4. The funniest part is where he tries to explain the ability of human brains by evolution at 6:33! And he literally says, "it is somehow being encoded", which contradicts the rewards concept he is introducing!
    Son, the most logical explanation for a predefined encoding scheme that has never been trained is the existence of a creator!

  5. I have tried to study and understand Deep RL using several books and lectures over the last few years, but I only feel like I understood something in RL after listening to this lecture. Thanks, Lex. I am grateful to you for posting this lecture on YouTube. Thank you!

  6. Professor Lex, can we get the entirety of 6.S091 on MIT OCW? This is an incredibly interesting topic that I've been working on (Evolutionary Computing), and I am currently enrolled in a project that requires thorough knowledge of Deep RL. This research field has very few online resources besides Stanford's CS 234 and Berkeley's CS 285.

    Your explanations are immensely helpful and intuitive. Humanity will present its gratitude if this whole course is made available! AGI and AI safety issues need more attention before they become the greatest immediate existential risk; your courses can help raise general AI awareness and advance our civilization to higher dimensions. Loved the fact that you grinned while just casually mentioning the Simulation Hypothesis.

