Lex Fridman
First lecture of MIT course 6.S091: Deep Reinforcement Learning, introducing the fascinating field of Deep RL. For more lecture videos on deep learning, reinforcement learning (RL), artificial intelligence (AI & AGI), and podcast conversations, visit our website or follow TensorFlow code tutorials on our GitHub repo.
INFO:
Website: https://deeplearning.mit.edu
GitHub: https://github.com/lexfridman/mit-deep-learning
Slides: http://bit.ly/2HtcoHV
Playlist: http://bit.ly/deep-learning-playlist
OUTLINE:
0:00 – Introduction
2:14 – Types of learning
6:35 – Reinforcement learning in humans
8:22 – What can be learned from data?
12:15 – Reinforcement learning framework
14:06 – Challenge for RL in real-world applications
15:40 – Components of an RL agent
17:42 – Example: robot in a room
23:05 – AI safety and unintended consequences
26:21 – Examples of RL systems
29:52 – Takeaways for real-world impact
31:25 – 3 types of RL: model-based, value-based, policy-based
35:28 – Q-learning
38:40 – Deep Q-Networks (DQN)
48:00 – Policy Gradient (PG)
50:36 – Advantage Actor-Critic (A2C & A3C)
52:52 – Deep Deterministic Policy Gradient (DDPG)
54:12 – Policy Optimization (TRPO and PPO)
56:03 – AlphaZero
1:00:50 – Deep RL in real-world applications
1:03:09 – Closing the RL simulation gap
1:04:44 – Next step in Deep RL
CONNECT:
– If you enjoyed this video, please subscribe to this channel.
– Twitter: https://twitter.com/lexfridman
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
Another detail I have noticed in many presentations: those agents are not trying to model the environment, which is semantically impossible. What they are trying to do instead, I believe, is to model AN INSTANCE OF A DUAL SPACE associated with the environment's space. It is very common to use linear regression, for instance.
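A minimal sketch of what "linear regression" can look like inside an RL agent, assuming the comment means linear value-function approximation: the agent fits a linear estimate V(s) ≈ w·φ(s) over state features rather than a model of the environment itself. All names and numbers below are illustrative, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear value-function approximation: V(s) ~ w . phi(s).
# The agent never models the environment itself; it fits a
# linear function of state features.
n_features = 4
w = np.zeros(n_features)
alpha, gamma = 0.1, 0.99  # step size and discount factor

def td0_update(phi_s, reward, phi_s_next, w):
    """One TD(0) update step on the linear value estimate."""
    td_error = reward + gamma * phi_s_next @ w - phi_s @ w
    return w + alpha * td_error * phi_s

# One hypothetical transition with made-up features and reward.
phi_s, phi_s_next = rng.random(n_features), rng.random(n_features)
w = td0_update(phi_s, reward=1.0, phi_s_next=phi_s_next, w=w)
print(w)
```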
Wonderful lecture.
Remove the human factor. Have the traffic be free of human crossings.
@50:06 DQN can't learn stochastic policies. DQN has a softmax output on actions… isn't that a stochastic policy in itself?
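For context, vanilla DQN (Mnih et al., 2015) does not have a softmax head: the network outputs raw Q-values, the learned policy is the deterministic argmax over them, and stochasticity enters only through epsilon-greedy exploration. Applying a softmax to Q-values (Boltzmann exploration) is a separate heuristic, not a learned stochastic policy. A minimal sketch of the standard action selection, with illustrative names and values:

```python
import numpy as np

rng = np.random.default_rng(0)

def select_action(q_values, epsilon=0.1):
    """Vanilla DQN action selection: the network outputs raw Q-values,
    the policy is the deterministic argmax, and randomness comes only
    from epsilon-greedy exploration (not from a softmax head)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: uniform random
    return int(np.argmax(q_values))              # exploit: greedy, deterministic

# Hypothetical Q-values for a 4-action environment.
q = np.array([0.2, 1.5, -0.3, 0.9])
print(select_action(q))  # almost always 1, the argmax
```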
Seriously the best Deep RL lecture out there to date.
Thank you very much!
How/why can you even upload this for free? Doesn't university cost loads in the US?
Great stuff though!
Much better than the Stanford University lecture, where the lady basically only reads the equations without giving any real intuition for what's going on.
Hmm, is it easy to summarise or direct me to what you find valuable about Nietzsche's writings? My impression is that, while I might agree with him about some things, he doesn't contribute much that is both novel and helpful. And quotes like the following make him seem kind of... psychopathic?:
> I abhor the man’s vulgarity when he says “What is right for one man is right for another”; “Do not to others that which you would not that they should do unto you.”. . . . The hypothesis here is ignoble to the last degree: it is taken for granted that there is some sort of equivalence in value between my actions and thine.
> I do not point to the evil and pain of existence with the finger of reproach, but rather entertain the hope that life may one day become more evil and more full of suffering than it has ever been.
> Man shall be trained for war and woman for the recreation of the warrior. All else is folly
But I haven't read any of his books, only heard summaries and quotes, so perhaps I'm missing something or misunderstanding him somewhat.
Below are some examples of texts that I personally would recommend:
* https://nickbostrom.com/utopia.html
* https://reducing-suffering.org/on-the-seriousness-of-suffering/
* https://wiki.lesswrong.com/wiki/Coherent_Extrapolated_Volition
Which Nietzsche book is he recommending at 4:12?
Trump 2020
Brilliant!!
Lex is honestly a character from a Wes Anderson film.
THANK YOU MIT
Lex Fridman, I just love your videos. I am a great fan of yours, sir. Carry on.
Hi Lex, thanks for this great lecture! Which books of Nietzsche did you have in mind around 4:33?
The funniest part is where he tries to explain the abilities of the human brain by evolution at 6:33! And he literally says, "it is somehow being encoded", which contradicts the reward concept he is introducing!
Son, the most logical reason for having a predefined encoding scheme that has never been trained is the existence of a creator!
I have tried to study and understand Deep RL using several books and lectures over the last few years, but I only felt that I truly understood RL after listening to this lecture. Thanks, Lex. I am grateful to you for posting this lecture on YouTube. Thank you!
You are my idol, Lex.
I've seen a lot of these videos & read some of the books in ML; Lex has a clarity that's rare.
Super
Professor Lex, can we get the entirety of 6.S091 on MIT OCW? This is an incredibly interesting topic that I've been working on (Evolutionary Computing), and I am currently enrolled in a project with thorough knowledge of Deep RL as a prerequisite. This research field has very few online resources besides Stanford's CS 234 and Berkeley's CS 285.
Your explanations are immensely helpful and intuitive. Humanity will present its gratitude if this whole course is made available! AGI and AI safety issues need more attention before they become the greatest immediate existential risk; your courses can help raise general AI awareness and advance our civilization to higher dimensions. Loved the fact that you grinned while just casually mentioning the Simulation Hypothesis.
1:04:40 Best part: that grin after he just casually dropped that line in an MIT lecture... all of the infinite universes being simulations.