Lex Fridman
This is a talk by Ilya Sutskever for course 6.S099: Artificial General Intelligence. He is a co-founder of OpenAI. This class is free and open to everyone. Our goal is to take an engineering approach to exploring possible paths toward building human-level intelligence for a better world.
OUTLINE:
0:00 – Introduction
0:55 – Talk
43:04 – Q&A
INFO:
Course website: https://agi.mit.edu
Contact: agi@mit.edu
Playlist: http://bit.ly/2EcbaKf
CONNECT:
– AI Podcast: https://lexfridman.com/ai/
– Subscribe to this YouTube channel
– LinkedIn: https://www.linkedin.com/in/lexfridman
– Twitter: https://twitter.com/lexfridman
– Facebook: https://www.facebook.com/lexfridman
– Instagram: https://www.instagram.com/lexfridman
– Slack: https://deep-mit-slack.herokuapp.com
Favourite so far.
All the best people in AI are on your course!
this is gold
I appreciate the reminder that digital representations of ANN are really digital circuits.
This was a-m-a-z-i-n-g
Great talk! Thanks for posting.
Pete Sampras of AI.
Try A* SOM
Nice
Wow, really cool and summarized in a profound, compact way! Thanks for the talk and for sharing this online.
Your Dota bot got fucked up by normal players.
This is great, thanks a lot.
This should have way more views. Grand talk.
Not all heroes wear capes... Ilya is one of the most underrated thinkers in AI right now.
Very true and insightful... we reward ourselves; the environment doesn't.
Awesome! So many topics clearly and concisely explained.
off policy learning 100 likes
Great video!
Joffrey, is that you?
At 44:00 he says backpropagation solves circuit search. What problem is he talking about? Does anyone have references on this backpropagation and circuit search idea?
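My reading of that part of the talk: a neural network is a small parallel circuit, and supervised deep learning amounts to searching the space of circuits (weight settings) for one that fits the data, with backpropagation plus gradient descent as the search procedure. A minimal sketch of that framing, as my own illustration rather than code from the talk (the XOR task, layer sizes, and step counts are arbitrary choices):

```python
# Backprop as "circuit search" (illustrative sketch): gradient descent searches
# the weights of a tiny two-layer network until it computes XOR.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # XOR inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)              # hidden "gates"
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)              # output unit

for step in range(5000):
    h = np.tanh(X @ W1 + b1)                 # forward pass through the circuit
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))     # sigmoid output
    # backward pass: gradient of the squared error w.r.t. every weight ("wire")
    dp = (p - y) * p * (1 - p)
    dW2 = h.T @ dp;  db2 = dp.sum(0)
    dh = dp @ W2.T * (1 - h ** 2)
    dW1 = X.T @ dh;  db1 = dh.sum(0)
    # one gradient step = one move in the search over circuits
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.5 * grad

print(np.round(p, 2))  # should be close to [[0], [1], [1], [0]]
```

Nothing here is hand-designed for XOR; the gradient steps simply find a setting of the circuit that computes it, which is, as I understand it, the sense in which backprop "solves" circuit search for small circuits.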
Some good insight on DL from Ilya!
Good talk, elaborating on some home truths.
thanks for the vid 🙂
Thanx guys! Great presentation!
"The only real reward is existence and non-existence. Everything else is a corollary of that". Damn. That's deep.
Reaching to learn…?
8:25 "And there is only one real true reward in life, and this is existence or non-existence, and everything else is a corollary of that." OK, that was _deep_. I would say surviving is a shared necessary condition that has many implications and that it could lead to a new era of better politics, if it got the attention it deserves. And I would not say that everything else is "a corollary", but I agree to a good extent. The video is awesome, it is just that this point may be the most important, although it is one not strongly related to machine learning.
I don't understand how 'the shortest solution' can even be considered; it seems nonsensical. Symbols mean what we define them to mean. You could define the letter x as the shortest solution to a problem, and then the size of your program would be one byte.
woooooooooooooooooowwwwww
The best talk related to AGI I have seen so far.
Self-play reminds me of the dual-simplex algorithm.
Nice talk, but I am a bit disappointed by the speculations on strong AI. In particular, the slides at 39:30 (taken from somewhere else) are incredibly misleading. I know it is supposed to be funny, but it is still a mistake to show that.
Man, this guy is a fucking genius! Somehow Elon is like a black hole, attracting the smartest people on Earth to gravitate around him.
Usually I regret watching the Q&A part of talks, but this one was excellent.
Very insightful, and it breaks things down into terms even I can grasp. Thank you for this amazing video.
THIS IS WHAT I ALWAYS WANTED! I never knew something like this existed and thought that people simply didn't work on it or it didn't exist, but it's actually real! META-LEARNING! I always thought I would have to try learning how to achieve this myself after learning all the required math, but other people have already worked on it! This is really inspiring. I really hope we'll be able to achieve artificial general intelligence with improvements in this field.
On the Q&A question on backprop and biological plausibility of DL. This talk at ICLR 2018 by Blake Richards was on this topic, and very interesting. TLDW there are viable credit assignment alternatives to backprop that are more biologically plausible https://www.youtube.com/watch?v=YUVLgccVi54
The talk is amazing! I just heard zombie sounds from the humanoid figures playing soccer. Or is it just my imagination?
Thanks.
We should always account for the fact that all results and "emergent behavior" (i.e. learnt, not programmed) so far are results of computation, not intelligence. In other words, what we see are at best automated simulations of behaviors expected by humans, performed by some human-designed system. Even though the results are surprising and some are truly amazing, there is nothing like consciousness, self-awareness, creativity, the ability to abstract and reason, logic, or the ability to self-motivate, all of which are aspects of human intelligence. The field should be called Automated Learning or Advanced Problem Optimization. To use the term A.I. is really a misnomer and communicates unrealistic expectations.
Thank you for sharing such good resources!
Thanks for sharing!
The best ever intro to AI
Thank you so much for posting these videos. Really appreciate how MIT has a long tradition of sharing and disseminating knowledge.
I want to sit in that lecture hall
"Computers will have an advantage in every domain." – have to ask, I imagine you mean every well defined physical domain that can be explained by immediate sensory input, right? Almost all of what we have created in recent decades has been layer upon layer of abstraction that extends far beyond our immediate physical presence. Almost certainly that trend will continue, and humans will master the abstractions that they are forced to specify to machines.
What does he mean by a "small circuit"?
Theory:
0:00 introduction & supervised learning (using neural nets/deep learning)
6:45 reinforcement learning (model-free (2 types) => 1. policy gradients 2. Q-learning based)
12:55 meta-learning (learning to learn)
Applications:
16:00 HER (hindsight experience replay) algorithm (learning from failures; see the sketch after this comment)
21:40 Sim2Real using meta-learning (train a policy that can adapt to different simulation params => quickly adapts to the real world)
25:30 Learning a hierarchy of actions with meta-learning
28:20 Limitation of meta-learning => assumption: training distribution == test distribution
29:40 self-play technique (TD-Gammon, AlphaGo Zero, Dota 2 bot)
37:00 can we train AGI using self-play?
39:35 learning from human feedback / conveying goals to agents (simulated leg doing a backflip example)
Questions:
43:00 Does the human brain use backprop?
45:15 Dota bot question
47:22 standard deviation (maximize expected reward vs minimize std dev)
48:27 cooperation as motivation for the agents?
49:40 could open complexity-theoretic problems help AI?
51:20 what are the most productive research trajectories towards generative language models?
53:30 do you work on evolution strategies (for solving RL problems) at OpenAI?
54:25 could you elaborate on "right goal is a political problem"?
55:42 do we need a really good model of the physical world in order to have real-world capable agents?
57:18 solving the problem of self-organization?
58:45 follow up: self-organization in a non-competitive environment?
my observation:
42:30 It seems to me that the most difficult problem we will face will be communicating the "right" goals to the AI effectively, in a way that lets us somewhat predict its future behaviour, or, better said, its worst-case behaviour (safety implications). After all, we don't want HAL 9000 types of AI 🙂
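To make the HER item at 16:00 above concrete, here is a minimal sketch of its goal-relabeling idea as I understand it (the function and transition format are hypothetical, not OpenAI's code): a failed episode is replayed as if the state the agent actually reached had been the goal, so a sparse reward still produces a learning signal.

```python
# Hindsight Experience Replay, "final"-goal relabeling (illustrative sketch).
def her_relabel(episode):
    """episode: list of (state, action, reward, next_state, goal) tuples,
    with states/goals as simple comparable values (e.g. tuples of bits).
    Returns extra transitions in which the original goal is replaced by the
    state the agent actually reached, so a failed episode still earns reward."""
    achieved_goal = episode[-1][3]  # where the agent actually ended up
    relabeled = []
    for state, action, _, next_state, _ in episode:
        reward = 0.0 if next_state == achieved_goal else -1.0  # sparse reward w.r.t. new goal
        relabeled.append((state, action, reward, next_state, achieved_goal))
    return relabeled

# Example: a 3-step episode that never reached the original goal (1, 1, 1)
episode = [((0, 0, 0), 0, -1.0, (1, 0, 0), (1, 1, 1)),
           ((1, 0, 0), 1, -1.0, (1, 1, 0), (1, 1, 1)),
           ((1, 1, 0), 0, -1.0, (0, 1, 0), (1, 1, 1))]
print(her_relabel(episode)[-1])  # the last relabeled step now gets reward 0.0
```

The relabeled transitions would go into the same replay buffer as the originals and be trained on by any off-policy method (e.g. DQN or DDPG).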
Seems a bit like Elon.