Prolego
GPT-3 is the biggest language model ever built, and it has been attracting a lot of attention. Rather than argue about whether GPT-3 is overhyped, we wanted to dig into the literature and understand what GPT-3 is (and is not) in light of its predecessors and alternative transformer models. In this video we share some of what we’ve learned. What is GPT-3 really good at? What are its constraints? How useful is it for business? Enjoy!
⏰ Time Stamps ⏰
00:40 – Comparison of latest Natural Language Processing Models
01:09 – What is a Transformer Model
01:50 – The Two Types of Transformer Models
02:15 – Difference between bi-directional encoders (BERT) and autoregressive decoders (GPT) – see the code sketch after the time stamps
04:40 – GPT-3 is HUGE, does size matter?
05:24 – Size of GPT-3 relative to BERT, RoBERTa, GPT-2, and T5
07:40 – What does GPT do and how is it different from the BERT family?
18:05 – Is GPT-3 a Child Prodigy or a Parlor Trick?
18:44 – Back to the Issue of GPT-3’s Size
19:30 – Final thoughts on GPT-3 vs BERT
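If you want to try the encoder-vs-decoder distinction from 02:15 yourself, here is a minimal Python sketch (not from the video) using the Hugging Face transformers library with the standard public checkpoints bert-base-uncased and gpt2:

from transformers import pipeline

# BERT is a bidirectional encoder: it fills in a masked token using
# context from BOTH sides of the blank.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("GPT-3 is the largest language [MASK] ever built.")[0]["token_str"])

# GPT-2 is an autoregressive decoder: it predicts the NEXT token from
# left context only, which is what enables open-ended text generation.
generate = pipeline("text-generation", model="gpt2")
print(generate("GPT-3 is the largest", max_new_tokens=10)[0]["generated_text"])

That API difference mirrors the point in the video: BERT-family models are typically fine-tuned for classification and extraction tasks, while GPT-family models are used to generate text.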