Henry AI Labs
This video explores the exciting new 6.8 billion parameter ImageGPT model! The researchers show that better and larger generative models learn better representations for tasks like ImageNet classification!
Thanks for watching! Please Subscribe!
Paper Links:
ImageGPT (Blog Post): https://openai.com/blog/image-gpt/
ImageGPT (Paper): https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf
A Survey of Long-term Context in Transformers: https://www.pragmatic.ml/a-survey-of-methods-for-incorporating-long-term-context/
Google TPUs: https://cloud.google.com/tpu/docs/tpus
The Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/
PixelCNN: https://keras.io/examples/generative/pixelcnn/
PixelCNN (Paper): https://arxiv.org/pdf/1606.05328.pdf
Contrastive Predictive Coding: https://arxiv.org/pdf/1905.09272.pdf
Big BiGAN: https://arxiv.org/pdf/1907.02544.pdf
BERT: https://arxiv.org/pdf/1810.04805.pdf
Rethinking Pre-training and Self-Training: https://arxiv.org/pdf/2006.06882.pdf
wanna collab?
2:18 Auto-Regressive modeling of Pixels
4:18 Denoising Autoencoders: AR and BERT
5:40 GPT Architecture, No CNN Prior!
7:00 6.8 BILLION parameters!! Comparison with SimCLR, CPC, BigBiGAN
8:24 Generative Models and Representation Learning for Vision
10:30 Fine-Tuning with Linear Probes
11:50 Working around Quadratic Complexity of Self-Attention
12:50 Context Reduction
13:52 Results and Ablations
18:50 Promise of Longer Context Transformers and Visual Representation Learning
Awesome video!
Awesome content! Thanks!
😩 too awesome i can't even process
Good job!
Great job. We need Colab tutorials.
That ImageGPT result is crazy. It seems that you can replace inductive biases (translation invariance via convolutions) with just more data and compute.
Yannic Kilcher sent me here. Good channel. Subbed!
Awesome stuff. Have to watch it a couple times to wrap my head around it.
Can you use plain English please? It still sounds complex for beginners.
👏👏👏👏👏👏👏👍👌