Image GPT: Generative Pretraining from Pixels (Paper Explained)



Yannic Kilcher

BERT and GPT-2/3 have shown the enormous power of using generative models as pre-training for classification tasks. However, for images, pre-training is usually done with supervised or self-supervised objectives. This paper investigates how far you can get when applying the principles from the world of NLP to the world of images.

OUTLINE:
0:00 – Intro & Overview
2:50 – Generative Models for Pretraining
4:50 – Pretraining for Visual Tasks
7:40 – Model Architecture
15:15 – Linear Probe Experiments
24:15 – Fine-Tuning Experiments
30:25 – Conclusion & Comments

Paper:
https://cdn.openai.com/papers/Generative_Pretraining_from_Pixels_V2.pdf
Blog: https://openai.com/blog/image-gpt/
Code: https://github.com/openai/image-gpt

Abstract:
Inspired by progress in unsupervised representation learning for natural language, we examine whether similar models can learn useful representations for images. We train a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure. Despite training on low-resolution ImageNet without labels, we find that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification. On CIFAR-10, we achieve 96.3% accuracy with a linear probe, outperforming a supervised Wide ResNet, and 99.0% accuracy with full fine-tuning, matching the top supervised pre-trained models. An even larger model trained on a mixture of ImageNet and web images is competitive with self-supervised benchmarks on ImageNet, achieving 72.0% top-1 accuracy on a linear probe of our features.
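
To make the abstract's core idea concrete, here is a minimal sketch (not the authors' code, see the repository linked above for that) of what "auto-regressively predict pixels without 2D structure" means: each low-resolution image is quantized into a sequence of discrete pixel tokens, flattened in raster order, and a decoder-style Transformer is trained with a causal mask to predict the next token. It assumes PyTorch; the class name ImageGPTSketch and all hyperparameters are illustrative placeholders, not values from the paper.

# Minimal sketch of autoregressive pixel prediction (illustrative only).
import torch
import torch.nn as nn

class ImageGPTSketch(nn.Module):
    def __init__(self, vocab_size=512, seq_len=32 * 32, d_model=256, n_layers=4, n_heads=8):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)               # one token per quantized pixel value
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))  # learned 1D positions, no 2D inductive bias
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)                     # logits over the next pixel token

    def forward(self, tokens):                                         # tokens: (batch, seq) integer pixel ids
        T = tokens.size(1)
        causal = nn.Transformer.generate_square_subsequent_mask(T).to(tokens.device)
        h = self.tok_emb(tokens) + self.pos_emb[:, :T]
        h = self.blocks(h, mask=causal)                                # causal mask makes it autoregressive
        return self.head(h)

# Training objective: predict pixel t+1 from pixels up to t, in raster order.
model = ImageGPTSketch()
imgs = torch.randint(0, 512, (2, 32 * 32))                            # fake quantized-pixel sequences
logits = model(imgs[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 512), imgs[:, 1:].reshape(-1))

Linear probing, as covered in the video, would then freeze a model trained this way and fit only a linear classifier on features taken from one of its intermediate layers; fine-tuning instead updates all weights on the downstream task.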

Authors: Mark Chen, Alec Radford, Rewon Child, Jeff Wu, Heewoo Jun, Prafulla Dhariwal, David Luan, Ilya Sutskever

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher