OpenAI's DALL-E explained. How GPT-3 creates images from descriptions.

AI Coffee Break with Letitia

How can GPT-3 create an avocado armchair? Have a look at DALL·E, OpenAI’s amazing new text-to-image generator. This video gives a high-level explanation of how it works and why it can be this good.

📄 DALL-E blog, not a paper (yet): https://openai.com/blog/dall-e/ Play around with many input combinations! This is impressive.

📺 Ms. Coffee Bean’s GPT-3 video: https://youtu.be/5fqxPOaaqi0

Outline:
* 00:00 DALL-E is here
* 02:26 How can it work?
* 04:00 Why does it work?
* 05:36 OpenAI is showing off 😉
* 08:25 Multimodality

📄 Image-GPT: Chen, M., Radford, A., Child, R., Wu, J., Jun, H., Luan, D., & Sutskever, I. (2020, November). Generative pretraining from pixels. In International Conference on Machine Learning (pp. 1691-1703). PMLR. http://proceedings.mlr.press/v119/chen20s/chen20s.pdf

📄 StackGAN++: Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., & Metaxas, D. N. (2018). StackGAN++: Realistic image synthesis with stacked generative adversarial networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8), 1947-1962. https://arxiv.org/pdf/1710.10916v3.pdf

📄 StyleGAN2: Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8110-8119). https://arxiv.org/pdf/1912.04958.pdf

🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #MsCoffeeBean #OpenAI #DALL-E #MachineLearning #AI #research

Source: https://www.youtube.com/watch?v=mvG2FGF0TvM