James Briggs
Easy natural language generation with Transformers and PyTorch. We apply OpenAI’s GPT-2 model to generate text in just a few lines of Python code.
Language generation is one of those natural language tasks that can really produce an incredible feeling of awe at how far the fields of machine learning and artificial intelligence have come.
GPT-1, 2, and 3 are OpenAI’s top language models — well known for their ability to produce incredibly natural, coherent, and genuinely interesting language.
In this article, we will take a small snippet of text and learn how to feed that into a pre-trained GPT-2 model using PyTorch and Transformers to produce high-quality language generation in just eight lines of code. We cover:
PyTorch and Transformers
– Data
Building the Model
– Initialization
– Tokenization
– Generation
– Decoding
Results
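For quick reference, here is a minimal sketch of those steps, assuming the standard 'gpt2' checkpoint from Hugging Face; the prompt, max_length, and sampling settings below are placeholders rather than the exact values used in the video:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Initialization: load the pre-trained GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Tokenization: encode the prompt as a PyTorch tensor of token IDs
inputs = tokenizer.encode('Machine learning is', return_tensors='pt')

# Generation: sample a continuation (prompt + new tokens, capped at max_length)
outputs = model.generate(inputs, max_length=100, do_sample=True)

# Decoding: turn the generated token IDs back into text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))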
Medium Article:
https://towardsdatascience.com/text-generation-with-python-and-gpt-2-1fecbff1635b
Friend Link (free access):
https://towardsdatascience.com/text-generation-with-python-and-gpt-2-1fecbff1635b?sk=930367d835f15abb4ef3164f7791e1b1
Thumbnail background by gustavo centurion on Unsplash
https://unsplash.com/photos/O6fs4ablxw8
So cool James!
Can you please do a complete tutorial from start to finish? Like how to set it all up and those kinds of things, because I'm new to this. Thanks!
Do you think it’s possible to have text generation based on intents given to the system? I’ve got a speech to text, intent recognition, intent handler, and text to speech system going. The downside is that the text to speech service is just saying things I’ve coded it to say. Curious if/how GPT2 could generate text for the TTS service to say.
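One rough way to do this (a sketch, not something covered in the video): treat the recognized intent and its slots as a text prompt and let GPT-2 write the continuation, then hand that string to the TTS service. The prompt template, intent names, and respond_to_intent helper below are made up for illustration, and an off-the-shelf GPT-2 will ramble; fine-tuning on in-domain responses would be needed for anything reliable:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

def respond_to_intent(intent, slots):
    # Hypothetical template: turn the intent + slots into a prompt for GPT-2 to continue
    details = ', '.join(f'{k}={v}' for k, v in slots.items())
    prompt = f"The assistant was asked about {intent} ({details}). The assistant replied:"
    inputs = tokenizer.encode(prompt, return_tensors='pt')
    outputs = model.generate(inputs, max_length=60, do_sample=True, top_k=50,
                             pad_token_id=tokenizer.eos_token_id)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return text[len(prompt):].strip()  # keep only the generated continuation for the TTS

print(respond_to_intent('weather', {'city': 'Berlin'}))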
Would you please show how to deploy this on GPU?
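Not covered in the video, but here is a minimal sketch of the same pipeline on a GPU, assuming a CUDA-enabled PyTorch install: move both the model and the input tensor to the device before calling generate.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2').to(device)  # move the weights to the GPU

# inputs must live on the same device as the model
inputs = tokenizer.encode('Machine learning is', return_tensors='pt').to(device)
outputs = model.generate(inputs, max_length=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))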
Could you make a video about question answering using BERT? Thank you.
Hey, thank you! This is super awesome. I'm going to reference this video in a Medium post, as I'm blogging my Data Science learning process!
What in the heck is happening? I get "Downloading: 10% | 55.8M/548M 04:03<450:0145" etc., etc., until it slows to a crawl and gives the errors "an existing connection was forcibly closed by the host" and "Make sure that gpt2 is one of the models on huggingface or gpt2 is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt".
I used the PyTorch install guide you linked in your article to verify my pip install was good, and it checked out. Man, GPT-2 is kicking my butt.
What determines the upper limit of max_length? I tried 20000 but got "IndexError: index out of range in self". What's the highest value I can use?
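For anyone else hitting this: GPT-2's learned position embeddings only cover 1024 tokens, so max_length (which counts the prompt plus everything generated) cannot go above 1024; asking for 20000 indexes past the position-embedding table, which is what raises that IndexError. A quick way to check the limit on the loaded model:

from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained('gpt2')
print(model.config.n_positions)  # 1024: the hard cap on prompt + generated tokens for GPT-2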