James Briggs
Easy natural language generation with Transformers and PyTorch. We apply OpenAI’s GPT-2 model to generate text in just a few lines of Python code.
Language generation is one of those natural language tasks that can really produce an incredible feeling of awe at how far the fields of machine learning and artificial intelligence have come.
GPT-1, 2, and 3 are OpenAI’s top language models — well known for their ability to produce incredibly natural, coherent, and genuinely interesting language.
In this article, we will take a small snippet of text and learn how to feed that into a pre-trained GPT-2 model using PyTorch and Transformers to produce high-quality language generation in just eight lines of code. We cover:
PyTorch and Transformers
– Data
Building the Model
– Initialization
– Tokenization
– Generation
– Decoding
Results
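For quick reference, here is a minimal sketch of those steps, assuming the standard 'gpt2' checkpoint from Hugging Face; the prompt, max_length, and sampling settings below are placeholders rather than the exact values used in the video:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Initialization: load the pre-trained GPT-2 tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Tokenization: encode the prompt as a PyTorch tensor of token IDs
inputs = tokenizer.encode('Machine learning is', return_tensors='pt')

# Generation: sample a continuation (prompt + new tokens, capped at max_length)
outputs = model.generate(inputs, max_length=100, do_sample=True)

# Decoding: turn the generated token IDs back into text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))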
Medium Article:
https://towardsdatascience.com/text-generation-with-python-and-gpt-2-1fecbff1635b
Friend Link (free access):
https://towardsdatascience.com/text-generation-with-python-and-gpt-2-1fecbff1635b?sk=930367d835f15abb4ef3164f7791e1b1
Thumbnail background by gustavo centurion on Unsplash
https://unsplash.com/photos/O6fs4ablxw8
So cool James!
Can you please do a complete tutorial from start to finish? Like how to set it all up and those kinds of things, because I'm new to this. Thanks!
Do you think it’s possible to have text generation based on intents given to the system? I’ve got a speech to text, intent recognition, intent handler, and text to speech system going. The downside is that the text to speech service is just saying things I’ve coded it to say. Curious if/how GPT2 could generate text for the TTS service to say.
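One rough way to do this (a sketch, not something covered in the video): treat the recognized intent and its slots as a text prompt and let GPT-2 write the continuation, then hand that string to the TTS service. The prompt template, intent names, and respond_to_intent helper below are made up for illustration, and an off-the-shelf GPT-2 will ramble; fine-tuning on in-domain responses would be needed for anything reliable:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

def respond_to_intent(intent, slots):
    # Hypothetical template: turn the intent + slots into a prompt for GPT-2 to continue
    details = ', '.join(f'{k}={v}' for k, v in slots.items())
    prompt = f"The assistant was asked about {intent} ({details}). The assistant replied:"
    inputs = tokenizer.encode(prompt, return_tensors='pt')
    outputs = model.generate(inputs, max_length=60, do_sample=True, top_k=50,
                             pad_token_id=tokenizer.eos_token_id)
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return text[len(prompt):].strip()  # keep only the generated continuation for the TTS

print(respond_to_intent('weather', {'city': 'Berlin'}))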
Would you please show how to deploy this on GPU?
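Not covered in the video, but here is a minimal sketch of the same pipeline on a GPU, assuming a CUDA-enabled PyTorch install: move both the model and the input tensor to the device before calling generate.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2').to(device)  # move the weights to the GPU

# inputs must live on the same device as the model
inputs = tokenizer.encode('Machine learning is', return_tensors='pt').to(device)
outputs = model.generate(inputs, max_length=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))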
Could you make a video about question answering using BERT? Thank you.
Hey, thank you! This is super awesome. I'm going to reference this video in a Medium post, as I'm blogging my Data Science learning process!
What in the heck is happening? I get "Downloading: 10% | 55.8M/548M 04:03<450:0145" etc., etc., until it slows to a crawl and gives the errors "an existing connection was forcibly closed by the host" and "Make sure that gpt2 is one of the models on huggingface or gpt2 is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt".
I used the PyTorch install guide you linked in your article to verify my pip install was good, and it checked out. Man, GPT-2 is kicking my butt.
What determines the upper limit of max_length? I tried 20000 but got "IndexError: index out of range in self". What's the highest value I can use?
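For anyone else hitting this: GPT-2's learned position embeddings only cover 1024 tokens, so max_length (which counts the prompt plus everything generated) cannot go above 1024; asking for 20000 indexes past the position-embedding table, which is what raises that IndexError. A quick way to check the limit on the loaded model:

from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained('gpt2')
print(model.config.n_positions)  # 1024: the hard cap on prompt + generated tokens for GPT-2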