Blake
In this video, I go over how to download and run GPT Neo, the open-source implementation of GPT-3. This model has 2.7 billion parameters, the same size as GPT-3 Ada. The results are very good and a large improvement over GPT-2. I am excited to play around with this model more, and for the future of even larger NLP models.
Notebook https://github.com/mallorbc/GPTNeo_notebook
GPT Neo GitHub https://github.com/EleutherAI/gpt-neo (use the first release tag)
GPT Neo HuggingFace docs https://huggingface.co/transformers/model_doc/gpt_neo.html
A useful article about transformer parameters https://huggingface.co/blog/how-to-generate
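The linked article walks through decoding strategies (greedy, temperature sampling, top-k). Here is a toy sketch of those strategies in plain Python; the logits and token ids are made up purely for illustration:

```python
import math
import random

logits = [2.0, 1.0, 0.5, -1.0]  # fake next-token scores for a 4-token vocabulary

# Greedy decoding: always take the highest-scoring token.
greedy_id = max(range(len(logits)), key=lambda i: logits[i])

# Temperature sampling: softmax over scaled logits, then draw one token.
temperature = 0.9
exps = [math.exp(l / temperature) for l in logits]
total = sum(exps)
probs = [e / total for e in exps]
sampled_id = random.choices(range(len(logits)), weights=probs)[0]

# Top-k sampling: restrict the draw to the k highest-scoring tokens.
k = 2
top_ids = sorted(range(len(logits)), key=lambda i: -logits[i])[:k]
topk_id = random.choices(top_ids, weights=[probs[i] for i in top_ids])[0]

print(greedy_id, sampled_id, topk_id)
```

These are the same knobs (`do_sample`, `temperature`, `top_k`) exposed by the `generate` method used in the notebook.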
00:00 – GPT-3 Background
01:07 – GPT-3 Interview
02:06 – GPT Neo Github
03:14 – GPT Neo HuggingFace
03:52 – Setting up Anaconda and Jupyter
05:05 – Starting the Jupyter notebook
06:14 – Installing dependencies in the notebook
06:58 – Importing needed dependencies
07:33 – Selecting what GPT model to use
08:45 – Checking our computer hardware
09:36 – Loading the tokenizer
09:55 – Giving our inputs
10:39 – Generating the tokens with the model
11:42 – Decoding and reading the result
13:22 – Reflections on Transformers
14:26 – Outro, questions, and future work
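The notebook steps above (load the tokenizer, give inputs, generate, decode) can be sketched with the HuggingFace transformers API. The model id is the real 2.7B checkpoint, but the prompt and generation settings here are illustrative, not necessarily the exact values used in the video:

```python
import torch
from transformers import GPT2Tokenizer, GPTNeoForCausalLM


def generate(prompt: str, model_name: str = "EleutherAI/gpt-neo-2.7B") -> str:
    """Load GPT Neo and continue the prompt. Downloads ~10 GB of weights on first run."""
    # Use the GPU if one is available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPTNeoForCausalLM.from_pretrained(model_name).to(device)

    # Tokenize the prompt and move the ids to the same device as the model.
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

    # Sample a continuation; settings here are illustrative.
    output = model.generate(
        input_ids,
        do_sample=True,
        temperature=0.9,
        max_length=100,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)


# Usage (note the large download on first run):
#   print(generate("In a shocking finding, scientists discovered"))
```

Swapping `model_name` for "EleutherAI/gpt-neo-1.3B" needs less memory if the 2.7B model does not fit.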
How can it also run without a GPU? Do they run it on their servers and give you the result, or does it run on your CPU? And if it runs on your CPU, how much slower is it?
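For what it's worth, the notebook runs the model entirely on your own machine; nothing is sent to a server. A quick check using the standard PyTorch API (whether CUDA is found depends on your hardware and drivers):

```python
import torch

# The model runs on the GPU if CUDA is available, otherwise on the CPU.
# CPU inference works but is typically many times slower for a 2.7B model.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")
```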
Why not on Colab?
Very helpful! I finally managed to get it running on my machine!
No love for AMD cards, though. It's still very hard to install PyTorch with ROCm; hopefully AMD will fix this soon. But I managed to get it working on CPU, so it's fine.