GPT-3

Fine-Tuning GPT-3 & ChatGPT Transformers: Using OpenAI Whisper



Lucidate

Fine-tuning GPT-3.
In this video we demonstrate how to use APIs and audio to create prompts and completions that can be used to fine-tune transformers such as GPT-3. We show how to use the News API with Python to extract news articles and create a dataset for training and fine-tuning models.
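
As a rough sketch of that first step, the snippet below pulls recent articles from the newsapi.org 'everything' endpoint with Python's requests library. The API key placeholder, query and page size are illustrative, not the exact code used in the video.

```python
# Rough sketch: fetching articles from the newsapi.org 'everything' endpoint.
# NEWS_API_KEY, the query and the page size are placeholders -- substitute your own.
import requests

NEWS_API_KEY = "YOUR_API_KEY"

def fetch_articles(query, page_size=20):
    """Return a list of article dicts (title, description, content) for a query."""
    response = requests.get(
        "https://newsapi.org/v2/everything",
        params={"q": query, "pageSize": page_size, "language": "en",
                "apiKey": NEWS_API_KEY},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("articles", [])

for article in fetch_articles("monetary policy")[:3]:
    print(article["title"])
```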

We explain what prompt and completion datasets are and the role they play in fine-tuning transformers for natural language processing. We then discuss using video and audio streams to build data pipelines for fine-tuning, transcribing speech to text with OpenAI’s Whisper.

With over 500 hours of video being uploaded to YouTube every minute, video is a rich source of detailed information with which to train AI.

The video first gives a step-by-step demonstration of using the News API to create a prompt and completion dataset, and then shows how to build pipelines that transcribe speech to text using Whisper.

Foundation models like GPT-3 can be ‘fine-tuned’. Fine-tuning allows them to learn the vocabulary and semantics of particular disciplines. If you want your AI solution to be familiar with options pricing algorithms, then you can show it lots of options pricing papers. If you’d like your AI to be able to converse and write about monetary policy, then you can expose it to policy papers, books on economic theory, central bank meeting minutes and news conferences.

You train transformer AI models in a particular way. You don’t simply feed in a file containing the text of the book or article that you are using to train the AI. You break the text up into a series of ‘prompts’ and ‘completions’. Prompts are the beginning of a passage of text, and completions are what follows the prompt.
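
A minimal sketch of that splitting step is shown below. The {"prompt": ..., "completion": ...} JSONL layout follows OpenAI's GPT-3 fine-tuning format; the sentence-based window sizes and the sample text are illustrative choices.

```python
# Sketch: split a passage into prompt/completion pairs and write them as JSONL,
# the format GPT-3 fine-tuning expects. Window sizes and sample text are illustrative.
import json

def make_pairs(text, prompt_sentences=2, completion_sentences=1):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    window = prompt_sentences + completion_sentences
    pairs = []
    for i in range(0, len(sentences) - window + 1, window):
        prompt = ". ".join(sentences[i:i + prompt_sentences]) + "."
        completion = " " + ". ".join(sentences[i + prompt_sentences:i + window]) + "."
        pairs.append({"prompt": prompt, "completion": completion})
    return pairs

sample = ("The FOMC sets the target range for the federal funds rate. "
          "It meets eight times a year. Each meeting ends with a statement.")

# One {"prompt": ..., "completion": ...} object per line.
with open("fine_tune_data.jsonl", "w") as f:
    for pair in make_pairs(sample):
        f.write(json.dumps(pair) + "\n")
```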

In this post we leverage the breadth and depth of content on YouTube. We propose a strategy for using selected videos to train our AI language models, choosing videos that are rich in expertise in the area we want the model to learn. The example used in this video is to train our AI in monetary policy by studying content generated by the Federal Reserve. We devise a strategy to use speech-to-text transcription of YouTube videos, with OpenAI’s ‘Whisper’ performing the transcription. We then break this text up into prompts and completions to train our model.
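
The snippet below is a minimal sketch of the transcription step with the openai-whisper package, assuming the YouTube audio has already been saved to a local file (for example with a tool such as yt-dlp). The model size and file names are illustrative; the full Transcriber class is linked under ‘Links’ below.

```python
# Minimal sketch of the speech-to-text step with the openai-whisper package,
# assuming the YouTube audio has already been saved locally (e.g. with yt-dlp).
import whisper

model = whisper.load_model("base")            # larger models trade speed for accuracy
result = model.transcribe("fomc_press_conference.mp3")

with open("fomc_press_conference.txt", "w") as f:
    f.write(result["text"])                   # transcript, ready to split into prompts/completions
```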

Following this approach, we can easily train our NLP models to have expertise in any field we choose.
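
To close the loop, a dataset built this way can be submitted for fine-tuning. The sketch below assumes the legacy (pre-1.0) openai Python SDK and an OPENAI_API_KEY environment variable; the base model choice is illustrative.

```python
# Sketch: submit the JSONL dataset for fine-tuning using the legacy (pre-1.0)
# openai Python SDK, assuming OPENAI_API_KEY is set in the environment.
import openai

# Upload the prompt/completion dataset built above.
upload = openai.File.create(file=open("fine_tune_data.jsonl", "rb"),
                            purpose="fine-tune")

# Start a fine-tune of a GPT-3 base model on that file.
job = openai.FineTune.create(training_file=upload["id"], model="davinci")
print(job["id"], job["status"])
```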

#NLP #Python #FineTuning #AI #MachineLearning #NewsAPI #Whisper #Lucidate #chatgpt #gpt3 #gpt-3 #openai #whisper #openai-whisper

Links:
Attention Mechanism in Transformers: https://youtu.be/sznZ78HquPc
Transformer playlist: https://youtube.com/playlist?list=PLaJCKi8Nk1hwaMUYxJMiM3jTB2o58A6WY
Federal Reserve Description: https://youtu.be/lJTmCv2kKKs
FOMC Feb 2023 Press Conference: https://www.youtube.com/watch?v=3Iv4aCN0OOo&t=2s
Python code for Transcriber class: https://github.com/mrspiggot/prompts_and_completions/blob/master/transcriber.py

=========================================================================
Link to introductory series on Neural networks:
Lucidate website: https://www.lucidate.co.uk/blog/categ

YouTube: https://www.youtube.com/playlist?list

Link to intro video on ‘Backpropagation’:
Lucidate website: https://www.lucidate.co.uk/post/intro

YouTube: https://youtu.be/8UZgTNxuKzY

‘Attention is all you need’ paper – https://arxiv.org/pdf/1706.03762.pdf

=========================================================================
Transformers are a type of artificial intelligence (AI) model used for natural language processing (NLP) tasks such as translation and summarisation. They were introduced in 2017 by Google researchers, who sought to address the limitations of recurrent neural networks (RNNs), which had traditionally been used for NLP tasks. RNNs are difficult to parallelise and tend to suffer from the vanishing/exploding gradient problem, making them hard to train on long input sequences.

Transformers address these limitations by using self-attention, a mechanism that allows the model to selectively choose which parts of the input to pay attention to. This makes the model much easier to parallelise and largely avoids the vanishing/exploding gradient problem.

Self-attention works by weighting the importance of different parts of the input, allowing the AI to focus on the most relevant information and better handle input sequences of varying lengths. This is accomplished through three matrices: Query (Q), Key (K) and Value (V). The Query can be interpreted as the word for which attention is being calculated, while the Key can be interpreted as the word to which attention is paid. The scaled dot product of the Query and Key matrices, passed through a softmax, gives the attention weights, which are then used to form a weighted sum of the Value vectors.
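
As a toy illustration of that calculation, the NumPy sketch below implements scaled dot-product attention as described in the paper; the matrix shapes and random data are arbitrary.

```python
# Toy NumPy sketch of scaled dot-product attention from 'Attention is all you need':
# weights = softmax(QK^T / sqrt(d_k)), output = weights @ V.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted sum of value vectors

rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, 4)), rng.standard_normal((3, 4)), rng.standard_normal((3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 4): one output vector per query
```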

=========================================================================

#ai #deeplearning #chatgpt #gpt3 #neuralnetworks #attention #attentionisallyouneed