Pritish Mishra
HuggingFace Web App: https://bit.ly/3SDyOWt
Image captioning is the process of taking an image and generating a caption that accurately describes the scene. This is a difficult task for neural networks because it requires both visual understanding and natural language understanding.
In this video, I walk through my complete approach to this problem. For visual understanding, we use Inception V3. For natural language understanding, we first use an RNN, but it fails to generalize well to unseen data, so we switch to a Transformer. And as you will see, the Transformer nails it!
Source Code:
Image Captioning with RNN: https://bit.ly/3SBPoGi
Image Captioning with Transformer: https://bit.ly/3HToJRC
Image Captioning (on MS COCO Dataset): https://bit.ly/40t2da9
🔗 Social Media 🔗
📱 Twitter: https://bit.ly/3aJWAeF
📝 LinkedIn: https://bit.ly/3aQGGiL
📂 GitHub: https://bit.ly/2QGLVYV
Timestamps:
00:00 Introduction
00:16 Quick overview of Image Captioning
01:08 The Model Architecture (RNN)
01:56 Getting the Image feature vectors using Inception V3
04:39 What is the Attention Mechanism doing?
05:10 Choosing the Dataset
05:56 Data Preprocessing
06:54 Training!!!
07:13 Checking the results
09:24 Over Dramatic Transformer Introduction
10:25 Why I used COCO Dataset
11:12 Side-by-side result of RNN and Transformer
11:59 Deploying model to HuggingFace so anyone can use it!
#artificialintelligence #ai #deeplearning #machinelearning #transformer #transformers
Thank You,
Pritish Mishra
Here's how I created a search engine for books using GPT3: https://youtu.be/SXFP4nHAWN8
Hi Pritish, nice video! The first source code link for image captioning with RNN doesn't work (https://drive.google.com/file/u/0/d/1-yWKVUs_zlAcS4S1Epgk8pYORWjryra-/). Could you maybe provide an updated link? We are really interested in the preprocessing for a similar project. Thanks in advance!
Bro the Image Captioning with RNN source code is not available
I have a problem with the caption key and image signature. Can you pls help me with it?
RNN file does not exist bro pls upload
Brother, this video is really great and I loved your explanation, but I am a beginner in AI/ML and want to learn this in detail. Can you please create a detailed video on this topic?
Hi Pritish, Is it possible to use your model's results using web API calls?
you nailed it bro
Nice video. How long does it take you to train the transformer model?
I want to do image captioning with unsupervised or semi-supervised learning, bro. If you have any reference code or implemented code, sharing it would be helpful to me.
ModuleNotFoundError: No module named 'tensorflow'. I got this error on Hugging Face while building the app. How do I install TensorFlow on Hugging Face? @PritishMishra
bro unable to get the dataset brooo
Bro, I am unable to get the Image Captioning with RNN code. The link is not working. Can you please check?
How to get the code
The RNN source code link is not working. Please provide a link.
The Image Captioning with RNN source code is not opening, dude. Please upload 😊.
DUDE, please re-upload the RNN SOURCE CODE.
How can we do it for videos bro ??
good one buddy
bro can you help me in Video captioning project?
Can you share the link for the pretrained model (h5)? Please share it.
How to use the saved model weights model.h5 in another file to make inferences on new images
Amazing video, where did you learn all of this? omg just saved me so much time. Lifesaver
Hi man, can you help me out? What is the captions.txt file? Is it the Flickr8k.token.txt?
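For readers with the same question, here is a minimal sketch of reading a Flickr8k-style token file. It assumes the commonly distributed layout where each line is `image.jpg#index<TAB>caption`. The source does not confirm whether the video's captions.txt uses this layout, and the helper name `parse_token_lines` and the sample lines are hypothetical.

```python
from collections import defaultdict


def parse_token_lines(lines):
    """Group captions by image file name (assumed 'name#idx<TAB>caption' layout)."""
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines that do not match the assumed layout.
        if not line or "\t" not in line:
            continue
        image_id, caption = line.split("\t", 1)
        # Drop the '#0', '#1', ... suffix that numbers the captions of one image.
        image_name = image_id.split("#", 1)[0]
        captions[image_name].append(caption.strip())
    return dict(captions)


# Hypothetical sample lines standing in for the real file contents.
sample = [
    "example_1.jpg#0\tA dog runs across the grass .",
    "example_1.jpg#1\tA brown dog is running outside .",
    "example_2.jpg#0\tTwo children play on a beach .",
]
captions = parse_token_lines(sample)
print(captions["example_1.jpg"])
```

With a real file, the same function could be fed `open(path, encoding="utf-8")` instead of the sample list.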
ur github link is saying that it is suspended
The GitHub link is not opening; it says that it was uploaded from a suspended account.
The link for Image Captioning with RNN is dead. Can you update it to help me? Thank you. From Vietnam with love <3
Very nice explanation
hey none of your links are working
Awesome Video bro !! You explained Image captioning in a simple and fun way.
Your RNN file is showing Page Not Found. Can you reupload the file?
Hey, great lecture! I just need some help: the link for the Google Colab for image captioning with RNN isn't working. It would be a great help if you could provide a new link. Thank you!!
Source code link with RNN not working😢😢
Hi Pritish, amazing tutorials. Thank you. While running the Transformer Colab notebook, I get an error at:
----> 4 pred_caption = generate_caption(img_path)
TypeError: `x` and `y` must have the same dtype, got tf.uint8 != tf.float32.
Can you please help!
bro your source code link is not working
The Image Captioning with RNN code isn't available, could you please solve it :/
that was fucking amazing
Can we use CNN + LSTM rather than RNN for better image feature extraction and structured answers?
Absolute cinema
I am finding it difficult to access the code through the GitHub link.
Where can I get the code from???
Bro, I have one doubt. The second link that you have given uses a Transformer. I wanted to ask whether it is trained on COCO or Flickr. Also, in the third link that you have given (on MS COCO Dataset), have you used an RNN or a Transformer? And which is the best? Pls reply … thank you ❤
Bro can you provide a drive link for your saved models because your LFS bandwidth is exceeded in GitHub. Please
Can anyone share the saved_models with me in a drive link?
Pls update link for rnn
Very helpful. Great work.
Bro, please send the Streamlit web page code. Anyone who has the code for the Streamlit web page, please send it to me.
No one is helping. How do I label images and write image captions in a JSON file or a CSV file for image captioning? Please help with that.
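For anyone with the same question, here is a minimal sketch of one way to store hand-written captions using only the Python standard library. The file names, the `image,caption` column header, and the sample annotations are assumptions for illustration, not the format used in the video.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical annotations: each image file name maps to its captions.
annotations = {
    "img_001.jpg": ["A dog runs across the grass.", "A brown dog playing outside."],
    "img_002.jpg": ["Two children play on a beach."],
}


def save_as_json(annotations, path):
    """Write the whole mapping as one JSON object."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(annotations, f, indent=2, ensure_ascii=False)


def save_as_csv(annotations, path):
    """Write one (image, caption) row per caption."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "caption"])
        for image, image_captions in annotations.items():
            for caption in image_captions:
                writer.writerow([image, caption])


# Write both files into a temporary directory so the sketch leaves nothing behind
# in the working directory.
out_dir = Path(tempfile.mkdtemp())
json_path = out_dir / "captions.json"
csv_path = out_dir / "captions.csv"
save_as_json(annotations, json_path)
save_as_csv(annotations, csv_path)

# Read the CSV back to check what was written.
with open(csv_path, newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
print(len(rows), "caption rows written")
```

The CSV form keeps one row per caption, which is convenient when an image has several captions; the JSON form keeps all captions of an image together.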
bro can you provide a github repository for it