
How to Make Your Images Talk: The AI that Captions Any Image



Pritish Mishra

HuggingFace Web App: https://bit.ly/3SDyOWt

Image captioning is the process of taking an image and generating a caption that accurately describes the scene. This is a difficult task for neural networks because it requires understanding both natural language and computer vision.

In this video, I walk through my complete approach to this problem. For visual understanding, we will use Inception V3. For natural language understanding, we will start with an RNN, but it fails to generalize well to unseen data, so we will switch to a Transformer. And as you will see, the Transformer nails it!
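
To make the "generating a caption" step concrete, here is a minimal sketch of greedy decoding, one common way a decoder (RNN or Transformer) turns image features into a sentence word by word. It is not the code from the video or the linked notebooks: `greedy_caption`, `toy_decoder`, and the `[start]`/`[end]` tokens are hypothetical names chosen for illustration, and the toy decoder simply replays a fixed sentence in place of a trained model.

```python
from typing import Callable, Dict, List, Sequence

# Assumed special tokens; the real notebooks may use different markers.
START, END = "[start]", "[end]"


def greedy_caption(
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    image_features: Sequence[float],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time, always taking the top-scoring word."""
    words = [START]
    for _ in range(max_len):
        # Ask the decoder to score every candidate next word, given the
        # image features and the words generated so far.
        scores = next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    # Drop the start token before returning the caption.
    return " ".join(words[1:])


def toy_decoder(image_features: Sequence[float], words_so_far: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained decoder: ignores the image and
    replays a fixed sentence so that the loop above can be run."""
    script = ["a", "dog", "runs", END]
    target = script[min(len(words_so_far) - 1, len(script) - 1)]
    return {word: (1.0 if word == target else 0.0) for word in script}


print(greedy_caption(toy_decoder, image_features=[0.0]))
```

In a real model, `next_word_scores` would be the trained decoder applied to the Inception V3 feature vectors, and `max_len` caps the caption length in case the end token is never predicted.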

Source Code:
Image Captioning with RNN: https://bit.ly/3SBPoGi
Image Captioning with Transformer: https://bit.ly/3HToJRC
Image Captioning (on MS COCO Dataset): https://bit.ly/40t2da9

🔗 Social Media 🔗
📱 Twitter: https://bit.ly/3aJWAeF
📝 LinkedIn: https://bit.ly/3aQGGiL
📂 GitHub: https://bit.ly/2QGLVYV

Timestamps:
00:00 Introduction
00:16 Quick overview of Image Captioning
01:08 The Model Architecture (RNN)
01:56 Getting the Image feature vectors using Inception V3
04:39 What Attention Mechanism is doing?
05:10 Choosing the Dataset
05:56 Data Preprocessing
06:54 Training!!!
07:13 Checking the results
09:24 Over Dramatic Transformer Introduction
10:25 Why I used COCO Dataset
11:12 Side-by-side result of RNN and Transformer
11:59 Deploying model to HuggingFace so anyone can use it!

#artificialintelligence #ai #deeplearning #machinelearning #transformer #transformers

Thank You,
Pritish Mishra

49 thoughts on “How to Make Your Images Talk: The AI that Captions Any Image”
  1. Brother, this video is really great and I loved your explanation, but I am a beginner in AI/ML and want to learn this in detail.
    Can you please create a detailed video on this topic?

  2. I want to do image captioning with unsupervised or semi-supervised learning, bro. If you have any reference code or an implementation,
    it would be helpful if you could share it.

  3. ModuleNotFoundError: No module named 'tensorflow'. I got this error on Hugging Face while building the app. How do I install TensorFlow on Hugging Face? @PritishMishra

  4. Hey, great lecture! I just need some help: the link to the Google Colab for image captioning with RNN isn't working. It would be a great help if you could provide a new link. Thank you!!

  5. Hi Pritish, amazing tutorials. Thank you. While running the Transformer Colab notebook, I am getting an error at:
    ----> 4 pred_caption = generate_caption(img_path)
    TypeError: `x` and `y` must have the same dtype, got tf.uint8 != tf.float32.

    Can you please help?

  6. Bro, I have one doubt.
    The second link you have given uses the Transformer. I wanted to ask whether it is trained on COCO or Flickr. Also, in the third link (on MS COCO Dataset), have you used the RNN or the Transformer? And which is the best?

    Please reply. Thank you ❤
