AssemblyAI
DALL-E 2 has arrived in the AI world with a bang. It is one of the best generative models we have seen to date. But how does this magical model work? In this video, we take a look at the architecture of DALL-E 2 to understand its working principles. A rough code sketch of the full pipeline follows the chapter list below.
00:00 Overview
00:34 What can DALL-E 2 do?
00:55 Architecture overview
01:27 CLIP embeddings
03:05 The prior
04:24 Why do we need the prior?
05:20 The decoder
06:13 How are variations created?
06:56 Model evaluation
07:36 Limitations and risks of DALL-E 2
09:21 Benefits of DALL-E 2
10:00 A question for you!
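If you prefer to see the three stages as code, here is a minimal sketch of the pipeline the chapters above walk through (CLIP embeddings → the prior → the decoder). The function names, embedding size, and stub bodies are placeholders for illustration only, not OpenAI's actual models or API.

```python
# Hypothetical sketch of the three-stage DALL-E 2 (unCLIP) pipeline.
# The stubs below only stand in for the real CLIP encoder, prior, and
# diffusion decoder; names and sizes are illustrative assumptions.
import numpy as np

EMBED_DIM = 512  # assumed embedding size, for illustration only


def clip_text_encoder(caption: str) -> np.ndarray:
    """Stub for CLIP's text encoder: maps the caption into the shared text/image space."""
    rng = np.random.default_rng(abs(hash(caption)) % (2**32))
    return rng.standard_normal(EMBED_DIM)


def prior(text_embedding: np.ndarray) -> np.ndarray:
    """Stub for the prior: predicts a CLIP *image* embedding from the text embedding."""
    return text_embedding + 0.1 * np.random.standard_normal(EMBED_DIM)


def decoder(image_embedding: np.ndarray) -> np.ndarray:
    """Stub for the decoder: in DALL-E 2 a diffusion model turns the image
    embedding into an actual picture; here we just return a placeholder array."""
    return np.random.standard_normal((64, 64, 3))


def generate_image(caption: str) -> np.ndarray:
    text_emb = clip_text_encoder(caption)   # 1. CLIP embeddings (01:27)
    image_emb = prior(text_emb)             # 2. The prior (03:05)
    return decoder(image_emb)               # 3. The decoder (05:20)


print(generate_image("an astronaut riding a horse in space").shape)  # (64, 64, 3)
```

In DALL-E 2 itself, the decoder is a diffusion model built on GLIDE and the prior can be autoregressive or diffusion-based; the stubs above only mimic the shape of the data passed between the stages.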
Example images are from the DALL-E 2 announcement blog post (https://openai.com/dall-e-2/) and the DALL-E 2 paper (https://cdn.openai.com/papers/dall-e-2.pdf)
The analysis of DALL-E 2 limitations and risks – https://github.com/openai/dalle-2-preview/blob/main/system-card.md
Would you like to read this information instead? Check out the blog post on DALL-E 2 👇
https://www.assemblyai.com/blog/how-dall-e-2-actually-works/
Get your Free Token for AssemblyAI Speech-To-Text API 👇 https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_mis_28
▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬▬
🖥️ Website: https://www.assemblyai.com
🐦 Twitter: https://twitter.com/AssemblyAI
🦾 Discord: https://discord.gg/Cd8MyVJAXd
▶️ Subscribe: https://www.youtube.com/c/AssemblyAI?sub_confirmation=1
🔥 We’re hiring! Check our open roles: https://www.assemblyai.com/careers
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#MachineLearning #DeepLearning
Who owns the copyright of the image?
Really liked the crisp and yet informative explanation. In particular, how it touches upon all the relevant technicalities.
That's pretty impressive technology. This tech has a nice future ahead, and the narrator smartly explains pretty much everything.
Dalle2 is named after dalle1 😎
wall-e?
I'm still lost, but that's not your fault, I'm just stupid. Great video!
Amazing…
Also check out Stable Diffusion.
I would love to know more about how the diffusion model understands the "grammar" of an image… I understand how recreating an existing image could work (conceptually), but I cannot grasp how it can use diffusion to generate a meaningful (as in coherent) image that has never existed before.
Could someone recommend any further reading or videos on this, please?
10:00 Salvador Dalí and WALL-E (Pixar)
so what is it named after?
Great video! Thanks for the clear explanation. Although I understand OpenAI's desire to limit harm, I do hope they rethink their approach to nudity. My experience is that whatever training datasets they've used have made the AI very confused by things like penises and female nipples, for example, even when they are not visible. Not all nudity is pornographic; there must be some way for it to be more nuanced about it, and I hope they find it.
Thanks for the video. If the decoder is GLIDE, what model is the prior? Is it VAE-based?
This was awesome, thank you
Actually, I was trying to find an academic explanation of DALL-E, and you explained it perfectly! Now I get the point.
It must've been named after Salvador Dali!
An excellent lecturer, immeasurable gratitude for the knowledge you share.
sorry but the explanation is very poor
Yo is this chick AI generated?
Would it be possible to make realistic AI pics of chrome balls? Like if the balls were in a lab or professionally photographed. What are your thoughts?
Thank you