AI Coffee Break with Letitia
How can GPT-3 create an avocado armchair? Have a look at DALL·E, OpenAI’s amazing new text-to-image generator. This video gives a high-level explanation of how it can be this good, and why.
📄 DALL-E blog, not a paper (yet): https://openai.com/blog/dall-e/ Play around with many input combinations! This is impressive.
📺 Ms. Coffee Bean’s GPT-3 video: https://youtu.be/5fqxPOaaqi0
Outline:
* 00:00 DALL-E is here
* 02:26 How can it work?
* 04:00 Why does it work?
* 05:36 OpenAI is showing off 😉
* 08:25 Multimodality
📄 Image-GPT: Chen, M., Radford, A., Child, R., Wu, J., Jun, H., Luan, D., & Sutskever, I. (2020, November). Generative pretraining from pixels. In International Conference on Machine Learning (pp. 1691-1703). PMLR. http://proceedings.mlr.press/v119/chen20s/chen20s.pdf
📄 StackGAN++: Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., & Metaxas, D. N. (2018). Stackgan++: Realistic image synthesis with stacked generative adversarial networks. IEEE transactions on pattern analysis and machine intelligence, 41(8), 1947-1962. https://arxiv.org/pdf/1710.10916v3.pdf
📄 StyleGAN2: Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., & Aila, T. (2020). Analyzing and improving the image quality of stylegan. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8110-8119). https://arxiv.org/pdf/1912.04958.pdf
🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
#AICoffeeBreak #MsCoffeeBean #OpenAI #DALL-E #MachineLearning #AI #research
I love the avocado chair!
OpenAI should be renamed CloseAI, because models like this aren't open to the general public.
Too bad they didn't include coffee beans as something you can generate an emoji for :p
1:25 a multi-dollar company? damn. I for sure thought such exclusivity rights would at least be worth 10 times that!
(Sorry, couldn't resist. As always great content! Am in the process of watching all your stuff.)
Great video again.
Nice explanation!
The model was trained with 400M image-text pairs.
Well put. ✌️
Hello! How can we use it on a personal PC? Where is the software?
I like your channel, it's like two minute papers but a couple of minutes longer 🙂
Cool video, once again! When you made the video, the paper was not published yet. I think they now state that they created a dataset of 250M text-image pairs from the internet, which doesn't include MS-COCO.