This is very useful. Just wanted to add that the GPT decoder doesn't have cross-attention in its transformer block.
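For anyone curious, here's a minimal sketch of what that means (plain PyTorch; the class name and dimensions are my own choices, not from the video): a GPT-style block is just masked self-attention plus an MLP, with no cross-attention layer attending to encoder outputs.

```python
import torch
import torch.nn as nn

class GPTBlock(nn.Module):
    """Decoder-only transformer block: masked self-attention + MLP, no cross-attention."""
    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        # Causal mask: each position may only attend to itself and earlier positions.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out           # residual around self-attention
        x = x + self.mlp(self.ln2(x))  # residual around the feed-forward MLP
        return x
```

In an encoder-decoder transformer there would be a second attention sublayer here whose keys/values come from the encoder; GPT drops that entirely.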
One is for natural language understanding and the other is for natural language generation.
Great explanation. For example, if I have to read all the client emails, understand their requirements, and auto-create tasks based on that prediction, which model should I go for: BERT or GPT?
Awesome 👏
I love you ❤
Bert also drives a Trans Am!
Jokes aside I do appreciate your videos!
What if I stack both encoders and decoders? Do I get some BERT-GPT hybrid?
good
So BERT doesn't have a decoder? Did I misunderstand?
Transformer models are usually run in parallel, right?
I can't read this diagram at all. Why do the arrows point inward to both the inputs and the outputs? It makes no sense to me.