Przemek Chojecki
The new paper from Google presents a 1.6-trillion-parameter Transformer model: https://arxiv.org/pdf/2101.03961.pdf – the largest to date.
VentureBeat coverage: https://venturebeat.com/2021/01/12/google-trained-a-trillion-parameter-ai-language-model/
*** Check out my Data Science Job course here: https://datasciencerush.thinkific.com/courses/data-science-job
#gpt3 #t5 #transformer
YOU LOOK INCREDIBLE 😭😭😭
Every company is trying to make the most intelligent model, which will destroy everything in the end 😂😂
1.6 trillion is already about 9 times bigger than GPT-3. It's hard to imagine it being 9 times as capable as GPT-3 – it's crazy. GPT-3 is already capable of so much.
I would say GPT-3's IQ in completing text based on the previous prompt is around 120, given that it's able to answer almost any question imaginable, ranging from events in certain movies to real-life scientific queries. If I ask a random stranger what happened in season 2 of the TV series Arrow, he's not likely to know about it unless he watched the series. However, GPT-3 understands the context and builds an appropriate answer. This knowledge of GPT-3 is flexible to any subject and any task. Sometimes it creates nonsense, but that's because the prompt itself is nonsense. The zero-shot method forces it to predict the next sequence from a small amount of text, which could be taken in many directions. 10^(number of parameters) is the way I calculate what could be generated if no text were present and you just hit the generate button.
If the 1.6-trillion-parameter model is made, I would have to guess that its IQ will average around 150-175, considering that it's much, much better than GPT-3. Notice that GPT-3 is roughly 117 times larger than GPT-2 (175B vs 1.5B parameters), which means the 1.6-trillion model is roughly 1,000 times larger than GPT-2, to put that into perspective (see the quick check below). We may see some signs of ascension, or intelligence far beyond our own, when it comes to fitting patterns of text onto an initial context. If the image version of GPT-3, DALL-E, is 12 billion parameters, then 1.6 trillion parameters might be enough to produce a small 15-second video from text.
However, most likely this transformer will be used for the algorithms on YouTube, to find out each user's preferences and maximize people's addiction to the platform. That is a scary thought, considering that Google's models were at just 8 billion parameters a year ago.
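(Quick sanity check on those ratios, using rough published parameter counts – GPT-2 ≈ 1.5B, GPT-3 ≈ 175B, the new model ≈ 1.6T:)

```python
# Rough parameter counts for the largest released versions
gpt2_params = 1.5e9
gpt3_params = 175e9
new_model_params = 1.6e12

print(f"GPT-3 vs GPT-2:      {gpt3_params / gpt2_params:.0f}x")     # ~117x
print(f"New model vs GPT-3:  {new_model_params / gpt3_params:.1f}x")  # ~9x
print(f"New model vs GPT-2:  {new_model_params / gpt2_params:.0f}x")  # ~1067x
```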
But lemme guess: it can't generalize to different applications the way GPT-3 can.
Mixture-of-Experts models' parameter counts are not comparable to those of dense transformers like GPT-3.
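To illustrate the point: the Switch Transformer is a Mixture-of-Experts model in which a router sends each token to just one expert feed-forward block, so only a small slice of the 1.6T parameters is active for any given token, whereas GPT-3 uses all 175B parameters for every token. Below is a minimal, hypothetical sketch of top-1 (switch-style) routing in NumPy – toy sizes, not the paper's actual implementation:

```python
import numpy as np

# Toy dimensions, purely for illustration
d_model, d_ff, num_experts, num_tokens = 8, 32, 4, 5
rng = np.random.default_rng(0)

# Router: scores each token against each expert.
router_w = rng.normal(size=(d_model, num_experts))

# Each expert is its own small feed-forward block (two weight matrices).
experts = [
    (rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model)))
    for _ in range(num_experts)
]

tokens = rng.normal(size=(num_tokens, d_model))

def switch_layer(x):
    """Route each token to exactly one expert (top-1 / switch routing)."""
    logits = x @ router_w                                   # (tokens, experts)
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    chosen = probs.argmax(-1)                               # one expert per token
    out = np.zeros_like(x)
    for i, e in enumerate(chosen):
        w_in, w_out = experts[e]
        h = np.maximum(x[i] @ w_in, 0)                      # expert FFN with ReLU
        out[i] = (h @ w_out) * probs[i, e]                  # scale by router prob
    return out

print(switch_layer(tokens).shape)

# Total parameters grow with the number of experts...
total_params = sum(a.size + b.size for a, b in experts) + router_w.size
# ...but each token only ever touches one expert's weights.
active_params = experts[0][0].size + experts[0][1].size + router_w.size
print(total_params, active_params)
```

So the headline parameter count measures total capacity, not the compute spent per token, which is why MoE and dense parameter counts aren't directly comparable.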
I'm not extremely well versed in AI, but it's my understanding that a parameter is a connection between two nodes. Using that analogy to relate it to connections (not neurons) in the brain, Google's model is roughly 1/625 the complexity of the human brain.
Does this sound about right, or am I talking out of my league?
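For what it's worth, the 1/625 figure checks out if you assume on the order of 10^15 synapses in the human brain (common estimates range from roughly 10^14 to 10^15) and count one parameter as one connection:

```python
# Back-of-the-envelope comparison (assumed figures)
brain_synapses = 1e15     # upper-end estimate of synapses in a human brain
model_params = 1.6e12     # parameters in Google's new Transformer

print(f"model / brain = 1/{brain_synapses / model_params:.0f}")  # 1/625
```

That said, a trained weight and a biological synapse are very different things, so it's a loose analogy at best.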