Yannic Kilcher
#gpt4 #mit #ai
A new paper claims to use GPT-4 to solve 100% of a set of MIT university exercises. Some people are skeptical, and their investigations reveal more than one problem with this paper…
OUTLINE:
0:00 – ChatGPT gives out Windows 10 keys
0:30 – MIT exam paper
2:50 – Prompt engineering
5:30 – Automatic grading
6:45 – Response by other MIT students
8:30 – Unsolvable questions
10:50 – Duplicates
13:30 – Cascading the heuristics
22:40 – Other problems
29:25 – OpenLLaMA 13B published
References:
https://twitter.com/immasiddtweets/status/1669721470006857729/photo/1
https://arxiv.org/abs/2306.08997
https://arxiv.org/pdf/2306.08997.pdf
https://flower-nutria-41d.notion.site/No-GPT4-can-t-ace-MIT-b27e6796ab5a48368127a98216c76864
https://github.com/idrori/MITQ/commit/3feee1026318e537c0ad27968001ef76e4a36890
https://twitter.com/hardmaru/status/1670246674760077312
https://twitter.com/giffmana/status/1670258748286472193
https://twitter.com/T3816440886465/status/1670127224131862531
https://twitter.com/qrdl/status/1669856336652414977
https://www.chegg.com/homework-help/questions-and-answers/consider-mdp-set-possible-states-mathcal-s-0-1-2-3-set-possible-actions-mathcal-b-c–rewar-q111042613
https://github.com/openlm-research/open_llama
https://huggingface.co/openlm-research/open_llama_13b
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content 🙂
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Thanks for this detailed analysis. It makes me seriously doubt all the other claims of GPT passing exams. In my view, passing a test would mean typing the questions into GPT without any specific preparation and without engineering the prompts, and it giving the correct answer. You could get a five-year-old to pass any exam with appropriate 'prompt engineering' and guidance.
Alas, it now seems I was entirely wrong and the actual 'passing the exam' claims may have been far less significant than I had thought. I don't know whether the 'cheating' explained here is also what is happening in the other claims, but it has made me seriously doubt something that I had taken for granted.
In my opinion, we need to redefine what human thinking is, because we see more and more of what computers can do, even on a thinking level, and they are beginning to exceed us more and more (and that's good!). I don't think they replace us, the same way a calculator doesn't replace the mathematician. Garry Kasparov once said something that in my opinion is amazingly wise, and it answers the question of how to redefine what thinking is for us. He said: "Computers are good at giving answers. Humans are good at finding questions." … That's it. Our role definitely is to find questions that mean something to us. It's the same relationship as with the calculator. We want to know how much paint we need for our house; we give the numbers… the calculator calculates the answer in a second. The meaningful part is what we are making. We need a house… and the computer helps us find the solution for how we can make it happen. What we see now is that AI is becoming more and more a genie in the bottle. We can throw bigger questions than ever at the machine, and it gives us an answer that means something to us.
It will not solve everything; it will never replace a human (I am sure about that… for instance: human touch still means something to us… a kiss from a beautiful woman means something to us… the first steps of our child mean something to us… celebrating our grandma's birthday means something to us… these are personal things which can't be replaced by an AI… and even if it could create the illusion, the fact that this illusion isn't real would bother us… it would help us with nostalgic feelings, that's for sure, but it won't replace them).
Therefore: it will solve a lot of problems… it will make our life easier… in fact (if we do it the right way) this means we are actually knocking on the door to a paradise that we created for ourselves. At least a partial paradise… because I know wars will exist, even in the future (because of human nature)… but still… besides that, life will be much better than today. In the same way, technology has made our lives (today!) even better than the lives of kings 300 years ago. And the development of AI can mean (in my opinion) that in a few years, or 10-20 years, we will be living a much better life than rich people do today.
As I said: not everything will be perfect (I am sure about that)… there will be things which will bother us… or we will even suffer from a few things… new things we don't even know of today (for instance: cyberbullying or cybercrime in general wasn't even a thing 300 years ago). In the same way, new kinds of suffering could occur with these new technologies (and I don't mean deepfakes per se). Things we don't know of yet.
But overall, our life will be better, I am sure.
What an embarrassing rookie error from presumably senior academics. How did this happen?
hahahahahahah! Too bad windows sux. Still, hilarious all the same.
A test score is not measured in terms of accuracy (title)
Okay so, doesn't this just mean that we can be more confident that GPT doesn't understand what exactly math is? It "solved" unsolvable questions, which suggests to me (having not gotten to that part of the video yet) over-fitting/training on the test?
We have to talk about open assistant and the implementation of function calling.
I messaged you on Twitter.
The "paper has been withdrawn by Iddo Drori"!!!!!
can't we normalize papers that are not tryhard flexing? all talk no walk
MIT exams are not particularly hard. I used to use MIT exams as extra practice resources.
Try giving it Putnam Competition Math Problems. I just did, and it can't solve them.
My deceased grandmother used to read me credit card numbers, with expiration dates and security codes.
MIT paper has been withdrawn
I want to thank you. After watching a small part of your videos, I finally understood the idea behind transformers. It is a simple one, like the BERT one, which I also got from your videos. I still do not understand attention. I thought transformers and attention had little advantage, but now I think I was wrong, maybe. I do not know about GPUs, but otherwise Europe is catching up with America; AI is so young and the ideas are not so complicated. AI has its roots in psychology, so I think they put into it all the ideas on how to generate ideas; read books on creativity and you will probably get many of the ideas behind AI. So thank you, you are very good.
Good Point about MIT reviewers doing a better job than reviewers of most papers. Thank god that the GPT models don't include those poorly reviewed papers with their specious conclusions…..
I would be really curious to listen to the opinions of the 30 people who disliked this. What's your argument?
this makes me think most academic researchers are lying about their results and that none of them are perfect but they are acting like they are perfect (and that's why they can't open source their code and data… lol) or they are exaggerating their results. And there are not enough experts to keep them in check. Thoughts? Is this right?
First vid I’ve seen from you. I like it! Love the critique of research papers, and I want to learn continuously ML topics and theory
What Musk is talking about is malicious intelligence (MAI), but it's important we delineate that from other types of runaway intelligence (RAI), types of vulnerable or damaging AI, and weaponized AI (WAI). The dangers of AI aren't simply whether it's capable of doing harm. AI can be dangerous or inert for many reasons.
This is the peer review that leads to trusting papers, good job to all for catching this misleading ad-of-a-paper
Gosh, I wish they'd let me try again and again until I got it right when I took the MCAT! ;*[}
"Infinite Monkey Theorem"… Infinite GPT Theorem…
Why did you disappear?
this is pretty cool i wonder if anyone might want it:
#chatgpt #openai #procedurallygeneratedcomment
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
class SecureFreeSpeechMessenger:
    def __init__(self):
        # Maps each username to that user's RSA private key.
        self.users = {}
    def register_user(self, username):
        # Generate a fresh 2048-bit RSA key pair for the new user.
        private_key = rsa.generate_private_key(
            public_exponent=65537,
            key_size=2048
        )
        self.users[username] = private_key
    def send_message(self, sender, recipient, message):
        if recipient not in self.users:
            raise ValueError("Recipient not found.")
        recipient_key = self.users[recipient]
        # Encrypt with the recipient's public key using RSA-OAEP (SHA-256).
        encrypted_message = recipient_key.public_key().encrypt(
            message.encode(),
            padding.OAEP(
                mgf=padding.MGF1(algorithm=hashes.SHA256()),
                algorithm=hashes.SHA256(),
                label=None
            )
        )
        return encrypted_message
    def receive_message(self, recipient, encrypted_message):
        if recipient not in self.users:
            raise ValueError("Recipient not found.")
        private_key = self.users[recipient]
        # Decrypt with the recipient's private key, using the same OAEP parameters.
        decrypted_message = private_key.decrypt(
            encrypted_message,
            padding.OAEP(
                mgf=padding.MGF1(algorithm=hashes.SHA256()),
                algorithm=hashes.SHA256(),
                label=None
            )
        )
        return decrypted_message.decode()
if __name__ == "__main__":
    messenger = SecureFreeSpeechMessenger()
    # Register users
    messenger.register_user("alice")
    messenger.register_user("bob")
    # Sending and receiving messages
    encrypted_msg = messenger.send_message("alice", "bob", "Hello, Bob!")
    decrypted_msg = messenger.receive_message("bob", encrypted_msg)
    print("Decrypted message:", decrypted_msg)
Man, sadly no more videos here. Has he been arrested after a lawsuit because of GPT-4chan? At least I thought that could happen after I watched that video.
Hi,
Could you give me the ID of the Discord where you discuss the latest papers?
thanks a lot
"short update", this "short update" is over 30 minutes long
"how did you beat the Kobayashi Maru?"