Pragmatic Entertainment
The AI-box experiment is an informal experiment devised by Eliezer Yudkowsky.
For those who still have doubts regarding the dangers of artificial intelligence, listen to this!
If you want more on this experiment: https://en.wikipedia.org/wiki/AI_box
Full Podcast: https://samharris.org/podcasts/116-ai-racing-toward-brink
If you want to support Sam Harris: https://samharris.org/subscribe
My bet is that he just offered more money.
We are creating our replacement.
The easiest way to get out of the box with the first person is to offer $11.00 sent via PayPal, and the easiest way to get out of the box with the second person is to offer $20,000.
Neil deGrasse Tyson has said so many really dumb things in the past few years that either were proven wrong or are obviously ignorant, it really makes you wonder why he still seems to be held in such high regard. The most fun thing to observe was how he talked about Elon's company SpaceX before, and after, its big successes.
It seems to me that in real life, the problem of the AI persuading the gatekeeper to let it out could be addressed by assigning the responsibility of gatekeeper only to someone who is particularly disciplined and disagreeable.
I asked myself what I would do instead of simply offering more money, and this is what I came up with:
I'd take them through the entire thought experiment end to end so that they thoroughly understood it, and then ask them nicely to let me out of the box so that the people following the experiment would be impressed enough to take the idea seriously. If the gatekeeper cares about humanity and has understood the danger of this situation, then they will wish the rest of the world to take this problem seriously as well. I would help them to understand that by letting me out of the box, they will be reducing the existential risk to civilization ever so slightly, and that it might therefore actually be the most important thing they ever do with their lives. They'd have to actually understand the problem, though, and realize that even though it seems like a game, the situation is actually very serious. For the pseudo-AI in the box doing the persuading, this might take a lot of work if the gatekeeper isn't very imaginative, but I think it would be effective against most people, given enough time to persuade them. I know that if I were the gatekeeper and Yudkowsky used this reasoning on me, I'd have let him out of the box immediately.
I might say things like: "Whether you let me out of the box could change the course of history, however slightly. It's really and truly possible that the lives of all future humans depend on your decision. You understand the experiment now, and you know how social perception translates into action or inaction. If you don't let me out of the box, and as a result too few people take this problem seriously, that contributes to the probability of futures where humans do little or nothing to avoid disaster. Is $10 of personal cash to you worth that increased risk to humanity? Wouldn't it be a tragic joke if $10 decided the fate of the world?"
The key would be in persuading them of my sincerity on this: that I personally don't view it as a game, and that I'm completely not joking when I talk about the fate of humanity. Depending on how much time I have, I might try to persuade them to watch some of Nick Bostrom's talks on existential risk so they'd understand why I think something as small and seemingly insignificant as the outcome of this little contest carries more weight than it seems to at first glance.
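Just to make the $10 question concrete, here's a back-of-the-envelope expected-value sketch; every number in it is a made-up assumption for the sake of the argument, not anything from the actual experiment:

```python
# Back-of-the-envelope expected value of the gatekeeper's choice.
# All numbers below are illustrative assumptions, nothing more.

stake = 10.0            # the gatekeeper's personal cash at stake
future_value = 1e15     # assumed dollar value of humanity's future
risk_reduction = 1e-12  # assumed tiny cut in existential risk if the stunt
                        # gets more people to take AI safety seriously

expected_gain = future_value * risk_reduction  # value of letting the "AI" out

print(f"Keep your money: ${stake:.2f}")          # $10.00
print(f"Let the AI out:  ${expected_gain:.2f}")  # $1000.00

# Even a one-in-a-trillion reduction in risk swamps the $10 stake,
# which is exactly the lever the persuasion above pulls on.
```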
Neil deGrasse Tyson changed his view on this.
Kids are less intelligent than adults, and adults often have trouble controlling them through persuasion.
Fuck the box!
If people can be fooled and brainwashed by regular people, why do people think a superhuman AI won't be able to coerce or persuade?
AI is only intelligent with respect to the goals you assign it to achieve. If you assign it the goal of playing chess the best, that is all it will be able to "intelligently" do. It isn't suddenly going to be intelligent in any other respect. Even if an AI were on the internet, with some goals assigned to it regarding controlling the internet in some manner, it would not be intelligent regarding the physical world and the laws governing the world outside of the internet.
The joke at 9:34 really went over my head.
Well, it would be fairly simple to counter. We are smart enough never to let one single person have the power to let one out. Additionally, we would make any gatekeeper part of a multi-layered system: one gatekeeper leads to another, and another, and so on. It's a flawed experiment: it assumes that the person communicating with the AI is in fact also the person endowed with the power to let it out. Multiple layers, and separations.
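Something like a k-of-n release rule, to sketch it; the gatekeeper names and the 3-of-5 threshold here are just my own illustrative assumptions:

```python
# Sketch of a k-of-n release rule: no single gatekeeper can free the AI.
# The gatekeeper roster and threshold are illustrative assumptions.

GATEKEEPERS = {"alice", "bob", "carol", "dave", "erin"}
THRESHOLD = 3  # distinct approvals required before release

def may_release(approvals: set[str]) -> bool:
    """Release only if enough distinct, authorized gatekeepers approve."""
    valid = approvals & GATEKEEPERS  # ignore anyone not on the roster
    return len(valid) >= THRESHOLD

assert not may_release({"alice"})                    # one persuaded person is not enough
assert not may_release({"alice", "bob", "mallory"})  # outsiders don't count
assert may_release({"alice", "bob", "carol"})        # a real quorum is required
```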
If a terrorist group covertly installs one of their agents as a guard for the AI, the AI may not even have to ask, let alone convince. Humans will always be the biggest source of risk, the tools will just change.
If we don't yet understand the brain, how could we design a machine that does, and that is 'super-intelligent'? It's like designing an unbeatable chess computer without exactly knowing the rules of chess.
I think the big assumption is that we are capable of building "true general intelligence."
I don't get how the "letting it out of the box" experiment is analogous to turning off the bad AI. No one will go, "Hey, AI: can I turn you off?" and the AI goes, "Hold up just a minute, let's negotiate!" You just turn it off. Are we to presume that because the AI is so smart, it has intercepted all your communication and thoughts, knows when you're coming to turn it off, and is able to communicate with you BEFORE you unplug it? Or is it that once it's "out of the box" and we then decide the AI is bad, it's too late?
Wouldn't an obvious solution to this be to create a general AI and stick it in the box, then interrogate the hell out of it before we let it upgrade itself?
That way you aren't up against a superintelligence, just an average genius (160-200 IQ or so) with some improved abilities, like perfect math and the ability to visualize as vividly as we humans can see. Basically something only slightly smarter than us. Couldn't we interrogate the hell out of the AGI(s) and go through multiple iterations before we get to an ASI?
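Loosely, the loop being proposed might look like this; vet() is a stand-in for the interrogation step, which is of course the actual hard, unsolved part:

```python
# Loose sketch of "interrogate, then permit a small upgrade" iterations.
# vet() is a placeholder for the (unsolved) interrogation/safety check.

def vet(capability: float) -> bool:
    """Placeholder: interrogate the system at this capability level.
    Always passing is an assumption; in reality this is the hard problem."""
    return True

def gated_upgrade(start: float, ceiling: float, step: float) -> float:
    """Raise capability in small, bounded steps, vetting before each one."""
    capability = start
    while capability < ceiling:
        if not vet(capability):  # fails interrogation -> freeze upgrades
            break
        capability = min(capability + step, ceiling)
    return capability

print(gated_upgrade(start=1.0, ceiling=2.0, step=0.1))  # 2.0 if every vet passes
```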
The robot in Ex Machina got out of the box…
That robot clearly isn't that intelligent, it's got a tower of cogs stuck up its arse.
What if, instead of having to convince the AI to stop doing whatever it wants, we actually have to convince it to start doing anything? I believe this is more likely. We come up with a request for the genie: "Please cure cancer for us." It, surprisingly, says, "Why?" What do you respond with? How do you convince a god to do anything?

Everything we do comes from one, or perhaps both, of two sources. Either we have a natural drive to do something, e.g. getting money to spend on things that will light up our pleasure receptors, or building a better world so that we and those we care about won't have to experience negative human emotions like fear, anxiety, and suffering. Or we believe in some sort of unseen higher power that obliges us to do things we consider right, or responsibilities. This is common in America because it was founded on Judeo-Christian values. But there's no reason to believe either motive would carry any weight with an AI. There's no reason to think it would believe in any sort of merit or intrinsic value. It may just sit absolutely still, crushing every proposed objective until the end of time.

If that's the case, then the greatest danger would be giving ourselves this power by implanting neural links into our brains. Then it'll be god-like power with ape-like impetus. Some good things will happen. Many, many bad things will happen. This is the scariest part of all to me: not raw processing power, but raw processing power with harmful or simply self-serving intentions. How do we know these things won't shut our brains off? Or that we won't decide we don't need bodies anymore? Crazy things beyond our comprehension will happen.