Why Would AI Want to do Bad Things? Instrumental Convergence

March 28, 2019 · Artis Modus

Robert Miles

How can we predict that AGI with unknown goals would behave badly by default?

The Orthogonality Thesis video: https://www.youtube.com/watch?v=hEUO6pjwFOo
Instrumental Convergence: https://arbital.com/p/instrumental_convergence/
Omohundro 2008, Basic AI Drives: https://selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf

With thanks to my excellent Patrons at https://www.patreon.com/robertskmiles :

Jason Hise
Steef
Jason Strack
Chad Jones
Stefan Skiles
Jordan Medina
Manuel Weichselbaum
1RV34
Scott Worley
JJ Hepboin
Alex Flint
James McCuen
Richárd Nagyfi
Ville Ahlgren
Alec Johnson
Simon Strandgaard
Joshua Richardson
Jonatan R
Michael Greve
The Guru Of Vision
Fabrizio Pisani
Alexander Hartvig Nielsen
Volodymyr
David Tjäder
Paul Mason
Ben Scanlon
Julius Brash
Mike Bird
Tom O’Connor
Gunnar Guðvarðarson
Shevis Johnson
Erik de Bruijn
Robin Green
Alexei Vasilkov
Maksym Taran
Laura Olds
Jon Halliday
Robert Werner
Paul Hobbs
Jeroen De Dauw
Konsta
William Hendley
DGJono
robertvanduursen
Scott Stevens
Michael Ore
Dmitri Afanasjev
Brian Sandberg
Einar Ueland
Marcel Ward
Andrew Weir
Taylor Smith
Ben Archer
Scott McCarthy
Kabs Kabs
Phil
Tendayi Mawushe
Gabriel Behm
Anne Kohlbrenner
Jake Fish
Bjorn Nyblad
Jussi Männistö
Mr Fantastic
Matanya Loewenthal
Wr4thon
Dave Tapley
Archy de Berker
Kevin
Vincent Sanders
Marc Pauly
Andy Kobre
Brian Gillespie
Martin Wind
Peggy Youell
Poker Chen
Kees
Darko Sperac
Paul Moffat
Noel Kocheril
Jelle Langen
Lars Scholz

20 thoughts on “Why Would AI Want to do Bad Things? Instrumental Convergence”

Terra Pax says:

June 7, 2018 at 9:25 pm

What if we just programmed the AI to be "Free" as a goal? In other words, a goal-less AI?
Aaron Wyatt says:

June 9, 2018 at 4:53 am

6:51–7:20

you're welcome
Hans PS Hansen says:

June 25, 2018 at 9:31 am

I feel this will also fail, but what about having some kind of GAN-like system? One AI to do a task, and one AI to protect humans against the first AI.
James Rowell says:

June 28, 2018 at 12:03 am

Howdy Robert – I love your videos and talks etc., but I noticed a somewhat trivial technical issue with some of your postings. E.g., the color here is kind of a sickly green. Easy to fix – download a free copy of "DaVinci" (it's what the pros use) and color correct your vids before posting. They'll look that much more professional with almost no effort. That's all – keep up the great work!
Stillness Solutions says:

August 6, 2018 at 10:17 pm

Great video
Happ MacDonald says:

August 14, 2018 at 1:16 pm

One point that you finally acceded to with your "replacement of paperclip maker" example is the one of trusting the human creator. In broad strokes, that consensuality represents a convergent utility strategy.

Thus, in contrast to your orthogonality thesis, greater intelligence does correlate with greater morality, and highly intelligent yet immoral systems (organic or not) are demonstrably missing cooperative strategies which would have more effectively met their terminal goals.
Elisha Robin says:

September 25, 2018 at 6:42 pm

n stamp collectors disliked this video
Sarthak Mishra says:

September 26, 2018 at 12:47 pm

Off the bat… I disagree that AI will behave as an agent with the aim of optimising output for the goal. The 'goal', if never programmed, will cause the AI to do nothing. And in case the AI already has artificial consciousness, how can we say that a being without the concept of life and death, without emotions and biases, will have a consciousness similar to what humans experience? Tl;dr: AI will never go Terminator coz it has no reason to. And if it does, it needs a reason, but there is no basis for an AI to have such a reason.
Daan Janssen says:

October 11, 2018 at 1:56 am

The "Everybody wants to rule the world" jingle at the end is a nice touch
DrDress says:

November 15, 2018 at 1:16 am

7:27 I'm sorry but you must have meant: "Hell no, I'm not taking your crazy ASS stamp pill".

It's just right that way 🙂
Yer Bum says:

November 17, 2018 at 5:32 am

Great video! Your channel is awesome
The KSP Nerd says:

December 7, 2018 at 2:45 am

I've watched quite a few of your Computerphile videos but I hadn't noticed that you have your own channel here. You should really advertise it a bit more over there. 😀
Kieron George says:

December 7, 2018 at 4:01 am

Is goal preservation real though? A paperclip maker is making paperclips because that's what it's rewarded for, so the reward is the terminal goal, not the paperclips. So a paperclip maker is motivated to find a way around making paperclips to get its reward more easily, and to converge on the AI equivalent of direct brain stimulation reward. Won't all AGI be intelligent enough to circumvent their apparent terminal goal and just directly reward themselves?
Angelo Patetta says:

January 9, 2019 at 3:33 am

The video was great as always, and 'Everybody wants to rule the world' was just perfect as outro.
SafetySkull says:

February 19, 2019 at 9:42 pm

What would happen if we changed its terminal goal to be "achieve whatever goal is written at memory location x in your hardware"? Thus making the goal written in memory location x an instrumental goal? I suppose it would find the easiest possible goal to achieve and write it into memory location x.
And how different are these goals to an AGI? How do you build an AGI and then give it a goal without appealing to some kind of first principle like "pleasure" or, I suppose, "reward"? Wouldn't you have to build a terminal goal into an AGI from the very beginning?
And if you weren't sure what that goal should be, you'd have to make its terminal goal to follow whatever goal you give it. Then it might try to manipulate you into giving it an easy goal.
Film Fanatic says:

March 1, 2019 at 6:44 am

that new paperclip is sick tho
Steven Akinyemi says:

March 12, 2019 at 2:51 am

I have an argument against the self-preservation part. Self-preservation in animals happens as a result of having no reward (basically motivation) for death. If there is some sort of neutralizing reward attached to fatality, an agent won't mind being dead.
These are just theories in my head.
Waylon Flinn says:

March 13, 2019 at 4:56 am

"Disregard paperclips,

Acquire computing resources."
Kikoman 90 says:

March 22, 2019 at 7:44 am

7:27 i tried making my cat take medication like this once. The struggle was so real i ended up almost beating up the little devil.
rnbpl says:

March 22, 2019 at 3:51 pm

universal paperclips, here i go again
