There is a subproblem of Friendly AI which is so scary that I usually don’t talk about it…
…This is the problem that if you create an AI and tell it to model the world around it, it may form models of people that are people themselves. Not necessarily the same person, but people nonetheless.
…Suppose you have an AI that is around human beings. And like any Bayesian trying to explain its environment, the AI goes in quest of highly accurate models that predict what it sees of humans.
…A highly detailed model of me, may not be me. But it will, at least, be a model which (for purposes of prediction via similarity) thinks itself to be Eliezer Yudkowsky. It will be a model that, when cranked to find my behavior if asked “Who are you and are you conscious?”, says “I am Eliezer Yudkowsky and I seem to have subjective experiences” for much the same reason I do.
Uh oh. Does this mean that we have to solve the hard problem of consciousness in order to create a truly friendly AI?
For these purposes, we do not, in principle, need to crack the entire Hard Problem of Consciousness—the confusion that we name “subjective experience”. We only need to understand enough of it to know when a process is not conscious, not a person, not something deserving of the rights of citizenship. In practice, I suspect you can’t halfway stop being confused—but in theory, half would be enough.
We need a nonperson predicate—a predicate that returns 1 for anything that is a person, and can return 0 or 1 for anything that is not a person. This is a “nonperson predicate” because if it returns 0, then you know that something is definitely not a person.
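The one-sided structure of such a predicate can be sketched in code. Everything concrete here is a made-up stand-in (the complexity measure and its threshold are hypothetical placeholders, since the real criterion is exactly the open problem); the only property the sketch is meant to show is the asymmetry: returning "not a person" must be a guarantee, while the other answer carries no information either way.

```python
from enum import Enum

class Verdict(Enum):
    NOT_PERSON = 0    # definitely safe: provably not a person
    MAYBE_PERSON = 1  # could be a person; treat the model as off-limits

def nonperson_predicate(model_complexity_bits: int,
                        threshold_bits: int = 10**6) -> Verdict:
    """A toy nonperson predicate that errs only on the side of caution.

    The complexity threshold is a hypothetical placeholder, not a real
    criterion. The point is the contract: NOT_PERSON is a hard guarantee,
    MAYBE_PERSON merely means "I couldn't prove it safe."
    """
    if model_complexity_bits < threshold_bits:
        return Verdict.NOT_PERSON   # too simple to possibly be a person
    return Verdict.MAYBE_PERSON     # unknown, so forbid running this model

# The AI would only be allowed to run models the predicate clears:
assert nonperson_predicate(10_000) is Verdict.NOT_PERSON
assert nonperson_predicate(10**9) is Verdict.MAYBE_PERSON
```

Note that the predicate is allowed to return MAYBE_PERSON for arbitrarily many harmless models; false alarms cost only efficiency, whereas a single false "not a person" could mean simulating someone into existence.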
Which seems still pretty hard.
Yudkowsky wants to build an FAI that itself isn’t conscious, but somehow I suspect this is impossible, to which he replies:
How do you know? Have you solved the sacred mysteries of consciousness and existence?
Nope. But anyway, I also think it wouldn’t be that big a deal if the FAI were conscious. Being God sounds like fun. (Edit: Now I’m very uncertain about this; see the next posts.)
But Yudkowsky disagrees:
“Putting that aside—to create a powerful AI and make it not sentient—I mean, why would you want to?”
Several reasons. Picking the simplest to explain first—I’m not ready to be a father.
Creating a true child is the only moral and metaethical problem I know that is even harder than the shape of a Friendly AI. I would like to be able to create Friendly AI while worrying just about the Friendly AI problems, and not worrying whether I’ve created someone who will lead a life worth living. Better by far to just create a Very Powerful Optimization Process, if at all possible.
And again, Yudkowsky thinks that creating a non-sentient AI is possible, but obviously he can’t be sure about that.
Perhaps there will be no choice but to create an AI which has that which we name “subjective experiences”.
But I think there is reasonable grounds for hope that when this confusion of “sentience” is resolved—probably via resolving some other problem in AI that turns out to hinge on the same reasoning process that’s generating the confusion—we will be able to build an AI that is not “sentient” in the morally important aspects of that.
Yudkowsky even thinks that creating non-sentient AIs is easier than coming up with nonperson predicates:
Consider: In the first case, I only need to pick one design that is not sentient. In the latter case, I need to have an AI that can correctly predict the decisions that conscious humans make, without ever using a conscious model of them! The first case is only a flying thing without flapping wings, but the second case is like modeling water without modeling wetness….
….So why did I talk about the much more difficult case first?
Because humans are accustomed to thinking about other people, without believing that those imaginations are themselves sentient. But we’re not accustomed to thinking of smart agents that aren’t sentient. So I knew that a nonperson predicate would sound easier to believe in—even though, as problems go, it’s actually far more worrisome.
Why would you want to avoid creating a sentient AI? Here is the strongest reason:
You can’t unbirth a child.
…Suppose that you did create a sentient AI.
Suppose that this AI was lonely, and figured out how to hack the Internet as it then existed, and that the available hardware of the world was such, that the AI created trillions of sentient kin—not copies, but differentiated into separate people.
…And suppose that, while these AIs did care for one another, and cared about themselves, and cared how they were treated in the eyes of society—
—these trillions of people also cared, very strongly, about making giant cheesecakes.
And now you’re fucked. Even if we could exterminate or suppress the Cheesers, would it be the right thing to do?
We, the original humans, would have become a numerically tiny minority. Would we be right to make of ourselves an aristocracy and impose apartheid on the Cheesers, even if we had the power?
Would we be right to go on trying to seize the destiny of the galaxy—to make of it a place of peace, freedom, art, aesthetics, individuality, empathy, and other components of humane value?
Or should we be content to have the galaxy be 0.1% eudaimonia and 99.9% cheesecake?
And there is a very easy option to avoid such moral dilemmas:
Don’t create trillions of new people that care about cheesecake.
This may sound obvious to you, but there are actually some people out there (some dudes from the IEET, who else?) who really want to do something like this. They should probably read this paragraph:
Avoid creating any new intelligent species at all, until we or some other decision process advances to the point of understanding what the hell we’re doing and the implications of our actions.
I’ve heard proposals to “uplift chimpanzees” by trying to mix in human genes to create “humanzees”, and, leaving off all the other reasons why this proposal sends me screaming off into the night:
Imagine that the humanzees end up as people, but rather dull and stupid people. They have social emotions, the alpha’s desire for status; but they don’t have the sort of transpersonal moral concepts that humans evolved to deal with linguistic concepts. They have goals, but not ideals; they have allies, but not friends; they have chimpanzee drives coupled to a human’s abstract intelligence.
When humanity gains a bit more knowledge, we understand that the humanzees want to continue as they are, and have a right to continue as they are, until the end of time. Because despite all the higher destinies we might have wished for them, the original human creators of the humanzees lacked the power and the wisdom to make humanzees who wanted to be anything better…
CREATING A NEW INTELLIGENT SPECIES IS A HUGE DAMN #(*%#!ING COMPLICATED RESPONSIBILITY.
G. Tworski: But… But. Must signal. Altruism. Look how. Progressive I am. I’m fair. And just. Morally superior to you. You evil speciesist. I awesome and holy. Applause lights, signal moral superiority, omomomom, anthropomorphism is awesome, omomom.
–FWIW I’m not sure that animal uplift is such a bad idea, but I really do think that most people underestimate the likelihood that non-human animals have amoral, evil, or just plain trivial preferences.