CEV applied to pebblesorters wouldn’t result in an AI that would do what is “right”. CEV applied to humans would. CEV is only a formal procedure to help us find out what our fundamental, “right” values are because we aren’t smart enough to do it alone.
And if that disturbs you, if it seems to smack of relativism – just remember, your universalizing instinct, the appeal of objectivity, and your distrust of the state of human brains as an argument for anything, are also all implemented in your brain. If you’re going to care about whether morals are universally persuasive, you may as well care about people being happy; a paperclip maximizer is moved by neither argument. See also Changing Your Metaethics.
So I still find it disturbing that most alien species would have totally different values, but that’s only because one of my values is that my values should be universal. Why not drop universality across all possible minds and keep laughter, beauty, love, etc.? Dropping laughter, beauty, love, etc. and keeping universality doesn’t work, because there are many possible minds that don’t give a fuck about universality. (Or is that so? Encountering real aliens would be genuinely useful here.)
But if we drop universality, proposals like CEV make even less sense than before. And, you know, I really like the idea of universality…
And if we begin to change or abandon some of our values because they are impossible to fulfill in this crappy universe, well, then why not change all of our values so that we are blissed out by, say, iron atoms?
A very good comment by Yudkowsky:
“Richard, see “Invisible Frameworks”. In thinking that a universal morality is more likely to be “correct”, and that the unlikeliness of an alien species having a sense of humor suggests that humor is “incorrect”, you’re appealing to human intuitions of universalizability and moral realism. If you admit those intuitions – not directly as object-level moral propositions, but as part of the invisible framework used to judge between moral propositions – you may as well also admit intuitions like “if a moral proposition makes people happier when followed, that is a point in its favor” into the invisible framework as well. In fact, you may as well admit laughter. I see no basis for rejecting laughter and accepting universalizability.
In fact, while I accept “universalizability among humans” as a strong favorable property where it exists, I reject “universalizability among all possible minds” because this is literally impossible of fulfillment.
And moral realism is outright false, if interpreted to mean “there is an ontologically fundamental property of should-ness”, rather than “once I ask a well-specified question the idealized abstracted answer is as objective as 2 + 2 = 4”.
Laughter and happiness survive unchanged. Universalizability and moral realism must be tweaked substantially in their interpretation, to fit into a naturalist and reductionist universe. But even if this is not the case, I see no reason to grant the latter two moral instincts an absolute right-of-way over the first, as we use them within the invisible background framework to argue which moral propositions are likely to be “correct”.
If you want to grant universalizability and realism absolute right-of-way, I can but say “Why?” and “Most human minds won’t find that argument convincing, and nearly all possible minds won’t find that argument even persuasive, so isn’t it self-undermining?””