The fact that I find arguments from universality or objectivity compelling is itself a fact of my own peculiar moral framework:
But—of course—when a Pebblesorter regards “13 and 7!” as a powerful metamoral argument that “heaps of 91 pebbles” should not be a positive value in their utility function, they are asking a question whose answer is the same in all times and all places. They are asking whether 91 is prime or composite. A Pebblesorter, perhaps, would feel the same powerful surge of objectivity that Roko feels when Roko asks the question “How many agents have this instrumental value?” But in this case it readily occurs to Roko to ask “Why care if the heap is prime or not?” As it does not occur to Roko to ask, “Why care if this instrumental goal is universal or not?” Why… isn’t it just obvious that it matters whether an instrumental goal is universal?
The Pebblesorter’s framework is readily visible to Roko, since it differs from his own. But when Roko asks his own question—”Is this goal universally instrumental?”—he sees only the answer, and not the question; he sees only the output as a potential variable, not the framework.
And this difficulty of the invisible framework is at work, every time someone says, “But of course the correct morality is just the one that helps you survive / the one that helps you be happy“—implicit there is a supposed framework of meta-moral arguments that move you. But maybe I don’t think that being happy is the one and only argument that matters.
But what happens if we confess that such thinking can be valid? What happens if we confess that a meta-moral argument can (in its invisible framework) use the universalizing instinct? Then we have… just done something very human. We haven’t explicitly adopted the rule that all human instincts are good because they are human—but we did use one human instinct to think about morality. We didn’t explicitly think that’s what we were doing, any more than PA quotes itself in every proof; but we felt that a universally instrumental goal had this appealing quality of objective-ness about that, which is a perception of an intuition that evolved. This doesn’t mean that objective-ness is subjective. If you define objectiveness precisely then the question “What is objective?” will have a unique answer. But it does mean that we have just been compelled by an argument that will not compel every possible mind.
So, my desire for universal and non-relative moral norms is itself totally subjective? Hm, that’s disturbing. But sure, there are possible minds who would find the existence of an objective morality horrible. But if there existed an objective morality then these minds would be evil.
It’s my personal, subjective preference to adhere to an objective morality. Ok, granted. But if there existed such a morality then the universe itself would agree with me, so my personal preference wouldn’t be that subjective after all.
If it’s okay to be compelled by the appealing objectiveness of a moral, then why not also be compelled by…
…life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.; truth; knowledge and true opinions of various kinds, understanding, wisdom…
Such values, if precisely defined, can be just as objective as the question “How many agents do X?” in the sense that “How much health is in this region here?” will have a single unique answer. But it is humans who care about health, just as it is humans who care about universalizability.
Still. I think Yudkowsky is on to something…
Marcello Herreshof demolishes Roko’s proposal:
“Exactly. But you can come up with an much harsher example than aimlessly driving a car around:
In general it seems like destroying all other agents with potentially different optimization criteria would be have instrumental value, however, killing other people isn’t, in general, right, even if, say, they’re your political adversaries.
And again, I bet Roko didn’t even consider “destroy all other agents” as a candidate UIV because of anthropomorphic optimism.
Incidentally Eliezer, is this really worth your time?
I thought the main purpose of your taking time off AI research to write overcoming bias was to write something to get potential AI programmers to start training themselves. Do you predict that any of the people we will eventually hire will have clung to a mistake like this one despite reading through all of your previous series of posts on morality?
I’m just worried that arguing of this sort can become a Lost Purpose.”
Wow, maybe these guys are just too many levels above mine. Some people just find Yudkowsky’s metaethics to be obviously correct. And they are usually very smart. OTOH they mostly disagree with the “ethical unity of mankind”-hypothesis…
Anyway, this was the last post of the meta-ethics sequence.