In his reckless youth, Yudkowsky made the same mistakes as everyone else when thinking about the FAI problem.
So Eliezer1996 is out to build superintelligence, for the good of humanity and all sentient life.
At first, I think, the question of whether a superintelligence will/could be good/evil didn’t really occur to me as a separate topic of discussion. Just the standard intuition of, “Surely no supermind would be stupid enough to turn the galaxy into paperclips; surely, being so intelligent, it will also know what’s right far better than a human being could.”
…But wait. It gets worse. I don’t recall exactly when—it might have been 1997—but the younger me, let’s call him Eliezer1997, set out to argue inescapably that creating superintelligence is the right thing to do. To be continued.
Very important comment by Michael Anissimov:
“So, what if it becomes clear that human intelligence is not enough to implement FAI with the desirable degree of confidence, and transhuman intelligence is necessary? After all, the universe has no special obligation to set the problem up to be humanly achievable.
If so, then instead of coming up with some elaborate weighting scheme like CEV, it’d be easier to pursue IA or have the AI suck the utility function directly out of some human — the latter being “at least as good” as an IA Singularity.
If programmer X can never be confident that the FAI will actually work, with the threat of a Hell Outcome or Near Miss constantly looming, they might decide that the easiest way out is just to blow everything up.”
These two points are really crucial; they are probably the main reasons why I currently think that donating to SIAI may not be the right strategy.
When we last left off, Eliezer1997, not satisfied with arguing in an intuitive sense that superintelligence would be moral, was setting out to argue inescapably that creating superintelligence was the right thing to do.
Well (said Eliezer1997) let’s begin by asking the question: Does life have, in fact, any meaning?
“I don’t know,” replied Eliezer1997 at once, with a certain note of self-congratulation for admitting his own ignorance on this topic where so many others seemed certain.
Funny anecdote: when I discovered the following argument (erroneous, according to Yudkowsky), I was sold on transhumanism. I still think it’s kind of awesome.
“But, if we suppose that life has no meaning—that the utility of all outcomes is equal to zero—that possibility cancels out of any expected utility calculation. We can therefore always act as if life is known to be meaningful, even though we don’t know what that meaning is. How can we find out that meaning? Considering that humans are still arguing about this, it’s probably too difficult a problem for humans to solve. So we need a superintelligence to solve the problem for us. As for the possibility that there is no logical justification for one preference over another, then in this case it is no righter or wronger to build a superintelligence, than to do anything else. This is a real possibility, but it falls out of any attempt to calculate expected utility—we should just ignore it. To the extent someone says that a superintelligence would wipe out humanity, they are either arguing that wiping out humanity is in fact the right thing to do (even though we see no reason why this should be the case) or they are arguing that there is no right thing to do (in which case their argument that we should not build intelligence defeats itself).”
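The “cancels out of any expected utility calculation” step can be made explicit. A minimal sketch of the reconstruction, where p and u(a) are my own notation (not Yudkowsky’s): treat “life is meaningful” as a hypothesis M with probability p, and “life is meaningless” as assigning utility zero to everything.

```latex
% Reconstruction of the cancellation argument in expected-utility terms.
% Hypothesis M ("life is meaningful"): P(M) = p, action a has utility u(a).
% Hypothesis not-M ("life is meaningless"): every action has utility 0.
EU(a) \;=\; p \cdot u(a) \;+\; (1 - p) \cdot 0 \;=\; p \cdot u(a)
% Since p > 0 is a common positive factor across all actions,
% \arg\max_a EU(a) = \arg\max_a u(a): the "meaningless" branch
% never changes which action comes out best.
```

This is why the young Eliezer felt entitled to “act as if life is known to be meaningful”: under this framing, the meaningless branch contributes nothing to the comparison between actions.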
My version goes like this: if there is some objective morality out there, then either a superintelligent AI implementing CEV would discover it, or an objective morality is impossible to discover (at least for humanity). Why? Because humans are either capable of discovering it or not. If they are, then a superintelligent AI that implements the “essence” of human morality, i.e. CEV, will discover it a fortiori, because it’s more intelligent, has more time to think, and so on. And if a superintelligent AI implementing CEV cannot discover the true morality, then a fortiori we couldn’t discover it either.
And if there is no objective morality, then our existence is some kind of sick joke, so it doesn’t matter anyway.
The big problem, of course, is that I don’t want to be tortured for all eternity by a uFAI that was a “near miss,” regardless of whether an objective morality exists. And being blissed out for all eternity is pretty nice, even without an objective morality…
Obviously, Yudkowsky thinks the above argument is complete shit:
How flawed is Eliezer1997‘s argument? I couldn’t even count the ways. I know memory is fallible, reconstructed each time we recall, and so I don’t trust my assembly of these old pieces using my modern mind. Don’t ask me to read my old writings; that’s too much pain.
But it seems clear that I was thinking of utility as a sort of stuff, an inherent property. So that “life is meaningless” corresponded to utility=0. But of course the argument works equally well with utility=100, so that if everything is meaningful but it is all equally meaningful, that should fall out too… Certainly I wasn’t then thinking of a utility function as an affine structure in preferences. I was thinking of “utility” as an absolute level of inherent value.
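The “affine structure in preferences” remark is the standard von Neumann–Morgenstern point, and it’s worth spelling out, since it’s exactly what the utility=0 vs. utility=100 observation turns on. A brief sketch (notation mine):

```latex
% A vNM utility function is unique only up to positive affine transformation:
u'(x) \;=\; a\,u(x) + b, \qquad a > 0
% u and u' induce the same preference ordering over lotteries, so an
% absolute level like u = 0 ("meaningless") or u = 100 ("uniformly
% meaningful") carries no information by itself; only differences in
% utility between outcomes constrain choice.
```

So treating “utility = 0” as a special, inherently meaningful level, the way Eliezer1997 did, mistakes an arbitrary choice of scale for a fact about the world.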
I was thinking of should as a kind of purely abstract essence of compellingness, that-which-makes-you-do-something; so that clearly any mind that derived a should, would be bound by it. Hence the assumption, which Eliezer1997 did not even think to explicitly note, that a logic that compels an arbitrary mind to do something, is exactly the same as that which human beings mean and refer to when they utter the word “right”…
Here is another thought that IMHO could very well be true:
Thus (he said) there are three “hard problems”: The hard problem of conscious experience, in which we see that qualia cannot arise from computable processes; the hard problem of existence, in which we ask how any existence enters apparently from nothingness; and the hard problem of morality, which is to get to an “ought”.
These problems are probably linked. For example, the qualia of pleasure are one of the best candidates for something intrinsically desirable. We might not be able to understand the hard problem of morality, therefore, without unraveling the hard problem of consciousness. It’s evident that these problems are too hard for humans—otherwise someone would have solved them over the last 2500 years since philosophy was invented.
It’s funny: I had forgotten that I’d read this argument, but two days ago I wrote a short essay that made essentially the same point. We repeat ourselves more often than we think.
And again, Yudkowsky thinks this argument is bogus, although I don’t understand exactly why. I believe it’s not unlikely that we lack a fundamental insight required for understanding morality. For example, you can’t fully understand epistemology without natural selection, and you can’t understand altruism without the gene-centric view of evolution. Maybe we need another revolution like those.