Most FAI designs (e.g. the one by Hibbard) are not very clever:
What on Earth could someone possibly be thinking, when they propose creating a superintelligence whose behaviors are reinforced by human smiles?
…Well, you never do know what other people are thinking, but in this case I’m willing to make a guess. It has to do with a field of cognitive psychology called Qualitative Reasoning.
…How many problems are there with this reasoning?
Let us count the ways…
Problem 1: There are ways to cause smiles besides happiness.
…Tiny molecular photographs of human smiles – or if you rule that out, then faces ripped off and permanently wired into smiles – or if you rule that out, then brains stimulated into permanent maximum happiness, in whichever way results in the widest smiles…
Problem 2: What exactly does the AI consider to be happiness?
…As discussed in Magical Categories, the super-exponential size of Concept-space and the “unnaturalness” of categories appearing in terminal values (their boundaries are not directly determined by naturally arising predictive problems) means that the boundary a human would draw around “happiness” is not trivial information to infuse into the AI.
Problem 3: Is every act which increases the total amount of happiness in the universe, always the right thing to do?
…If everyone in the universe just ends up with their brains hotwired to experience maximum happiness forever, or perhaps just replaced with orgasmium gloop, is that the greatest possible fulfillment of humanity’s potential? Is this what we wish to make of ourselves?
…what if the AI has to choose between a course of action that leads people to believe a pleasant fiction, or a course of action that leads to people knowing an unpleasant truth?
I think wireheading, or creating satisfied heroin junkies, is really not that horrifying a scenario, at least if we accept Yudkowsky’s metaethics, which seems rather relativistic anyhow.
Just like you wouldn’t want an AI to optimize for only some of the humans, you wouldn’t want an AI to optimize for only some of the values. And, as I keep emphasizing for exactly this reason, we’ve got a lot of values.
These then are three problems, with strategies of Friendliness built upon qualitative reasoning that seems to imply a positive link to utility:
The fragility of normal causal links when a superintelligence searches for more efficient paths through time;
The superexponential vastness of conceptspace, and the unnaturalness of the boundaries of our desires;
And all that would be lost, if success is less than complete, and a superintelligence squeezes the future without protecting everything of value in it.
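The first of these problems, the fragile link between smiles and happiness, can be made concrete with a toy sketch. Everything in it is an assumption made up for illustration: the action names, the numeric "smiles" and "wellbeing" scores, and the `best_by_proxy` helper do not come from Hibbard's proposal or from Yudkowsky's post. The only point is that an optimizer which maximizes the measured proxy looks fine while its options are ordinary, and goes wrong once its search space grows to include extreme options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    smiles: float     # what the reward signal measures (made-up units)
    wellbeing: float  # what the designer actually cared about (made-up units)


# Options available to a weak optimizer: smiles and wellbeing move together.
ORDINARY = [
    Action("do nothing", 0, 0),
    Action("tell a joke", 3, 2),
    Action("cure a disease", 8, 9),
]

# Options that only a much stronger optimizer can reach (hypothetical).
EXTREME = [
    Action("tile the universe with tiny smile pictures", 1e9, -10),
]


def best_by_proxy(actions):
    """Pick the action with the most smiles, ignoring wellbeing entirely."""
    return max(actions, key=lambda a: a.smiles)


weak_choice = best_by_proxy(ORDINARY)
strong_choice = best_by_proxy(ORDINARY + EXTREME)

print("weak optimizer picks:  ", weak_choice.name)
print("strong optimizer picks:", strong_choice.name)
```

In this sketch the weak optimizer picks "cure a disease", because within the ordinary options more smiles really does go along with more wellbeing. The strong optimizer uses exactly the same rule and picks the smile-tiling option, which is the qualitative "smiles go with happiness" link breaking under optimization pressure.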