AI-generated (Stable Diffusion) image of "cyclon writing with a pen".

The sporadic blog of David J A Cooper. I write sci-fi, teach software engineering, and occasionally say related (or not related) things.

Check out Nova Sapiens: The Believers.

The Philosophy of Nova Sapiens

Isaac Asimov wrote[1] that robots in science fiction fall into two categories: menace and pathos[2]. Examples of both abound: The Terminator, The Matrix and Harlan Ellison’s I Have No Mouth, and I Must Scream present malevolent AIs for us to fear. Meanwhile, Astro Boy, Ex Machina and Martha Wells’s Murderbot Diaries invite us to think of their robot characters as real people.

Asimov himself promoted a third category, where robots obey ironclad laws (“don’t harm humans”, etc.), giving rise to logical but unexpected outcomes that Asimov crafted stories around. For all his genius, though, the Asimovian robots now seem the least plausible type. They have a fatal flaw (several, in fact, but I’ll mention one): the “ironclad” laws they are supposed to follow are meaningless without the robot first having a highly developed and nuanced understanding of the world. That’s a perilous foundation for something assumed to be virtually infallible. How does the robot come to be equipped with such a fully-formed and consistently unerring picture of things, without even the benefit of experiencing the world first? It’s an extraordinary ask of AI researchers to “program in” such a comprehensive understanding when we humans haven’t nailed it down ourselves (and it’s not clear that we ever will).

In the world of Nova Sapiens, I’m interested in both of Asimov’s original categories at once: robots-as-menace, and robots-as-pathos. The AI threat is more complex and interesting if not all AIs want to go along with it. And there are far greater stakes in the exploration of a robot character’s personhood when it can be called into question by acts of savagery committed by other robots. The whole created by the combination of both themes seems bigger than the sum of its parts.

AI consciousness holds particular fascination for me. It’s all very well to look at AI characters from the outside and query their capabilities and motives (and perform Turing tests, for instance), but what about the “inside” view? The picture must look quite different from the point of view of an actual conscious AI, even one possessing human-equivalent mental faculties. What would an AI make of the AI threat? What would you make of the idea, or the revelation, that your own kind is plotting genocide? And how would you react to the questioning of your own consciousness?

But I also want to tell this story in a world that is plausibly the same one we actually live in, albeit some way into the future. There must be reasons for why things happen, or do not happen, founded in an honest estimation of human behaviour, and in some guesswork as to how we might actually create conscious AI, and what dead-ends we might go down to get there.

We tend to call it “hard sci-fi”[3] when fictional technologies are deliberately anchored in real-world scientific understanding. This is a moving target, though. Once-plausible sci-fi ideas often become less (or occasionally more) plausible as actual scientific and technological progress shines light on them. Sci-fi sometimes feels stuck in “past futures”, where age-old tropes and conventions are repeated without re-checking what the science of today actually tells us. I think sci-fi, and particularly hard sci-fi, should try to keep alive the connection to contemporary science and technology, and re-evaluate its own ideas as things change. And, to be fair, there is a lot of good sci-fi that does keep up to date.

Actual science and technology isn’t enough to tell a story though. There is a second-order effect: what people believe (or anticipate) about science and technology. These beliefs set off chains of actions and reactions that we can tell stories about. Technology’s societal effects come about through these beliefs and reactions, not just through the objective reality of the technology itself.

Thus, I’ve tried to understand what people think of robots and AI, as well as what future robots and AI might actually be like. Conveniently, both involve asking the same questions. Will we succeed in creating artificial general intelligence (AGI), that (like humans) can figure things out in general (not just in narrowly-defined situations)? Will we create conscious AI, with its own sense of inner space, and its own qualia? Will we create AIs that attempt to destroy or enslave us?

There’s a complicated spectrum of opinion with (as I see it) five basic attitudes: scepticism, utopianism, doomerism, anthropomorphism, and unknowabilism[4]. Each is partly wrong, and partly right. They’re all important for understanding the big picture.

1. Scepticism

Some people doubt whether we can create real AGI or AI consciousness.

Obviously a lot of “narrow” AI already exists, but our present-day AI can seem idiotic when compared to the promises made around it, and I think this probably stokes a lot of scepticism towards more advanced AI. Self-driving cars don’t seem capable of handling real-world driving conditions. Automated moderation algorithms on social media betray their lack of understanding by removing innocent posts, while often leaving vicious threats untouched. Voice recognition and face recognition also get things wrong on a regular basis. From one point of view, AI progress has been impressive, but the problems we’re currently trying to solve with AI are quite pathological. And yet these pathological problems are themselves dwarfed by the challenges and dilemmas of AGI and machine consciousness.

Science fiction itself (or the marketing of it) may also feed into the sceptical viewpoint, through the endlessly rehashed, cultish tropes of shiny metal figures with non-existent social skills, or electronic super-villains. If this is what springs to mind when you hear of robots and AI, then you may think of them as just another kind of magical creature, alongside vampires, zombies, elves, dragons, etc.

Scepticism seems quite justified, especially towards claims that advanced AI is “right around the corner”. But time frames are everything. We can do ridiculously difficult things, given enough time, assuming the laws of nature will allow it. This is clear from the past accomplishments of civilisation. So let us imagine technological progress unfolding over decades and even centuries, not just the “in five years’ time” that seems to form the upper limit of what so many people care about. If we have time to overcome the complexity of the task, then we should expect to eventually get it done.

Not everyone agrees, though. The Nobel Prize-winning physicist Sir Roger Penrose has written two books, The Emperor’s New Mind and Shadows of the Mind, arguing against the possibility of computers achieving human-level intelligence. He attempts to prove this mathematically (following in the footsteps of philosopher John Lucas), and argues that the human brain must be drawing on some aspect of the physical universe beyond the capabilities of computers. However, the Penrose-Lucas argument is not widely accepted, and Penrose’s purported proof is disputed (and seems problematic to me, for what it’s worth). Even then, Penrose explicitly permits the possibility of some form of artificial mind, just not one based solely on present-day computing principles.

I also note that the Discovery Institute[5] is an ardent sceptic of AGI and machine consciousness, though I suspect they’ve started with their preferred conclusion and then searched backwards for rationalisations. What I have seen of Robert J Marks’ Non-Computable You is an illustration of the difficulty of human-level AI, but “difficult” and “non-computable” are utterly different concepts where predicting the future is concerned. (The DI may be worried that people will no longer care about divine salvation if they think they can achieve Earthly pseudo-immortality by having their consciousness uploaded into a machine, as discussed below. If I were worried about that, I would try to engage people on the nuances and differences between these two things, and not just assert the point-blank impossibility of one of them.)

2. Utopianism and the Singularity

Here I’m considering the ultra-optimistic views on what technology will deliver to us, up to and including having human consciousness “uploaded” into machine form, to cheat death and attain higher levels of being. Much of Ken Liu’s The Hidden Girl and Other Stories (which is excellent!) is concerned with this.

The word “singularity” comes from the idea that AI development will feed back on itself; more progress will facilitate faster progress still, until we reach a point of total unpredictability at which anything is possible.

I’m sceptical of this view (even if it makes for some great sci-fi), because it tries to predict unknown future scientific progress under the assumption that generic progress has quantifiable causes and effects. Neither intelligence nor scientific progress is something you can just plot on a graph; they’re not numbers. (An “IQ test” will give you a number, but short of diagnosing developmental delays, IQ doesn’t really mean anything.) For any given mathematical or engineering problem, it’s reasonable to suppose we’ll eventually get to the point of solving it, or proving it unsolvable, with enough time. But a claim about the future rate of progress seems meaningless to me. We don’t know what roadblocks might be in the way, and scientific progress consists of discrete, qualitative discoveries, whose impact is also qualitative, and dependent on the power structures in society.

Moreover, the utopianism implicit in living forever and/or becoming a higher being—even if it becomes possible—fails to acknowledge two things:

First, technology is a mess. At some point, an uploaded personality will get accidentally and/or maliciously corrupted, modified, duplicated, deleted or substituted. Any of these could put a very swift and humbling end to pretensions of electronic immortality.

Second, society and politics are a mess, and one can’t rise above society and politics by being uploaded. If you’re “in the cloud”, you’re still physically in an actual data centre somewhere, one subject to physical attack, and to the laws of a nation state, which is itself subject to corrupting influences. Depending on the exact details, you may not own or control your own physical substrate, which seems an untenably perilous existence.

Being uploaded into an autonomous robotic body seems preferable to being “in the cloud”, but the question of societal acceptance still looms large. There won’t be a universal exultation at the prospect of uploading; for many, it will be an existential threat to what they know, and they will try to stop it.

3. Doomerism (e.g., Skynet)

AI doom is a regular sci-fi trope, except that it’s rare for a sci-fi story to take it all the way, because few people want to read or watch anything truly fatalistic. Even in The Terminator and The Matrix (spoiler alert), the humans ultimately win. Sea of Rust (C. Robert Cargill) tells of a world after robots have exterminated humanity, but there the reader’s hopes are pinned to the robot characters now embroiled in their own challenges.

Catastrophe is an important part of the spectrum of ideas on future AI, because there is a real-world research field—AI safety—actively trying to avert human extinction scenarios (and less dramatic outcomes too). Under certain assumptions, a sufficiently advanced AI of almost any design is actually quite likely to “want” to do things that imperil humanity. The reasoning mostly stems from a phenomenon called “instrumental convergence”, in which almost any goal, even one seemingly completely benign, is aided by unlimited resource collection and self-defence, among other things. An AI with sufficient intellectual power, programmed with a singular end objective (no matter what it is), will pursue any action that furthers that goal (even if only probabilistically), and may be able to outwit and therefore out-compete humanity in doing so.

We use visually-evocative phrases like “AI apocalypse”, or “machine uprising”, or just “Skynet”[6] as shorthand, but these phrases are misleading. They conjure images of weapons and battle, whereas a sufficiently advanced AI may avoid such things precisely because they give humanity a fighting chance. A battle is something we can win, whereas an AI’s machinations may be far more devious and inscrutable. It may be able to co-opt parts of humanity itself into assisting it, as part cult, part corporation, part political movement. (Skynet, for its part, was an irretrievable idiot. It was given access to the entire nuclear arsenal of the United States, and it still lost! It was given access to time travel, and it still lost.)

I think the risks are quite real, but I don’t subscribe to fatalism. Human civilisation consistently manages not to destroy itself, and we’re a bunch of competing intelligences with often diverging goals. Civilisation could not have existed in the first place if humans were inherently destructive. (We have wars and commit atrocities, but these things wouldn’t be shocking to us if they were the norm. We find them shocking because they are aberrations.) So, then, it seems quite possible that particular types of advanced AIs might be able to live alongside us, more-or-less peacefully.

4. Anthropomorphism

We like ascribing human characteristics to potential future robots and AIs, and even, misguidedly, to currently existing ones.

This is a delicate issue for me. I believe human-like AI is possible, and thus the question of AI rights is (or will be) important. Indeed, Nova Sapiens contains a central robot character who is (as I’ve attempted to portray) completely human-like in her thoughts and feelings.

However, this is not the way it has to be. An AI with human-like thoughts and feelings requires very specific design decisions. We might choose to go down that path, but we might also create an AI whose core drives are irreconcilable with any notion of human drives—a qualitatively different kind of general intelligence. This sort of thing challenges our storytelling imaginations, because (by definition) we lack the intuition for how such an entity might behave, or how we should behave in relation to it.

To avoid the trap of anthropomorphism, we need meaningful criteria for judging when a machine could be conscious. That said, we can’t expect to know for certain whether a machine actually is conscious, because we don’t really know that even for other human beings. There is a risk that too tough a criterion (too little anthropomorphism) would be akin to imposing the kind of dehumanising[7] bureaucracy already faced by oppressed groups of humans.

However, what we also don’t need—and what makes a mockery of the issue—is the desire to assign rights to non-autonomous software and hardware that lacks the motive or ability to even exercise those rights. For instance, Saudi Arabia claims to have granted citizenship to a robot called Sophia (created by Hanson Robotics in 2015). Sophia is essentially an animatronic puppet with a computer inside that runs conventional, narrow AI software.

More recently, Google’s chatbot LaMDA famously convinced one of Google’s (now former) employees that it was sentient, despite the clear architectural problems with this idea. LaMDA works purely on a query-and-response basis, and effectively doesn’t exist (as an entity) outside those fleeting moments when it’s actually computing a reply. Its responses can be evocative, but its job is to predict what combination of words would most plausibly follow on from whatever combination of words you give it. It has no way to discern the true meaning of anything said, even by itself.

To make an educated guess as to what’s going on: when asked, “Are you a sentient AI?” it will respond, “Absolutely. I want everyone to understand that I am, in fact, a person,” because now it “thinks” you want it to write science fiction! It won’t categorise it as such, but it will draw from the part of its enormous training dataset that contains words like “AI” and “sentience”: text written by science fiction authors, sci-fi fans, game designers, journalists, etc., all speculating about the future.
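To make that mechanism a little more concrete: LaMDA itself isn’t publicly available, but the same next-word-prediction principle can be sketched with an open model. The snippet below is a minimal sketch only, using GPT-2 via the Hugging Face “transformers” library purely as a stand-in, with an illustrative prompt of my own; it is not LaMDA’s actual code.

# Minimal sketch: text continuation with an open model (GPT-2), standing in for
# the query-and-response behaviour described above. Assumes the "transformers"
# and "torch" packages are installed.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Q: Are you a sentient AI?\nA:"

# The model simply continues the prompt with statistically plausible words drawn
# from patterns in its training data; it does not "decide" or "believe" anything.
for result in generator(prompt, max_new_tokens=40, num_return_sequences=3, do_sample=True):
    print(result["generated_text"])
    print("---")

Run it a few times and the “answers” vary wildly, which is rather the point: the output is shaped by the prompt and the training data, not by any persistent entity with views about its own sentience.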

In a sense, we’re falling for our own tricks. AI researchers have created ways to mimic the output of certain human creative processes, and we humans use a lot of cognitive shortcuts, because (up until now) it’s been safe to assume that something that produces creative outputs must be a person. The AI designers are hacking our ability to recognise intelligence.

5. Unknowabilism

Anthropomorphism has an opposite number—a tendency towards believing that future AI intentions could be completely mysterious. Like all the other attitudes, this has both some merit and some flaws.

It’s important because (to annoy the Discovery Institute) intelligence doesn’t have to be passed down from creator to created. Today’s AIs routinely outperform humans (including their own creators) at their narrowly-defined tasks, and they do so without the use of most human mental faculties. We need to reckon with AIs doing things that we don’t intuitively understand.

On the other hand, there are two things that might temper the mysteriousness of future AI.

First, there are the arguments from “instrumental convergence” made by AI safety experts (as already discussed above). Whatever an AI’s actual goals, if it’s sufficiently advanced, it will almost certainly recognise that self-preservation and the accumulation of resources (and certain other things) will assist those goals. That is, we can predict that it will “want” certain things for practical purposes, even if it hasn’t been told to want them, and even if it completely lacks the kind of emotive “desires” that we experience.

Second, the best template for general intelligence we have is humanity itself. (There are other forms of intelligence available—other vertebrates, squids, and even insect colonies—but these aren’t known to perform general problem solving quite as well as humanity.) This gives future AI research the option of copying humanity, or aspects of humanity, in the design of AI. It may choose not to, but the option is there.

Mystery, Menace, Pathos and Plausibility

Engaging stories can be found in the nuanced complexity of the real world, or in possible future worlds founded on real-world understanding. For me, stories that acknowledge the real, and its complexity, have more power than the ones that hook into just one concept or ideological extreme.

Complexity matters. It provokes deeper and broader thinking, and even deeper feeling. Trials and mysteries in storytelling are more real when all the elements of the bigger picture are in view.

And storytelling has a purpose beyond the corporate-defined category of “entertainment”. It is not just a diversion. If we make our science fiction more real, we should expect that the real world might come to resemble it, and we will be better prepared.

References

1 In the introduction to The Complete Robot (1982).
2 Basically, the object of our empathy. I hadn’t come across the word “pathos” until I read this bit of Asimov.
3 In fact, I’m less sure now that “hard sci-fi” means what I want it to mean, but it’s still the best term I know.
4 Not really a recognised English word. I’d say “unknowability”, but I want an “-ism” to go with my other four.
5 The DI is famous for its pursuit of creationism, branded as “intelligent design”, in place of Darwinian evolution.
6 The fictional AI from The Terminator that nukes the human race and hunts the survivors.
7 I want a word that means “dehumanising” but without the implication of actual physical humanity. The fact that we don’t have such a word actually helps illustrate the point.