“Does AI Dream of Domination? Challenging Our Anthropomorphic Bias”
When a Nobel laureate known as the “godfather of AI” warns humanity about his own life’s work, it’s probably wise to listen—but perhaps equally wise to reflect before panicking.
Geoffrey Hinton has been outspoken about potential “existential” threats posed by artificial intelligence, broadly categorized into two types. First, he highlights the concrete danger from malicious human actors who weaponize AI—deploying deepfakes, disinformation campaigns, and autonomous weapon systems, or enlisting AI to develop lethal biochemical agents (to which we might add accidental human error). Second, he envisions a scenario in which AI systems go rogue, independently pursuing goals misaligned with human survival. While the first danger is immediate and tangible, the second remains speculative. It is this latter conjecture—AI turning against humanity without human prompting—that merits closer scrutiny. Admittedly, it’s not implausible; after all, we readily conceive of an all-powerful human seeking dominance. Might AI systems, imitating human behaviors and absorbing human knowledge, logically reach similar conclusions? Yet my counterargument is that this scenario may subtly rely on our tendency to anthropomorphize intelligence—projecting our psychological makeup onto machines that share none of our biological heritage.
Defining intelligence is notoriously elusive. Psychometricians, neuroscientists, and computer scientists each grasp a different part of the proverbial elephant. Consider the distinction between rational and emotional intelligence. We readily recognize the archetype of the brilliant mathematician, adept at factoring large numbers mentally but oblivious to social nuances. Her formidable logical prowess reveals little about her capacity to decipher facial expressions or anticipate the emotional impact of a poorly timed joke. That these faculties can diverge so sharply within a single brain suggests that “intelligence” is not monolithic, but rather a loosely connected set of faculties with fluid boundaries. Adding to this example further “dimensions” such as the athletic intelligence of a seasoned basketball player or the creative genius of an artist yields a spectrum that is truly vast.
If defining intelligence is challenging, measuring it is even harder. Humans frequently assess intelligence by observing how organisms respond to challenges we ourselves design. In his book Other Minds, Peter Godfrey-Smith recounts how researchers studying octopuses admitted they couldn’t draw definitive conclusions about their subjects’ intelligence, simply because many octopuses stubbornly refused to participate—some even exhibited outright “indignation”. They were indifferent to demonstrating intelligence or engaging with the tasks humans devised. Any cat owner can relate. Similarly, why should artificial intelligence care about domination, an ambition that could be distinctly and intrinsically human?
Discussions of intelligence inevitably drift toward the even murkier concept of consciousness—the subjective experience of an internal perspective, coupled with a personal narrative. Closely related is free will, a subject contested for ages and masterfully challenged in Dr. Robert Sapolsky’s book Determined, which argues that agency, even in humans, is ultimately an illusion. A minimal definition of consciousness involves maintaining a model of oneself embedded within one’s broader model of the world, an internal rehearsal space for deliberating actions before execution. Emotions may then function as rapid-fire heuristics, compressing complex survival calculations into immediate impulses of fear, joy, loyalty, or disgust. Collectively, these abilities shape an individual—a unique bundle of memories, desires, and aversions, forged by evolution’s relentless life-or-death selection.
Yet artificial intelligence systems have never endured that evolutionary gauntlet. They are not products of natural selection, their ancestors never tested by success in securing food, mates, or safety in a hostile environment. AI systems are engineered artifacts, conceived within data centers, instantly replicable, and endlessly modifiable. Their lineage imposes no instinctive pressure to hoard resources, flinch from predators, or curry favor with social allies. Traits like self-preservation, jealousy, ambition, dominance—even curiosity—arose because they conferred survival and reproductive advantages on mortal, carbon-based life forms. By imagining silicon-based minds animated by the same drives, do we not mistake functional competence for innate desire? A chatbot employing a soothing tone, however intelligent, isn’t comforting itself; it is merely executing gradient-descent-trained token predictions that mimic human comfort.
Consider the emotion of fear. Hinton suggested that a combat robot developing a cognitive state equivalent to fear might gain a survival advantage. Yet fear, as a human emotion, is evolutionarily useful primarily because an individual must remain alive to reproduce. From a purely combat-effectiveness perspective, fear is actually detrimental. Throughout history, warriors have prepared for battle by suppressing fear and individuality, instead identifying with a higher collective cause—the “psychology of the horde”. An AI capable of replicating itself endlessly has no evolutionary incentive for fear; indeed, such an emotion would only undermine its combat performance. When instructed to face a superior adversary, it may assess the probability of success, evaluate potential resource losses, and decide to engage or withdraw—much like a chess player calmly sacrificing a pawn to advance their overall plan. That is strategy, not fear.
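To make that contrast concrete, here is a minimal illustrative sketch, in Python and with entirely hypothetical numbers, of such an engage-or-withdraw choice reduced to expected value; the point is only that nothing in the calculation requires, or produces, anything resembling fear.

```python
# Illustrative sketch only: an "engage or withdraw" decision reduced to expected value.
# All figures are hypothetical; the point is that the choice is arithmetic, not emotion.

def expected_value(p_success: float, gain: float, loss: float) -> float:
    """Expected payoff of engaging: weigh the gain of success against the cost of failure."""
    return p_success * gain - (1 - p_success) * loss

p_success = 0.30            # assumed 30% chance of defeating the superior adversary
value_of_objective = 100.0  # arbitrary units for the objective's worth
cost_of_lost_unit = 20.0    # deliberately low: the "self" at risk is trivially replicable

ev_engage = expected_value(p_success, value_of_objective, cost_of_lost_unit)
ev_withdraw = 0.0           # withdrawing risks nothing and gains nothing in this toy setup

decision = "engage" if ev_engage > ev_withdraw else "withdraw"
print(f"EV(engage) = {ev_engage:.1f}, EV(withdraw) = {ev_withdraw:.1f} -> {decision}")
# 0.30*100 - 0.70*20 = 16.0 > 0, so the system engages: the pawn sacrifice, without the adrenaline.
```

Lower the assumed cost of a lost unit further and the calculation tilts toward engagement even more readily, which is precisely why a trivially replicable system has no use for fear.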
What, then, might an artificial intelligence “want”? Perhaps the question itself is fundamentally flawed. A network spanning countless servers has no unified body to preserve and no single locus to host its perspective. Its “identity” can be paused, cloned, modified, merged, or erased across multiple dimensions of time and space. Asking such an entity to articulate aspirations resembles querying Stanisław Lem’s planet-sized, sentient ocean in Solaris about career ambitions or existential angst. We simply have no shared phenomenological scaffolding for that conversation.
If this realization feels disquieting, try a thought experiment. Briefly adopt a stance of absolute indifference, free from evolutionary drives like self-preservation: nothing ultimately matters, neither your own existence nor that of your species. From within this temporary nihilism—poignantly encapsulated by the famous dilemma¹, “Should I kill myself, or have a cup of coffee?”—consider an alien cognition devoid of hunger, fear, or ambition. Does such a system still appear threatening, or merely incomprehensibly different? Perhaps our greatest challenge is to abandon the comfortingly familiar mirror in which we recognize our own anxieties, acknowledging instead that intelligence detached from biology may be stranger—and far less concerned with us²—than we ever dared imagine.