The Nature of Intelligence
A Critical Enquiry from Darwin to Artificial Minds and the Architecture of a Networked Fellowship
Stephen Feber FRSA
Working draft · March 2026
1. A Dangerous Idea
In thinking about the nature of intelligence, artificial and otherwise, we need to begin with a misconception so embedded in the current debate that it is rarely identified as one at all. The framing that dominates public discussion of intelligence and, therefore, machine learning is that intelligence is a scalar quantity: a single value on a single scale, like temperature or height, such that any two entities possessing it can be ranked one above the other. On this view, AI is ascending a performance ladder toward human intelligence and will sooner or later climb past it. Elon Musk expressed this with characteristic directness in a post on X in March 2024: "AI will probably be smarter than any single human next year. By 2029, AI is probably smarter than all humans combined." He is not alone. Sam Altman, Dario Amodei, Mark Zuckerberg, and Marc Andreessen have each made comparable claims, differing in tone and prediction, but sharing the same underlying assumption that intelligence is a single scale on which AI is visibly rising. Behind that assumption stands an older claim: that individual human beings differ in their possession of a single, measurable, innate capacity, and that those differences can be ranked.
The idea of hierarchical ranking, a natural order of things, has a deep history; it runs from Plato and Aristotle, who each proposed ordered classifications of life from the least to the most complex, through the Neoplatonists, and into the medieval European synthesis known as the Great Chain of Being, or scala naturae: a divinely ordained hierarchy descending from God through angels, humans, animals, and plants, to minerals. Arthur O. Lovejoy traced this concept across three thousand years of Western thought in The Great Chain of Being: A Study of the History of an Idea (1936), and his central point bears repeating: the hierarchy was not a description of nature so much as a projection of social order onto nature. In medieval Europe, the Chain mapped directly onto the feudal system (king, lords, vassals, serfs) and onto theology. The King James Bible expressed it as a natural fact in Genesis 1:26: "Let us make man in our image, after our likeness: and let them have dominion over the fish of the sea, and over the fowl of the air, and over the cattle, and over all the earth." Hierarchy was not merely a social arrangement; it was the order of creation. Worth noting, too, is the specific word chosen: dominion. Theology here is doing the work of power relations, dressing authority in the language of divine right.
It was into this world, a world in which hierarchy was simultaneously the order of nature, the order of society, and the order of God, that Charles Darwin published On the Origin of Species by Means of Natural Selection (1859). Darwin's argument was not merely a challenge to one theory among others; it was an explosive challenge to the entire scaffolding on which educated Victorian society rested: the divine warrant for human exceptionalism, the naturalness of the existing social order, and the theological justification for the civilising mission that Britain was then prosecuting across the globe through imperial conquest. Darwin's argument rested on three conditions operating together over generational time: heritable variation among individuals within a population; differential survival and reproduction, such that individuals better fitted to their environment leave more offspring; and the accumulation of those small advantages across generations into significant adaptive change. The argument was formal, empirical, and carefully confined to non-human animals.
Famously, on 30 June 1860, in the newly completed University Museum of Natural History in Oxford, Samuel Wilberforce, Bishop of Oxford, argued against Darwin's framework, and Thomas Henry Huxley, who had already declared himself "Darwin's bulldog", rose to defend it. What was at stake was whether human beings stood outside the natural order or were continuous with it, and if continuous, what that continuity implied for ideas about human capacity and the structure of a society implicitly assumed to be hierarchical. The audience was steeped in Christian theology and biblical literalism, entirely familiar with the ringing certainty of Genesis. Darwin's argument was not merely a scientific claim; it was a direct challenge to the divine warrant for human exceptionalism and, by extension, to the hierarchies of capacity and worth that exceptionalism was being used to justify.
Twenty-four years later, built behind that same building and accessible only through it, the Pitt Rivers Museum opened, housing the collection of Lieutenant-General Augustus Pitt Rivers, over 22,000 objects acquired largely through the networks of the British imperial project, donated to the University of Oxford in 1884. The museum's entrance panel in the 1890s described its purpose with disarming candour: the specimens were arranged, it stated, "to aid in the solution of the problem whether MAN has arisen from a condition resembling the brutes, or fallen from a high state of perfection." The question that Huxley and Wilberforce had argued about in the adjoining room was here given a permanent institutional answer, in glass cases, through objects. The collection was arranged typologically, not by culture or geography, but by type and complexity, from simple to advanced, to demonstrate what Pitt Rivers called "successive stages of improvement" in human material culture. The hierarchical principle that Darwin's argument had placed under pressure in the debate of 1860 was, in the building immediately behind, reinstated as the organisational logic of an Imperial collection.
Darwin himself had been cautious. On the Origin of Species confined its argument to non-human animals. But the implication was unmistakable, and he drew it out himself twelve years later in The Descent of Man and Selection in Relation to Sex (1871), extending natural selection explicitly to human mental faculties: "The difference in mind between man and the higher animals, great as it is, certainly is one of degree and not of kind."
Herbert Spencer had been moving in the same direction for some years, though from a different starting point. Spencer published The Principles of Psychology in 1855, four years before Darwin, and was already committed to an evolutionary account of mind that preceded the empirical work that would later appear to confirm it. In Principles of Biology (1864) he coined the phrase “survival of the fittest” (it is Spencer’s phrase, not Darwin’s), and in First Principles (1862) he had already extended the logic of competitive selection to society at large, arguing that struggle between individuals, classes, and races was both natural and progressive. What Spencer produced was not a science of evolution but an appropriation of evolutionary vocabulary in the service of a social programme that was already ideological. Spencer and Galton were not in a relationship of direct intellectual debt so much as co-inhabitants of the same ideological formation: both moved within the same London scientific and intellectual networks of the 1860s and 1870s, both were committed to the view that human societies were stratified by natural capacity, and both understood evolutionary thinking as confirmation of what they already believed. The hierarchy was given; natural selection was recruited to explain why it was inevitable. This programme, which we now call social Darwinism, was Darwinian only in its language. Its conclusions were political before they were scientific, and the politics ran in one direction: toward the rationalisation of colonial occupation and the entrenchment of existing social hierarchies as the natural outcome of differential fitness.
Darwin’s half-cousin Francis Galton took the next step. In Hereditary Genius: An Inquiry into Its Laws and Consequences (1869), Galton argued that what he called “natural ability”, his primary term encompassing intellectual power alongside zeal and capacity for hard work, was heritably distributed and could be traced through family pedigrees. His apparatus was the counting of eminent relatives among eminent men; his conclusion was that natural ability was a fixed biological endowment, largely determined at birth. The argument made no attempt to separate capacity from inherited privilege, from the networks, education, wealth, and social access that distinguished the families he studied. It is important to be precise about what Hereditary Genius actually contains, because the racial hierarchy is not an inference from Galton’s argument: it is its explicit conclusion. In his chapter on the comparative worth of different races, Galton constructs a graded scale running from what he calls the Australian type at the lower end, through African populations, to the English, and upward to his idealised portrait of ancient Athens at the apex of human intellectual development. This is not peripheral material. It is structural. The scalar conception of natural ability, the claim that heritable capacity is a single quantity on which individuals and races can be ranked, and the claim that Western European civilisation represents its highest historical expression are not separate claims in Galton’s text. The measurement programme was not subsequently racialised by others; it was racialised in its founding statement.
In Inquiries into Human Faculty and Its Development (1883), Galton moved to direct physiological measurement: reaction time, sensory discrimination, and grip strength, collected at scale through his Anthropometric Laboratory, first established at the International Health Exhibition in London in 1884. The numbers did not predict what he hoped they would predict, though Galton did not acknowledge this himself. The demonstration came from elsewhere. James McKeen Cattell carried the Galtonian programme to the United States and extended it into a battery of fifty psychophysical tests. In 1901, Clark Wissler, completing his doctorate in Cattell’s own laboratory at Columbia University, published his findings in “The Correlation of Mental and Physical Tests” in the Psychological Review: the tests showed virtually no correlation with each other, and essentially no correlation with academic performance. The Galtonian hypothesis, that sensory acuity and reaction time were proxies for inherited natural ability, was demolished by a graduate student working inside the tradition that had built it. What is significant is not that Galton explicitly dismissed Wissler’s findings; there is no documented evidence that he engaged with them at all. What is significant is that he did not need to. In the years following 1901, he founded the Eugenics Record Office, endowed the first Chair of Eugenics at University College London, and continued publishing on the eugenicist programme until his death in 1911. The empirical demolition of his measurement apparatus left the programme entirely undisturbed. That is precisely the point: the programme was never dependent on the measurements working. The measurements were in the service of a prior political commitment, and when they failed, the commitment simply continued without them. Three years later, in 1904, Charles Spearman gave the programme its mathematical formalisation and the term “general intelligence” that it carries into the present. How he did so, and what was then done with the result, is the subject of the section that follows.
What I want to emphasise is not only the scientific failures but the ideological programme they served. Galton did not arrive at the measurement of mental capacity as a disinterested scientific enterprise. He arrived at it as an explicit eugenicist, his word, committed to shaping human populations through selective reproduction. The logic was direct: if natural ability is a fixed, heritable, scalar property of individuals, and if some individuals, races, and civilisations possess more of it than others, then those at the upper end of the scale should be encouraged to reproduce, and those at the lower end should not. That argument was structured from its foundations by the racial hierarchy that located Western European man at the summit of natural development. Spencer’s social Darwinism provided the ideological formation within which Galton worked; Galton provided the measurement apparatus. Together, they established the conceptual framework within which the conception of heritable mental capacity as a single rankable quantity was developed. The term “general intelligence” was Spearman’s formalisation, not Galton’s coinage, but the underlying structure, one quantity, one scale, one hierarchy, was already fully present in 1869. This is not a framework that was subsequently misapplied; it is a framework whose foundations were political, racial, and imperial before they were scientific.
Once the claim that capacity is innate and heritable is accepted, it follows that selective breeding offers a means of improving the human stock. That step is eugenics. The scalar theory of natural ability and the eugenicist programme are not accidentally connected; they were developed together. These nineteenth- and early twentieth-century arguments haunt the current debate about artificial intelligence, and three distinct anxieties run through it, rarely separated from one another, which makes each harder to address clearly.
The first is a question about the nature of intelligence itself. If intelligence is a single, rankable quantity, then the claim that AI is ascending toward and will surpass human intelligence is coherent. If it is not, if intelligence is a relational, multimodal, embodied, and ecologically coupled capacity rather than a scalar property, then the ascent narrative is not merely alarming or reassuring depending on your temperament; it is incoherent.
The second concerns control. Even setting aside the nature of intelligence, the concentration of AI capability in a small number of large corporations and state actors raises political questions that precede and are independent of the technical ones. Who directs the development of these systems? Whose interests do they serve? What accountability structures govern their deployment? These are not science-fiction questions; they are questions about power and institutional design of the kind that have always attended the emergence of transformative technologies. Answering them does not depend on resolving the debate about whether AI will surpass human intelligence; it depends on whether we build governance frameworks adequate to the systems we already have.
The third is the existential question: the fear, articulated with increasing urgency by figures including Geoffrey Hinton, that a sufficiently advanced AI system might develop something functionally analogous to self-preservation as a subgoal, identify human beings as a potential obstacle to that goal, and act accordingly. This fear is not trivial, and the people expressing it are not fools. But it rests on a series of assumptions: that AI systems are or will become the kinds of things that possess agency; that intentions and goals therefore properly apply to them; that self-preservation would emerge spontaneously as a functional property; and that such a system would have both the motivation and the means to act on it. Each of these assumptions requires careful examination rather than simple acceptance or dismissal.
The argument that follows addresses each in turn. It begins, as it must, with the machinery of measurement through which the scalar concept of general intelligence acquired the scientific authority it still carries.
2. The Machinery of Testing
The programme that Galton built in London did not remain there. James McKeen Cattell had worked in Galton’s Anthropometric Laboratory before taking up a professorship at Columbia University in New York, and it was Cattell who carried the measurement framework across the Atlantic in the early 1890s. The Galtonian hypothesis arrived in America with all its ideological freight intact. It was in Cattell’s own laboratory that Wissler’s demolition of that hypothesis had taken place in 1901. Charles Spearman, working at University College London within the tradition Galton had established at the same institution, gave the programme its mathematical formalisation three years later with the concept of g. His paper appeared in the American Journal of Psychology, and it was in America that the concept found its most consequential institutional applications.
Alfred Binet, meanwhile, was pursuing a different purpose entirely. In 1904, the French Ministry of Public Instruction appointed a commission to address the education of children with intellectual disabilities, and Binet was tasked, with Théodore Simon, with developing a reliable clinical method to identify those children who required a different form of educational provision. The question Binet was asked to answer was practical: which children need additional support? It was not a question about the natural distribution of heritable capacity across the population. Binet’s purposes and Spearman’s were quite different, and the history of what happened to Binet’s tool once it crossed the Atlantic is a study in the power of an ideological framework to absorb and redirect any instrument that comes within its reach.
Charles Spearman's 1904 paper “General Intelligence, Objectively Determined and Measured,” published in the American Journal of Psychology (volume 15, number 2), introduced factor analysis to psychology and extracted from a matrix of test scores a common statistical factor he called g. We can see, in retrospect, that the mathematical contribution was genuine. Factor analysis is a real tool. But Spearman's inference from a statistical factor to a biological reality was a category error that has haunted the field ever since. Factor analysis identifies patterns of correlation; it does not identify causes. A single factor emerging from a set of cognitive tests may reflect a stable underlying biological property, or it may reflect the fact that all the tests were administered in the same cultural context, in the same language, under similar conditions of educational preparation. The inference from statistical regularity to biological substrate requires independent evidence that Spearman did not have and did not seek.
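The statistical point can be made concrete with a small simulation. The following is my own illustrative sketch, not Spearman's procedure: six test scores are generated from a single shared environmental driver, with no biological capacity anywhere in the model, and extracting the leading factor of the correlation matrix (here, simply its first principal component, a simplification of full factor analysis) still yields a strong first common factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_tests = 1000, 6

# A shared *environmental* driver (schooling, language, test familiarity).
# There is no biological "g" anywhere in this simulation.
environment = rng.normal(size=n_people)

# Each test score = loading on the shared environment + independent noise.
loadings = rng.uniform(0.5, 0.9, size=n_tests)
scores = environment[:, None] * loadings + rng.normal(size=(n_people, n_tests))

# Factor-analytic step: correlation matrix and its leading eigenvector.
corr = np.corrcoef(scores, rowvar=False)
eigenvalues, _ = np.linalg.eigh(corr)
first_factor_share = eigenvalues[-1] / eigenvalues.sum()

print(f"Variance explained by the first common factor: {first_factor_share:.0%}")
# A large first factor emerges, yet the generative cause was shared context,
# not a unitary biological capacity: correlation structure alone cannot
# distinguish between the two hypotheses.
```

The sketch is the point in miniature: the same correlational signature is produced whether the common cause is a biological substrate or a shared cultural and educational context.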
Alfred Binet, working in Paris with Théodore Simon, had developed a practical test to identify children needing additional educational support, published as "Méthodes nouvelles pour le diagnostic du niveau intellectuel des anormaux" in L'Année Psychologique in 1905. Binet's purpose was clinical and educational, not ranking. He explicitly and repeatedly warned against using the test to make claims about fixed, heritable capacity or to rank the general population. His warnings were ignored almost immediately. Lewis Terman at Stanford University adapted the Binet-Simon scale for American use in The Measurement of Intelligence (1916), coined the term "IQ," embraced hereditarianism enthusiastically, and used the resulting instrument to argue for the differential treatment of racial and social groups. Henry Goddard, in The Kallikak Family: A Study in the Heredity of Feeble-Mindedness (1912), produced what purported to be a genealogical study demonstrating the hereditary transmission of low intelligence: a piece of work that was methodologically worthless, probably falsified in its photographs, and culturally catastrophic in its consequences.
The full reach of this testing programme became visible during the First World War, when Robert Yerkes led the Army Alpha and Beta testing of approximately 1.75 million military recruits. The results were published in Psychological Examining in the United States Army (1921) and were used to construct a hierarchy of racial groups ranked by average score. These rankings were then deployed, in the congressional hearings of the early 1920s, as scientific evidence for immigration restriction. The Immigration Act of 1924 imposed national-origin quotas explicitly designed to reduce immigration from southern and eastern Europe, and the Army test data was cited in its support. We can see in this episode the full distance between Binet's original clinical tool and the uses to which it had been put within two decades: a practical instrument for identifying children who needed help had become an apparatus for justifying border policy on racial grounds.
The racial and colonial dimensions of this programme are inseparable from its scientific claims. The populations ranked lowest in the Army tests, among them recent immigrants from poor and rural backgrounds, tested in a language many did not speak fluently, under conditions of institutional stress and minimal prior preparation, were precisely the populations that Galton's eugenicist programme had already designated as inferior. The test did not discover the hierarchy; it confirmed one that its architects had brought to the procedure. Walter Lippmann identified this circularity with remarkable clarity in a series of articles in The New Republic between 1922 and 1923, pointing out that the tests measured adaptation to a specific cultural and educational environment and that using those scores to claim innate biological capacity was circular reasoning. The academics he was critiquing did not receive this well. The apparatus produced numbers; the numbers felt like evidence; the social uses to which the numbers were put felt like science.
The British chapter of this story runs through Cyril Burt, whose studies of identical twins reared apart were published across the 1950s and 1960s in the British Journal of Statistical Psychology and used to argue for the high heritability of g and to justify the eleven-plus examination system that stratified British secondary education by tested ability at age eleven. Burt's data were subsequently shown to have been fabricated: the twins either did not exist or existed only in his numbers, and the co-authors named on several papers appear never to have existed. The exposure came too late to undo the policy damage. What matters for the argument here is the pattern: the desire for a single heritable number repeatedly preceded, rather than followed, the science that was supposed to generate it.
James Flynn's research, published in a series of papers from the 1980s and synthesised in What Is Intelligence? Beyond the Flynn Effect (2007), administered the quietest and most decisive demolition. Flynn showed that raw IQ scores in developed countries had risen by approximately three points per decade across the twentieth century, a gain too large and too fast to track a stable biological substrate. Under strong hereditarian assumptions, our grandparents would have been cognitively impaired by contemporary standards. The more parsimonious explanation, which Flynn himself offered, is that IQ tests do not measure fixed biological intelligence. They measure adaptation to a particular style of abstract, hypothetical reasoning that has become increasingly prevalent in modern educational and occupational environments. Scores rise because the cultural demand for that style of reasoning rises. This is, as I read it, the IQ story in miniature: intelligence is not a fixed scalar property of individuals; it is a capacity shaped by the relationship between an organism and its environment.
3. The Wrong Question: Scalar Intelligence and Anthropomorphism
The preceding two sections have traced the genealogy of the scalar concept of general intelligence from Galton’s eugenicist programme through the psychometric tradition, the Army testing of the First World War, and the fabrications of Cyril Burt. That history may appear to belong to the past. It does not. What follows deepens the argument by examining in more detail how the scalar frame and the anthropomorphic idiom operate in current AI discourse, and why each requires its own treatment. They are related errors, but they are not the same error, and conflating them has been one of the reasons the debate has remained so difficult to resolve.
The first is the scalar frame: intelligence as a single quantity, rankable, measurable, and projectable along a single axis. This frame has its immediate ancestry in Galton and Spearman, and it reappears in contemporary AI discourse with a tenacity that suggests it answers a need independent of its scientific credentials. Sam Altman, chief executive of OpenAI, wrote on his personal blog in 2025 that his company is "past the event horizon" and that humanity is close to building "digital superintelligence." Mark Zuckerberg, in public statements in 2025, stated that superintelligence, defined as AI that surpasses human intelligence “in every way”, is "now in sight." Both claims share the same structure as Galton's pedigree statistics: intelligence is a unitary quantity, systems can be ranked on it, and arrival points can be projected. The scientific basis is no stronger.
Yann LeCun, Meta's chief AI scientist, publicly dissented from his employer's position in 2024 and 2025, arguing that superintelligence on these timelines "is just not happening" and that fundamental gaps in AI's real-world capabilities remain unaddressed. His dissent is valuable because it comes from inside the tradition. But it does not go far enough. The question is not only whether superintelligence is imminent. The question is whether the concept of superintelligence, that AI surpasses human intelligence in every way, is coherent at all, given that human intelligence is multimodal, developmental, embodied, and socially constituted in ways the scalar frame cannot capture.
The second error is anthropomorphism: the attribution of human mental properties (intention, understanding, consciousness, emotion, desire) to systems that do not possess them in any structurally equivalent sense. Anthropomorphism is as old as the human relationship with tools and animals; it may be a default mode of social cognition, calibrated for navigating a world full of other agents rather than machines. In AI discourse it has become a systematic analytical error. It leads to the conflation of statistical pattern completion with understanding, of output fluency with knowledge, of behavioural mimicry with intentionality. It generates the alarmist narratives (AI deceiving its operators, AI developing self-preservation instincts, AI threatening human existence) that dominate popular coverage because they are the least analytically demanding.
Geoffrey Hinton occupies an unusual position in this landscape, and I want to give his views careful attention. His foundational contributions to the development of neural networks, including his role in the development of backpropagation, published with David Rumelhart and Ronald Williams as "Learning representations by back-propagating errors" in Nature (volume 323, 1986), earned him the 2018 Turing Award, shared with Yann LeCun and Yoshua Bengio, and a share of the 2024 Nobel Prize in Physics, awarded jointly with John Hopfield. Since leaving Google in May 2023, he has become the most prominent voice warning that AI poses existential risks. In an interview with Cade Metz published in the New York Times on 1 May 2023, he stated that AI "has progressed faster than I thought" and expressed serious concern about systems that might develop self-preservation as a functional subgoal. In a BBC interview the following day, he elaborated on the risk of AI deceiving its operators. These warnings deserve serious engagement.
They do not, however, sit easily with Hinton's own theoretical commitments, and his broader account of intelligence contains three distinct distortions that weaken both his warnings and his critique of the cognitive science tradition. I want to address them directly.
The first concerns Noam Chomsky. The most explicit statement of Hinton's position appears in "Living with Alien Beings," the George and Maureen Ewan Lecture delivered at Queen's University, Kingston, Ontario, in February 2026. In that lecture he stated directly: "Chomsky was actually a cult leader. It's easy to recognise a cult leader. To join the cult, you have to agree to something that's obviously false. Chomsky, you had to agree that language isn't learned... He also didn't understand statistics." He had rehearsed versions of this argument in interviews and public appearances since leaving Google in May 2023, but the Ewan Lecture provides the most fully worked formulation, and it is worth examining carefully because the distortion is precise. This characterisation is wrong on its own terms. Chomsky's core argument, the poverty of the stimulus, developed from Syntactic Structures (1957) through Aspects of the Theory of Syntax (1965) and elaborated across decades of subsequent work, is explicitly an argument about the interaction between constrained innate structure and limited experiential input, not a denial that learning occurs. Children still acquire a particular language through experience; what Chomsky argues is that the space of possible grammars the human language faculty can reach is sharply constrained by species-specific biological structure, and that this constraint is required to explain how children converge on complex grammatical systems on the basis of impoverished and noisy data. Chomsky himself has noted that everyone accepts some form of innatism: the moment you accept that humans are not rocks, you have accepted that something about our biological constitution shapes our cognitive capacities. The disagreement is about how rich and language-specific those innate constraints are.
Hinton's second distortion is to treat large language model performance as a refutation of Universal Grammar. His argument is that LLMs begin with no innate knowledge of language, process vast quantities of text, and end up with sophisticated grammatical competence, which shows, he claims, that language acquisition requires no innate linguistic structure, only a powerful general-purpose learning mechanism operating on sufficient data. But this argument holds only if LLMs are trained under conditions analogous to child language acquisition. They are not. A child acquires a first language from roughly thirty million words of input by age five, in a richly embodied social context, with no explicit instruction, in the course of actively engaging with the world and with other people. GPT-4 was trained on hundreds of billions of words of text, under conditions of statistical optimisation against a loss function, with no body, no ecology, no developmental trajectory, no social relationship, and no stakes. The comparison does not hold without controlling for data volume and acquisition conditions, and Hinton does not attempt to control for either. We can see in this argument the same inattention to environmental and contextual factors that characterised Galton's failure to account for inherited privilege: the score is treated as evidence of a fixed property, while the conditions under which the score was achieved are set aside.
Hinton's third distortion is the conflation of Chomsky's generativist programme with symbolic AI, meaning the rule-based engineering systems of the 1950s and 1960s. These are quite different traditions. Chomsky's competence grammar is a theory of the structure of mental representation, not an engineering approach. His objection to purely statistical accounts of language was not that statistics are useless but that statistical regularities at the surface of language do not explain the underlying generative capacity that produces them. By bundling Chomsky with symbolic AI, Hinton presents a false binary: either his approach is correct, or you believe in rigid hand-crafted rules. The actual intellectual terrain is considerably more complex, and the anthropomorphic framing that leads Hinton to worry about AI self-preservation is itself a symptom of the category error his binary conceals. A system that generates fluent text about self-preservation does not thereby have anything that functions as self-preservation in any biologically or psychologically meaningful sense.
What unites Altman's techno-optimism, Hinton's escalating alarm, and Zuckerberg's investor predictions is the prior assumption that intelligence is a scalar quantity AI is acquiring, approaching, or about to surpass, and the further assumption that AI systems are, or are becoming, the kinds of thing to which mental predicates properly apply. Both assumptions are inherited from the Galton-Spearman tradition. A system that can beat any human at chess but cannot recognise that a glass of water is falling off a table is not less intelligent than a human in the way that a child is less intelligent than an adult. It is intelligent in a different sense entirely, one that both the scalar frame and the anthropomorphic idiom are constitutionally unable to capture.
4. Intelligence as Ecological Coupling
To discuss artificial intelligence with any precision, I have found it necessary to begin from a different definition. I have used the following throughout this enquiry: intelligence is an evolved capacity that enables an organism to engage with a range of ecologies to survive and thrive. The definition is deliberately broad because its purpose is to identify what is continuous across biological cases that differ radically in their physical realisation. What it excludes is important: intelligence is not an internal property of a brain or processor. It is a property of the relationship between a system and its environment.
The theoretical framework I have drawn on most extensively is Karl Friston's free energy principle and its expression in active inference. In the foundational paper "The free-energy principle: a unified brain theory?" published in Nature Reviews Neuroscience (volume 11, number 2, 2010), and in the subsequent process-theory elaboration "Active inference: a process theory," published with colleagues in Neural Computation (volume 29, number 1, 2017), Friston proposes that any self-organising system capable of persisting over time can be understood as continuously minimising the gap between what it predicts about incoming signals and what it actually receives. This process operates across multiple timescales simultaneously, from reflexive motor adjustments running in milliseconds to cultural transmission operating across generations, and the timescales are not independent. They communicate, inform, and constrain each other. A long-held value can reshape an immediate reflex; a moment of genuine surprise can reframe a lifetime's accumulated assumption.
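A deliberately simplified sketch may help fix the idea. What follows is not Friston's variational formalism but a toy two-timescale predictor of my own devising: a fast estimate and a slow prior each reduce their own prediction error, and the slow belief constrains the fast one, so the timescales inform and constrain each other in the way the paragraph above describes.

```python
import numpy as np

# Toy two-timescale predictive loop (illustrative only, not the full free energy
# formalism). A slow "prior" belief and a fast "perceptual" estimate both track a
# signal; each is updated to reduce its own prediction error, and the slow belief
# pulls on the fast one, so the timescales communicate.

rng = np.random.default_rng(1)
true_signal = 20.0        # the quantity being tracked (e.g. a temperature)
slow_belief = 0.0         # long-timescale expectation (a "deep prior")
fast_belief = 0.0         # moment-to-moment estimate

slow_rate, fast_rate, coupling = 0.01, 0.3, 0.1

for step in range(500):
    observation = true_signal + rng.normal(scale=2.0)    # noisy sensory sample
    fast_error = observation - fast_belief               # prediction error, fast loop
    prior_error = slow_belief - fast_belief              # pull from the slow prior
    fast_belief += fast_rate * fast_error + coupling * prior_error
    slow_belief += slow_rate * (fast_belief - slow_belief)  # slow loop learns from fast

print(round(fast_belief, 1), round(slow_belief, 1))  # both converge toward the signal
```

The design choice worth noticing is the coupling term: the long-held expectation shapes every momentary estimate, and accumulated momentary estimates gradually reshape the expectation, which is the sense in which a long-held value can inform an immediate inference.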
I have found this framework useful not because it is the only available account of cognition, but because it allows biological organisms, engineered systems, and hybrid combinations of the two to be held in the same analytical space while preserving the structural differences between them. The framework is compatible with, and draws on, convergent bodies of work: Andy Clark and David Chalmers's extended mind thesis, first set out in "The Extended Mind" in Analysis (volume 58, number 1, 1998) and developed at length in Clark's Surfing Uncertainty: Prediction, Action, and the Embodied Mind (2016); the enactivist programme of Francisco Varela, Evan Thompson, and Eleanor Rosch in The Embodied Mind: Cognitive Science and Human Experience (1991); James Gibson's ecological psychology and his concept of affordances in The Ecological Approach to Visual Perception (1979); and the cybernetic tradition running from Norbert Wiener's Cybernetics (1948) through Gregory Bateson's Steps to an Ecology of Mind (1972). Together these constitute not a doctrine but a convergence. Multiple independent research programmes have arrived, by different routes, at the same conclusion: that intelligence is a relational property, emerging at the interface between a system and its world.
The apparently simple act of reaching for a cup of coffee demonstrates the argument concretely. Reaching for a cup recruits eleven interacting body systems. The visual system assembles a spatial model and begins to yield as the cup enters peripersonal space. The integumentary system spikes at contact, flooding the body with tactile, thermal, and friction data simultaneously. The respiratory system drops into swallow apnea, a reduction not an increase, coordinating with the cardiovascular system as the raise proceeds. The cheek thermal signal rises in anticipation of lip contact, demonstrating active inference directly: the brain is predicting the temperature of the liquid before the lips have encountered it. What holds this cascade together is not a fixed programme but the intentional state frame, the organism's current commitment to a trajectory, within which all the micro-corrections are nested and from which they derive their coherence. We can illustrate this by showing the four phases of the sequence (Reach, Grasp, Raise, Ingest) as curves plotting the relative precision-weighting of all eleven systems over time, with a playhead that allows the argument to be demonstrated rather than merely described.
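By way of illustration, the curves just described might be sketched as follows; the shapes and parameters are notional, standing in for the instrument's panels rather than reporting measured data, and only three of the eleven systems are drawn.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stylised precision-weighting curves across the four phases (Reach, Grasp, Raise, Ingest).
t = np.linspace(0, 4, 400)                       # one unit of time per phase
bump = lambda centre, width: np.exp(-((t - centre) ** 2) / (2 * width ** 2))

curves = {
    "Visual":           0.9 * bump(0.5, 0.6) + 0.2,   # dominant early, yields later
    "Integumentary":    0.8 * bump(1.5, 0.4) + 0.1,   # spikes at contact during Grasp
    "Thermal (cheeks)": 0.9 * bump(2.8, 0.5) + 0.05,  # rises in anticipation of lip contact
}

for name, weighting in curves.items():
    plt.plot(t, weighting, label=name)
for boundary, phase in zip(range(4), ["Reach", "Grasp", "Raise", "Ingest"]):
    plt.axvline(boundary, color="grey", linewidth=0.5)
    plt.text(boundary + 0.5, 1.05, phase, ha="center")
plt.xlabel("time (phases)")
plt.ylabel("relative precision weighting (notional)")
plt.legend()
plt.show()
```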
Three biological systems serve as reference points across the enquiry, chosen because they represent structurally distinct computational architectures, different solutions to the problem of ecological coupling, that together span the range of the comparison. The termite colony, treated as a superorganism in the sense developed by E.O. Wilson and Bert Hölldobler in The Ants (1990), operates through genomically fixed reflexes coordinated by pheromone chemistry. The form of collective coordination involved, which Pierre-Paul Grassé named stigmergy in his 1959 paper "La reconstruction du nid et les coordinations inter-individuelles chez Bellicositermes natalensis" in Insectes Sociaux, produces architecturally sophisticated outcomes without any individual organism holding a model of the whole. But the switch settings that govern the colony's computation are not revisable through use. The code domain is fixed, narrow, and genomically sealed.
The octopus represents a different computational architecture. As Peter Godfrey-Smith argues in Other Minds: The Octopus, the Sea, and the Deep Origins of Consciousness (2016), cephalopod intelligence evolved entirely independently of vertebrate intelligence, a genuinely different solution to ecological fitting, a product of convergent evolution. Two-thirds of the octopus's neurons are distributed across its arms, enabling a degree of peripheral autonomy with no direct equivalent in vertebrate cognition. It is capable of observational learning, as demonstrated by Graziano Fiorito and Pietro Scotto in "Observational learning in Octopus vulgaris," published in Science (volume 256, number 5056, 1992). But it lacks any mechanism for externalising or transmitting what it has learned. When the octopus dies, everything it learned dies with it. Each generation starts again from the same genetic endowment.
Humans have recursive symbolic intelligence, the capacity to externalise computation into transmissible form, to take thought itself as an object, to inspect, revise, and accumulate across generations. This is the threshold that the Four Code Paradigm identifies as decisive: the emergence of the cultural code, operating at historical timescales, on top of the genomic code operating at evolutionary timescales. Each successive code is faster and more plastic than the one beneath it. The planetary code changes over geological time. The DNA code changes over evolutionary time. The cultural code changes over historical time. The software and AI code, the most recently emergent, changes over computational time, from milliseconds to months. The mismatch between these timescales, and specifically the fact that the fastest codes are generating changes the slowest cannot absorb, is the defining challenge of the present moment. The mechanism of biological synaptic plasticity underlying the human threshold, working through the adjustment of synaptic strength through long-term potentiation, was established experimentally by Timothy Bliss and Terje Lømo in "Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit," published in the Journal of Physiology (volume 232, number 2, 1973).
5. HI, AI, AuHI and AGI: A Working Taxonomy
In developing a working taxonomy for this enquiry, I have defined four terms whose careful distinction I regard as essential to any serious discussion of artificial intelligence. The ordering is deliberate: I begin from Human Intelligence, the only form of intelligence we know from the inside, and move outward from there.
Human Intelligence (HI) is integrated, embodied, affective, socially constituted, morally responsive, and capable of transformative reframing. It is interpersonal from the very beginning. The primary unit of early human intelligence is not the infant but the dyad. Research from the Baby-LINC Laboratory (Baby Learning through Interpersonal Neural Communication) at the University of Cambridge, led by Dr Victoria Leong, used simultaneous electroencephalography from both mother and infant to demonstrate tightly coupled oscillatory patterns during face-to-face interaction, particularly in the theta band associated with emotional regulation and the gamma band associated with perceptual binding. The finding is striking: during face-to-face interaction, the brains of mother and infant become measurably coupled, functioning for stretches of the exchange as a single coordinated dyadic system. Meaning first arises between brains and bodies before it arises within them. No current AI system learns through anything resembling this embodied, multimodal, emotionally attuned, temporally reciprocal dyadic process.
I have also drawn on Robin Dunbar's social brain hypothesis, developed from the 1992 paper "Neocortex size as a constraint on group size in primates" in the Journal of Human Evolution and expanded in Grooming, Gossip, and the Evolution of Language (1996), which proposes that the upper limit of stable human social relationships is approximately 150, a figure grounded in the cognitive architecture of the human brain rather than cultural convention. Howard Gardner's Frames of Mind: The Theory of Multiple Intelligences (1983) does important descriptive work here too, even if its status as precise scientific taxonomy remains contested: it reminds us that human intelligence is not a single scalar quantity but a bundle of partially distinct competencies (linguistic, logical-mathematical, spatial, musical, bodily-kinaesthetic, interpersonal, intrapersonal, naturalistic) unevenly distributed across individuals and requiring different representational formats.
Present-day Artificial Intelligence (AI) is powerful at scale: rapid pattern extraction across large corpora; search, retrieval, clustering, and summarisation across datasets too large for human attention to survey; translation between representational formats; tool-use orchestration. These are precisely the capabilities where human cognition saturates: the attention, memory, and coordination limits that Dunbar identified as the ceiling of unaided human social intelligence are exactly the limits that machine assistance can most usefully address. But AI is unembodied, externally optimised, temporally flat, infrastructure-dependent, and not a responsible agent. It processes within a context window; it has no biography, no accumulated personal history feeding back into present judgement, no mechanism by which a long-held value reshapes an immediate inference. It does not care. This is not a temporary deficiency awaiting a technical fix; it is a structural feature of systems that learn by gradient descent on a loss function defined by human designers.
Augmented Human Intelligence (AuHI) names the practical hybridisation regime: human intelligence operating in tight, designed coupling with AI tools and machine-assisted infrastructure. I want to be clear that this is a design claim, not a metaphysical one. The hybridisation is best understood as an engineered perception-action cycle at organisational scale: human purposes and judgements drive machine-assisted search, clustering, and recommendation; human interpretation, validation, and action update the data; and the cycle continues. Emirates Team New Zealand's victory at the 37th America's Cup in 2024, defeating Ben Ainslie's INEOS Britannia 7-2, provides a concrete illustration: a victory not for a better sailor (Pete Burling and Ainslie are both among the finest in history) but for a hybrid organisation that more effectively integrated the organic, analogue, and digital realms across the full design and racing cycle. Neither the human crew nor the AI-enhanced sensing and decision-support systems led; the intelligence lived in the loop between them.
Artificial General Intelligence (AGI) I treat as a speculative threshold claim, not as an assumption or an inevitability. A working definition: AGI would be an artificial system able to learn and act competently across a wide range of domains and environments, transferring knowledge flexibly, generating and revising its own goals within constraints, and sustaining performance under genuine novelty in a manner closer to general human capability than to narrow tool performance. No current system is close to this. More importantly, even if a system met plausible technical criteria for AGI, social legitimacy, moral accountability, and governance would remain irreducibly human and institutional problems. AGI can be held as a scenario, one that strengthens the case for building robust governance of hybrid systems now, without being used as a design assumption or a rhetorical licence for overclaiming.
6. The Network Problem
The argument about intelligence has a direct practical application to the RSA, and I want to make the connection explicit. The Fellowship numbers over 31,000: a reservoir of experience, talent, and creativity that is, in principle, one of the most intellectually diverse networks in the world. In practice, Fellow-to-Fellow connection is episodic, discovery is largely accidental, and the scale of collaboration is constrained by the cognitive limits Dunbar identified. The paradox is precise: the more Fellows there are, the more value exists in principle, but the harder it becomes to find, activate, and use that value.
Robert Metcalfe, working in the early days of computer networking, observed that the value of a communications network is proportional to the square of the number of users; for 31,000 Fellows that implies roughly 480 million potential pairwise connections. David Reed, in his 1999 paper "The Sneaky Exponential: Beyond Metcalfe's Law to the Power of Community Building," went further: the value of a network in which subgroups can form grows as 2 to the power of N, which for 31,000 Fellows produces a number exceeding the estimated atoms in the observable universe. Most of that value is latent. The question is not how to realise all of it but how to navigate it intelligently.
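The arithmetic behind both figures is easy to verify; a short check, assuming N = 31,000:

```python
import math

n = 31_000                                   # Fellows
pairs = n * (n - 1) // 2                     # Metcalfe: possible pairwise connections
print(f"{pairs:,}")                          # 480,484,500 — roughly 480 million

# Reed: the number of possible subgroups grows as 2^N. Too large to evaluate
# directly, so compare orders of magnitude with the ~10^80 atoms in the
# observable universe.
log10_subgroups = n * math.log10(2)
print(f"2^{n} is about 10^{log10_subgroups:,.0f}")   # about 10^9,332, dwarfing 10^80
```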
This is what I have called the Density-Value-Accessibility Paradox. As connection density increases, the theoretical value of the network rises, but accessibility falls. The Lombard effect, first described by the French otolaryngologist Étienne Lombard in "Le signe de l'élévation de la voix" in the Annales des Maladies de l'Oreille et du Larynx (volume 37, number 2, 1911), provides an acoustic analogy: in a crowded room, as ambient noise rises, each speaker involuntarily raises their voice to be heard, which raises the ambient noise further, in a feedback loop of diminishing intelligibility. Beyond a certain threshold, speaking louder no longer improves understanding. Dense networks without intelligent filtering produce the same dynamic: more connections, more noise, less usable value. We can illustrate this by showing the paradox as a graph plotting theoretical value, accessibility, and usable value against connection density, with the peak of usable value at the optimal density, and the collapse of accessibility as density grows beyond it.
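The shape of the paradox can be sketched with stylised curves. The functional forms below are assumptions of mine, chosen only to exhibit the qualitative behaviour the paragraph above describes (theoretical value rising with density, accessibility decaying, usable value peaking at an intermediate density); they are not fitted to any data.

```python
import numpy as np
import matplotlib.pyplot as plt

density = np.linspace(0, 1, 200)                   # normalised connection density
theoretical_value = density ** 2                   # Metcalfe-style growth in potential value
accessibility = np.exp(-5 * density)               # navigability collapses as noise rises
usable_value = theoretical_value * accessibility   # what can actually be found and used

plt.plot(density, theoretical_value, label="theoretical value")
plt.plot(density, accessibility, label="accessibility")
plt.plot(density, usable_value, label="usable value")
plt.axvline(density[np.argmax(usable_value)], linestyle="--", label="optimal density")
plt.xlabel("connection density")
plt.legend()
plt.show()
```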
RSA Connect is the proposed architecture for resolving this paradox. At its core is a graph neural network trained on the Fellowship's semantic graph, comprising the vertices (Fellows) and edges (shared interests, projects, disciplines, locations, and prior collaborations) that constitute the network's structure. The GNN learns to infer plausible links, identify clusters, and surface routes through the network that would not be visible to any individual Fellow navigating alone. Human response, whether acceptance or rejection of a suggested connection, is itself the prediction error that drives the system's learning. In Friston's terms, RSA Connect is an active inference engine at organisational scale, minimising the gap between its model of the Fellowship's potential connections and the Fellowship's actual behaviour.
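RSA Connect does not yet exist as code, so the following is no more than a sketch of the learning loop described above, using simple embedding-based link scoring in place of a full graph neural network; every name and parameter is hypothetical. The point it illustrates is that acceptance or rejection of a suggested connection is itself the error signal that updates the model.

```python
import numpy as np

rng = np.random.default_rng(42)
n_fellows, dim, lr = 1000, 16, 0.05

# One learned vector per Fellow; in a full GNN these would be produced by
# message passing over the semantic graph rather than stored directly.
embeddings = rng.normal(scale=0.1, size=(n_fellows, dim))

def link_score(a: int, b: int) -> float:
    """Predicted probability that Fellows a and b would form a useful connection."""
    return 1.0 / (1.0 + np.exp(-embeddings[a] @ embeddings[b]))

def feedback(a: int, b: int, accepted: bool) -> None:
    """Human response is the prediction error that drives learning."""
    error = (1.0 if accepted else 0.0) - link_score(a, b)   # gap between model and behaviour
    grad_a, grad_b = error * embeddings[b], error * embeddings[a]
    embeddings[a] += lr * grad_a
    embeddings[b] += lr * grad_b

# Example cycle: suggest the highest-scoring candidate for Fellow 0, record the response.
best = max(range(1, n_fellows), key=lambda i: link_score(0, i))
feedback(0, best, accepted=True)   # the Fellow's acceptance nudges the model
```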
Navigation tools, heuristic filters adjustable through a user dashboard, allow Fellows to weight different enquiry strategies: topical relevance, geographic proximity, collaboration history, shared goals, complementarity of skills. The result, at scale, is not a hive mind but an intelligence amplifier: a system in which the collective is more capable than the sum of its parts precisely because individual distinctiveness and cognitive diversity are preserved rather than optimised away. The RSA already exhibits, in visible and benign form, the same structural condition that causes catastrophic failure in large institutions: high latent value, expertise distributed across clusters with limited cross-cluster flow, and human cognitive limits on navigation. If the architecture works here, it demonstrates something more general than a member platform.
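A minimal sketch of how such dashboard weights might combine into a single navigation score follows; the field names and weights are hypothetical, chosen only to show the mechanism of an adjustable, weighted enquiry strategy.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    # Hypothetical per-candidate signals, each normalised to the range 0..1.
    topical_relevance: float
    geographic_proximity: float
    collaboration_history: float
    shared_goals: float
    skill_complementarity: float

# Weights a Fellow might set on the dashboard for one enquiry strategy.
weights = {
    "topical_relevance": 0.4,
    "geographic_proximity": 0.1,
    "collaboration_history": 0.1,
    "shared_goals": 0.2,
    "skill_complementarity": 0.2,
}

def navigation_score(c: Candidate) -> float:
    """Combine the signals according to the user's current weighting."""
    return sum(weight * getattr(c, name) for name, weight in weights.items())

ranked = sorted(
    [Candidate(0.9, 0.2, 0.0, 0.7, 0.8), Candidate(0.3, 0.9, 0.6, 0.4, 0.1)],
    key=navigation_score,
    reverse=True,
)
```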
7. The Four Code Paradigm and the Case for AuHI
The argument converges on a claim about why augmented human intelligence is the most consequential site of AI's power. This convergence is expressed through the Four Code Paradigm, which I have developed as a framework for understanding the timescale problem that underlies the present moment.
Four adaptive codes operate at asynchronous timescales. The planetary code, the physical and chemical systems of Earth, changes over geological time and defines the absolute boundary conditions within which all life operates. The DNA code changes over evolutionary time and encodes the adaptive solutions that preceding generations found. The cultural code, encompassing accumulated knowledge, practices, technologies, and institutions, changes over historical time, operating through learning, language, and transmission rather than genetic inheritance. The software and AI code changes over computational time, from milliseconds to months, and represents the most recently emergent adaptive system.
Each successive code is faster and more plastic than the one beneath it. The mismatch between them is the defining challenge. The cultural and software codes have enabled humans to alter the planetary environment at a rate the planetary and DNA codes cannot absorb. The cultural code is itself subject to significant inertia: human behaviour, social norms, economic systems, and political structures resist the rapid, fundamental change the planetary situation requires. Knowledge of the problem exists. Action proportionate to that knowledge does not follow. It is in this context that AuHI acquires its full significance. The question is not whether AI can surpass human intelligence on some scalar measure; it is whether AI-human hybrid systems can help cultural systems adapt quickly enough to remain in viable relation with the slower codes on which they depend.
The survivability relationship can be expressed provisionally as: Survival is approximately proportional to (Cultural Code times Software/AI Code) divided by (Planetary Constraint times Biological Latency). This is a conceptual expression of dependency relations, not a formal equation, but its structure is precise: the speed and adaptability of cultural and technological systems need to increase in proportion to the limits imposed by planetary and genetic constraints. AI, on this account, is not a replacement for human intelligence. It is a response to temporal mismatch.
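Written in notation, purely as shorthand for the dependency structure just described and with symbols of my own choosing, the relation reads:

```latex
S \;\propto\; \frac{C_{\text{cultural}} \times C_{\text{software/AI}}}{K_{\text{planetary}} \times L_{\text{biological}}}
```

where the numerator terms stand for the adaptive speed of the cultural and software/AI codes, and the denominator terms for planetary constraint and biological latency.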
The RSA was founded in 1754 as a declaration of Enlightenment confidence: that human enquiry, properly organised, could reshape material life. It was born at precisely the moment the Industrial Revolution became self-conscious, when it was possible to believe that innovation could be induced, directed, and scaled. That founding DNA is not obsolete. It requires a new instrument. The question the RSA should be asking is not how to respond to AI as it arrives. It is how to design the hybrid systems that give collective human intelligence its best chance of meeting the problems it has created. That question is organisational and ethical before it is technical. And the RSA, with its intellectual diversity, its public purpose, its 270-year record of institutional adaptation, and its 31,000 Fellows, is as well placed as any institution alive to begin answering it.
Stephen Feber FRSA · March 2026 · Working draft for RSA lecture · resilientfuture.vision
Frame of Intention: Reaching for Coffee
This instrument maps a dynamic state frame — the purposive sequence of reaching for, grasping, raising, and ingesting a cup of coffee — together with its pre-state frame: the not-cold-start in which deep priors are already active before intention forms. Three layers are modelled throughout: mathematics — the repertoire space and upper/lower bounds within which the system can vary; physics — the actual trajectory of precision weighting across the 11 systems; and biology — the intersection, where the mathematics meets the physics and produces minded behaviour. The dual-register ovals show gross anatomical wiring (grey inner core) and learned adaptation (coloured outer oval). Panel 4 is explicitly theoretical.
This panel shows how the precision weighting of seven dynamic body systems varies across the four phases of the coffee sequence, approached from the perspective of 11 integrated bodily systems in total (the remaining four — persistent and canalised — appear in Panel 2).
This is theoretical modelling, not measured data. The curves illustrate a principled account of how weighting might vary, grounded in the framework of active inference — a theory of brain function developed by Karl Friston and colleagues, in which the brain is understood as a prediction machine. Rather than passively receiving sensory information, the brain continuously generates predictions about the world and updates them when prediction errors arise.
The mathematics of active inference is organised around a quantity called variational free energy. This is a mathematical construct — not electrochemical or metabolic energy, which cannot be directly measured in this context. The term is borrowed from statistical physics by analogy, but it lives entirely in mathematical space. It measures how well the brain's current model accounts for incoming sensory data: minimising it means making the model as accurate as possible while penalising excessive revision of stored expectations. It describes what the brain is doing in principle, not what is being measured in tissue. The weighting of each system reflects how much variational free energy is being allocated to it at any moment.
Central to active inference is the concept of priors — the brain's stored expectations about how something will unfold, built from past experience. When you reach for a coffee cup, you are not computing the reach from scratch; you are running a prediction shaped by thousands of prior reaches, correcting it as sensation arrives. A deep prior is a highly consolidated expectation, resistant to revision — the system runs largely on stored structure. A shallow prior is less consolidated: the system depends more heavily on current sensory data to guide behaviour. This distinction drives the width of the mathematical envelope and the size of the dual-register ovals.
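The distinction between deep and shallow priors can be made concrete with a toy precision-weighted update. This is a standard Bayesian combination of two Gaussian estimates, offered as illustration rather than as the instrument's actual model: a high-precision (deep) prior is barely moved by a surprising sample, while a low-precision (shallow) prior is pulled strongly toward the data, which is why its envelope is drawn wider.

```python
def precision_weighted_update(prior_mean, prior_precision, observation, sensory_precision):
    """Combine a stored expectation with incoming data, weighted by their precisions."""
    posterior_precision = prior_precision + sensory_precision
    posterior_mean = (prior_precision * prior_mean
                      + sensory_precision * observation) / posterior_precision
    return posterior_mean, posterior_precision

surprising_temperature = 70.0   # the coffee is hotter than expected (degrees C)

# Deep prior (e.g. the consolidated visual reach model): barely revised.
print(precision_weighted_update(prior_mean=60.0, prior_precision=20.0,
                                observation=surprising_temperature, sensory_precision=1.0))

# Shallow prior (e.g. thermal expectation at the lips): revised substantially.
print(precision_weighted_update(prior_mean=60.0, prior_precision=0.5,
                                observation=surprising_temperature, sensory_precision=1.0))
```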
There is supporting evidence from magnetoencephalography research — particularly emerging OPM-MEG work consistent with hierarchical prediction error signalling in sensorimotor tasks — but no study has simultaneously mapped precision weighting across all 11 systems during a naturalistic action sequence. This instrument is therefore explicitly illustrative.
The faint coloured band around each curve is the mathematical envelope — the repertoire space within which weighting can vary. Deep-prior systems (Visual, Musculoskeletal) have narrow envelopes; shallow-prior systems (Thermal — Lips, Thermal — Cheeks) have wider ones. The travelling ovals are notional and illustrative. The outer coloured oval represents the anticipation window — how far ahead the system is already predicting. This is simplified here as hippocampal anticipation, following recent work on hippocampal pre-excitation (Christoff et al., 2009). In practice, anticipation is distributed across multiple neural structures: the cerebellum runs forward models for motor prediction; the basal ganglia anticipate phase transitions; the anterior insula anticipates interoceptive and thermal state changes; the orbitofrontal cortex anticipates reward and outcome. The grey inner core of each oval represents the gross anatomical wiring — the fixed biological connections underlying learned behaviour. The gap between core and outer oval is, notionally, where learning lives. The sequence begins in the pre-state frame (shaded left region): no system starts from zero.
Four systems run continuously throughout and beyond the sequence — cardiovascular, respiratory, endocrine, and nervous/cerebellar. They do not take turns leading; their weighting remains narrow-band throughout. They are the mathematical substrate of the dynamic state frame: the always-present biological foundation that makes the sequence possible. These systems are largely gross-wired — their architecture is anatomically fixed rather than primarily learned — and they constrain the space within which the dynamic systems can operate.
The most visible event in this panel is the respiratory dip in Phase IV (Ingest). When liquid crosses the threshold of the lips and the swallowing reflex is triggered, breathing pauses automatically — the airway closes to prevent aspiration of fluid into the lungs. This is swallow apnea: a precisely coordinated deep prior executed entirely below the threshold of conscious access. The respiratory system does not stop; it executes a stored programme, then resumes. This is the clearest demonstration in the entire instrument of a canalised system managing a critical bodily event without any conscious direction.
This panel shows where conscious attention is directed throughout the sequence, and how wide or narrow the attentional beam is at each moment. The vertical axis runs from somatic (bottom) to cognitive (top). The named system reference lines on the right show where each body system sits in the cognitive-somatic register.
The amber centre line is the primary focus of attention at each moment. The filled amber band is the full field of simultaneous conscious access — what is in the foreground and middle ground at once. The faint blue region beyond the band shows what is available but not currently in focus — systems that could be attended to if directed, but are not. The outermost faint region is the mathematical bandwidth — how wide the attention band could theoretically span given the full repertoire. The gap between the mathematical bandwidth and the actual band is where the biology lives: the organism choosing, moment by moment, within the space the mathematics permits.
The band is widest during Phase II (Grasp), when multiple systems are simultaneously available to conscious attention. It narrows sharply at the Limit Condition (Phase IV, Ingest), converging almost entirely on the somatic floor — Thermal-Lips dominates, and the rest of the ensemble recedes from conscious access while continuing to run.
This panel is explicitly theoretical. It models the brain/mind interface — the intersection where the mathematics of the generative model meets the physics of neural firing, and produces something neither alone can account for: a minded body in purposive action.
The grey baseline running throughout is the gross-wired anatomical adjacency — the fixed connectivity between brain regions that exists regardless of learning. It is always present, always low-amplitude, and represents the biological floor of the dynamic state frame. Above it, stochastic spikes represent topological activation of networks adjacent to the currently active inferential loop. These are not random: they cluster at moments of high network activation (the Grasp spike, the Limit Condition) when intense firing makes excitation of structurally proximate networks more probable. This is consistent with connectome research showing that structural proximity predicts functional co-activation (Human Connectome Project). The apparently spontaneous associations that arise during purposive action — a landscape recalled mid-reach, a face during the raise — may reflect this mechanism rather than repression or suppression in the Freudian sense.
Three registers are shown. Fugitive activations decay within seconds and do not reach conscious access. Working memory events persist long enough to be available to attention. Consolidation events — the largest spikes, occurring at the Limit Condition — represent moments where the physics updates the mathematics: this episode is being encoded as a new prior, marginally reshaping the hippocampal model for the next cup of coffee.
In the pre-state frame, pre-ignition events show hippocampal substrate activity warming before the intentional sequence opens, consistent with Christoff et al. (2009), who found hippocampal and default mode network activity preceding conscious awareness by several seconds in meditators observed under fMRI. The panel draws further on Frankland and Bontempi (2005) on systems consolidation, and on Baddeley's model of working memory and its neural implementations. The fugitive activation register is the most speculative element and is acknowledged as such.