A GUIDE TO WHY THE INTEGRATION OF DIGITAL INTELLIGENCE IS BECOMING EVER-MORE VITAL

  • 5 days ago
  • 23 min read

By Phillip Black


There's a strange disconnect at the heart of AI today.


On one hand, we have systems that compress most of the written knowledge of humanity into a few hundred gigabytes of weights. They can converse fluently about quantum field theory, draft a contract and write a sonnet - all before breakfast. 


On the other hand, without the correct harness, those same systems will confidently invent a citation, lose track of a simple rule three turns into a chat, or fail at a task a child finds trivial. 


The intelligence of these systems is, to borrow a now-popular phrase, jagged. Brilliant in one dimension, bewildering in another.


The instinct, especially in commercial settings, is to treat this jaggedness as temporary - and fixable by better engineering. A bigger model, more data, a better fine-tune, a new interface. All are held out as ways to close the gaps. And there’s no doubt that engineering will relieve at least some of the symptoms. But we don’t believe the gaps will be closed entirely. Because to believe they can be eliminated by improving LLMs is to believe that intelligence is one single thing. A monolithic whole that can be contained in a single form.


Which is a problem. Because in reality, it is no such thing at all. Believing otherwise means we start with a fundamentally bad assumption.


The literature - across psychology, neuroscience, AI, philosophy, biology and economics - contains at least a dozen serious, mutually compatible definitions of intelligence. Each one describes something real. Each one suggests a different benchmark. And each one points to a different kind of digital system the world might want to build.


Once you see intelligence as taking many divergent forms, two things become clear. First, today's LLMs are genuinely intelligent on several axes. Which explains their utility and consequent popularity. Second, they are genuinely unintelligent on others. And not because the engineers haven't tried hard enough. The gaps are structural.


This short essay is an attempt to lay out a more complete intelligence map, place LLMs on it and sketch what might be next. Not to diminish what current models can do - they are remarkable tools, which are creating real value. But because anyone trying to deploy them to their fullest capacities - or build the next generation of digital intelligences - needs a richer vocabulary than ‘AI’ provides alone.


The map: A dozen definitions (for starters)


A complete and detailed romp through every definition of intelligence would take more time than we have here. That’s not to say it wouldn’t be illuminating and worthwhile - we’ve done it and it’s time well spent. But for our purposes here, let’s focus on a broader understanding of the most important forms of intelligence. Each one included here captures something the others miss.


Intelligence as prediction. The ability to anticipate what comes next, whether the next word, the next sensor reading or the next move in a game. Predictive accuracy is the workhorse of modern machine learning.


Intelligence as compression. The ability to find latent structure in data so that it can be represented in fewer bits. A system that has discovered the laws of motion can describe a million observations with a handful of equations. Jürgen Schmidhuber, one of the pioneers of modern neural network research, and Marcus Hutter, whose work on universal artificial intelligence has shaped much of theoretical AI, have argued forcefully that this kind of compression is the fundamental essence of learning. Science compresses observations into laws. Mathematics compresses truths into theorems. A grandmaster compresses thousands of positions into usable patterns. All expertise lives here.
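

A small sketch can make the compression idea concrete. It is only an illustration, and the data is invented for the purpose: one byte sequence follows a simple repeating rule, the other is seeded pseudo-random noise. A general-purpose compressor (Python's standard zlib module here, standing in for any learner) can shrink the first dramatically because there is structure to find, and can do almost nothing with the second.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more structure found)."""
    return len(zlib.compress(data, 9)) / len(data)


N = 10_000

# Data generated by a simple rule: a pattern that repeats every 256 bytes.
structured = bytes((i * 7) % 256 for i in range(N))

# Data with no rule to find: pseudo-random bytes (seeded for repeatability).
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(N))

structured_ratio = compression_ratio(structured)
noise_ratio = compression_ratio(noise)

print(f"structured: {structured_ratio:.3f}")
print(f"noise:      {noise_ratio:.3f}")
```

zlib only finds repetition. It would not discover a law of motion. The point of the sketch is narrower: how far something can be compressed depends on how much regularity the compressor is able to represent.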


Intelligence as goal achievement. A definition coined by Shane Legg - co-founder of DeepMind - and Marcus Hutter again: the ability to achieve goals across a wide range of environments. The emphasis is on generality and adaptability. And in this sense, a thermostat is intelligent in one tiny domain. A human is intelligent across thousands.


Intelligence as adaptation. This is the biological framing. The capacity to adjust behaviour as the environment changes. Of course, beneath the question is always a deeper question: what's driving the adaptation? In biology, the answer is survival. Every adaptive trait we know about exists because organisms with it persisted. And ones without it did not. The intelligence isn't the survival instinct. It's the machinery which delivers the survival outcome - the prediction, the learning, the abstraction. Machinery that environmental pressures have shaped over millions of years. Plants, fungi, octopuses, crows, ants and humans are all forms of embodied, adaptive intelligence.


Intelligence as rationality. Choosing well, under conditions of uncertainty and constraint. Herbert Simon's insight - he called it bounded rationality - was that no real mind optimises, because it lacks the information, time and compute. Instead, it satisfices - leaning on rules of thumb the environment has shaped. Daniel Kahneman mapped how those shortcuts fail in predictable ways, while the Bayesian tradition sets the standard underneath: rational beliefs that cohere, updated in proportion to the available evidence that supports them.
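

For readers who want to see what 'updated in proportion to the evidence' means mechanically, here is a minimal sketch of Bayes' rule for a single yes-or-no hypothesis. The function name and the probabilities are illustrative assumptions, not taken from any particular system, and the sketch assumes the evidence is possible under at least one of the two cases.

```python
def bayes_update(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Posterior probability of a hypothesis after seeing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    evidence = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / evidence


# Hypothetical numbers: start undecided, then see the same kind of evidence three
# times, each four times more likely if the hypothesis is true than if it is false.
belief = 0.5
for _ in range(3):
    belief = bayes_update(belief, p_evidence_if_true=0.8, p_evidence_if_false=0.2)
    print(round(belief, 3))
```

Evidence that is equally likely either way leaves the belief exactly where it was. That is the sense in which the update is proportional to what the evidence actually supports.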


Intelligence as learning. The improvement of performance, based on the feedback received from prior experience. Less about what a system can currently do, more about how it gets better.


Intelligence as abstraction. The capacity to form and manipulate concepts - numbers, categories, causes, metaphors. Humans excel here. It's what lets us transfer a lesson from one domain to a completely different one.


Intelligence as causal modelling. The contribution of Judea Pearl, the Turing-Award-winning computer scientist who built much of the modern mathematical theory of causality: the ability to reason about interventions and counterfactuals, not just correlations. ‘What would happen if I did X?’ is a different question from ‘What tends to follow X?’ And a different kind of system is needed to answer it.
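

The gap between those two questions can be shown with a toy simulation. Everything in it is an invented assumption for illustration: a hidden factor Z drives both X and Y, and X has no effect on Y at all. Watching the world, Y tends to follow X. Setting X by hand changes nothing.

```python
import random


def draw(rng: random.Random, do_x=None):
    """One sample from a toy world where Z drives both X and Y, and X does nothing to Y."""
    z = rng.random() < 0.5
    if do_x is None:
        x = z if rng.random() < 0.9 else not z  # observed: X usually copies Z
    else:
        x = do_x                                # intervention: we set X ourselves
    y = z if rng.random() < 0.9 else not z      # Y usually copies Z and never reads X
    return x, y


def rate(ys):
    """Fraction of samples in which Y was true."""
    return sum(ys) / len(ys)


def observed_gap(rng, n=20_000):
    """P(Y | X=1) - P(Y | X=0), estimated from passive observation."""
    samples = [draw(rng) for _ in range(n)]
    y_when_x1 = [y for x, y in samples if x]
    y_when_x0 = [y for x, y in samples if not x]
    return rate(y_when_x1) - rate(y_when_x0)


def intervened_gap(rng, n=20_000):
    """P(Y | do(X=1)) - P(Y | do(X=0)), estimated by forcing X."""
    y_do1 = [draw(rng, do_x=True)[1] for _ in range(n)]
    y_do0 = [draw(rng, do_x=False)[1] for _ in range(n)]
    return rate(y_do1) - rate(y_do0)


rng = random.Random(0)
print(f"what tends to follow X:  {observed_gap(rng):+.2f}")
print(f"what happens if I set X: {intervened_gap(rng):+.2f}")
```

A system that only ever sees the first kind of data has no way, from the data alone, to tell this world apart from one in which X really does cause Y.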


Intelligence as search. The classical AI definition. Efficient navigation of a space of possibilities. Chess engines, theorem provers and planning systems are all strong examples.


Intelligence as embodied action. Cognition emerging through interaction with a physical environment. A position favoured by roboticists and ecological psychologists who think disembodied intelligence is a category error.


Intelligence as social coordination. Answers and behaviours that can only emerge from collaboration and coordination within social environments. This traverses theory of mind, communication, cultural transmission and coalition-building. Some anthropologists argue this is the primary axis of human intelligence - and that it is more significant than raw individual intelligence. Ants, bees and even plants also exhibit social intelligence.


Intelligence as meta-cognition. The capacity to think about one's own thinking. To know what you know and, often more pertinently, to know what you don't. This also implies the ability to calibrate confidence against evidence, construct arguments that can be examined and defended, and to revise them when the defence fails. And - crucially - to be able to say I don't know when a question is outside what your evidence can support, instead of confabulating something plausible. John Flavell, the developmental psychologist who coined the term meta-cognition in the late 1970s, framed it as knowledge about one's own cognitive processes. In practice it's the difference between generating an answer and standing behind one. The pinnacle of formal intellectual achievement is, after all, the viva - not the production of a thesis but the public defence of it. You don't fully know something until you can be challenged on it and hold your ground for the right reasons. This is the hallmark of expertise, and it's one of the things that most reliably distinguishes a senior practitioner from a confident novice.


Intelligence as efficient encoding. From neuroscience: the brain compresses sensory input to minimise redundancy and metabolic cost. Using this lens, perception itself is a form of compression, in service of action.


There are more we could have included besides these: multiple intelligences, emotional intelligence, collective intelligence, self-improvement and free energy minimisation. But the above list gives us enough to make a strong start with the analysis.


Why the map matters: a stack, not a competition


It would be tempting to treat these as rival theories, with one of them being the ‘right’ answer. But that is not - absolutely not - how they relate. Let’s peer more deeply to understand why. 


Each of these definitions describes part of how a system builds an internal model of its world. One that supports successful action. But each captures only part of the picture. Because each is looking at a different aspect of that model. For instance:


  • Prediction and compression are about how you acquire the model.

  • Abstraction and causality are about how you represent it.

  • Search and rationality are about how you use it.

  • Adaptation, learning and self-improvement are about how it changes over time.

  • Embodiment and sociality are about where the model lives and what it's for.

  • Meta-cognition and defensibility are about how you stand behind what the model produces.

  • Goal achievement is the measure of whether any of it worked.


Seen this way, the definitions aren't a dozen different intelligences. They're facets of a broader capability. Any holistically intelligent system would need to do all of them. A partial system can be extraordinarily strong on some facets. And yet, as we’ve noted with LLMs, completely miss the mark on others.


Where LLMs sit on the map


What happens when we try to assess, in terms of strengths and weaknesses, where current frontier models actually score?


Strong: prediction - but only of one specific thing. This is the facet that needs the most careful unpacking, because the idea that ‘LLMs are great at prediction’ hides a much narrower truth.


What LLMs are great at predicting is the next token in a sequence of human-written text. And that shouldn’t be a surprise because it's what they were trained on. 


They do not learn from direct experience of the world; they learn from linguistic traces of how humans, who have experienced it, then go on to represent that experience. And, it turns out, this is a far more powerful objective than almost anyone expected. Why? Because so much of human experience is encoded in language. 


But it does not mean LLMs are great at prediction in general. They are not great at predicting the weather, the three-dimensional structure of a protein, the response of a financial market, or the trajectory of a thrown ball. They can talk, fluently, about those things, because people have written about them. But they cannot do the prediction itself. 


Other systems do - AlphaFold for proteins, numerical weather models for the atmosphere, JEPA-style world models for physical dynamics. Each was trained, specifically, to predict the thing it predicts.


This tells us that ‘prediction’ isn't a single capability the field is gradually unlocking. It is a family of capabilities, each tied to the data a system was trained on. If language is a cultural compression of human world-experience, the open question becomes how much world-modelling can be learned indirectly. And at what point direct experience becomes necessary. The truth is that we don't yet know where that ceiling sits - but we can already see there is a ceiling. If you want a system that predicts physics, you have to train it on physics - and not just on text describing physics. If you want a system that predicts how a cell will respond to a drug, you have to train it on that. The intelligence is in the representation, and the representation comes from what the system was trained to predict.


Strong: compression - of what was in the training set. This is a necessary skill of any model that is good at next-token prediction. To compress text, you need an internal representation that captures regularities - syntactic, semantic, factual, stylistic. Modern LLMs have, in some meaningful sense, compressed an enormous chunk of written human knowledge into a few hundred billion parameters. The effectiveness of this technique is why so many people have gravitated towards the compression-is-intelligence idea. The caveat is the same as for prediction: the compression is over the training distribution. A model trained on the internet has not compressed the genome, the weather, or the dynamics of a manufacturing line. It has compressed what people have said about human experience and knowledge.


Strong: abstraction, with caveats. LLMs manipulate concepts. They can analogise, classify, generalise, and operate at multiple levels of description. But here it's worth invoking Alfred Korzybski's old observation: the map is not the territory. Language is a remarkable map of the world - perhaps the most compressed, richest, most reusable map our species has ever produced - but it remains a map. It describes the territory; it isn't the territory itself. LLMs have inherited our extraordinary atlas, and they can navigate it with astonishing fluency. What they cannot do is step outside it. Their abstractions are inherited from the descriptions humans have written, not constructed from contact with the world those descriptions point to. So they work brilliantly where the map is accurate and well-drawn, and they fail in the places where the map is silent, distorted or simply wrong. Which is exactly what we see in practice.


Moderate: search. Modern systems with chain-of-thought, tool use, and agentic scaffolding do search - they explore solution spaces and evaluate what they find. But they do it slowly, at high cost, and without the deep structural advantages that classical search algorithms bring to bear on well-defined problems. A chess engine still beats a language model at chess by orders of magnitude in efficiency.


Moderate: rationality. They can produce rational arguments, weigh evidence and apply Bayesian reasoning when prompted. They can also be talked out of any of it. Coherence under pressure is uneven. Rationality is present but often fails to hold.


Weak: meta-cognition and defensibility. This is perhaps the most consequential weakness for commercial use, and the one that most clearly separates a frontier model from a human subject matter expert. LLMs are notoriously poorly calibrated about what they know and what they don't. They will confidently produce an answer, a citation or a justification - even when the honest response would be: ‘I don't know’. They will even produce convincing-sounding reasoning that does not actually defend the conclusion it appears to defend. A human expert, by contrast, knows the edges of their own knowledge - they can tell you when a question is outside the evidence, when a conclusion is shaky and why, when they'd want a second opinion, and when they'd stake their reputation on the answer. Almost everything we build around LLMs today - retrieval augmentation, citation pipelines, verifier models, adversarial checking, evaluator chains - is compensation for the model's lack of this facet at its core. We are, in effect, trying to build externally what an expert does internally. It works, up to a point. But it's expensive, brittle and never quite as reliable as the real thing.
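

Calibration can be measured, and a short sketch shows how. The version below is a simple expected calibration error: group answers by the confidence that was stated, then compare each group's stated confidence with how often it was actually right. The two answerers and their records are hypothetical, made up purely to show the calculation.

```python
from collections import defaultdict


def expected_calibration_error(records, n_bins: int = 10) -> float:
    """Average gap between stated confidence and actual accuracy, weighted by bin size.

    records: iterable of (confidence in [0, 1], was_correct as bool).
    """
    bins = defaultdict(list)
    for confidence, correct in records:
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))

    total = sum(len(items) for items in bins.values())
    error = 0.0
    for items in bins.values():
        mean_confidence = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        error += (len(items) / total) * abs(accuracy - mean_confidence)
    return error


# Hypothetical data: two answerers who both claim 90% confidence on ten questions.
expert = [(0.9, True)] * 9 + [(0.9, False)]        # right nine times out of ten
bluffer = [(0.9, True)] * 5 + [(0.9, False)] * 5   # right five times out of ten

print(f"expert:  {expected_calibration_error(expert):.2f}")
print(f"bluffer: {expected_calibration_error(bluffer):.2f}")
```

Both answerers sound equally sure. Only the first has earned it, and the score is one rough way to tell the two apart from the outside.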


Weak: adaptation and learning. This is one of the sharpest gaps. Once trained, a language model's weights are frozen. It cannot, in any meaningful sense, learn from the conversation it's having with you. Context windows give it a working memory, but not a way to update its understanding. The contrast with biological learning, where every experience modifies the system in some small way, is stark.


Weak: causal modelling. LLMs are extraordinarily good at correlation in language space. They can describe causal reasoning fluently. But when probed on whether they can reliably distinguish ‘X causes Y’ from ‘X is correlated with Y in the training data’, results are mixed at best. They lack the structural machinery for counterfactuality.


Weak: goal achievement across environments. LLMs achieve goals in one environment - that of language, as represented by text. Extending that to physical environments, long-horizon plans or novel domains requires elaborate scaffolding that does most of the work. Because the model can't do it on its own.


Largely absent: embodiment. They have no body, no proprioception, no sensorimotor loop. Whether this matters depends on whether you think embodied cognition is essential or contingent. There are serious researchers on both sides, but you might like to note that every biological intelligence we know of is embodied. None of them solved cognition by being a disembodied next-token predictor.


Largely absent: social coordination in any meaningful sense. They can simulate it. They can pass a Turing test in a single conversation. They cannot maintain a relationship, build trust over time, or participate in a culture in the way humans do - because they have no persistent self. There is no entity with which to have an actual social relationship across time.


Largely absent: self-improvement. They cannot modify their own weights, examine their own reasoning or improve their own algorithms. Each conversation starts from the same place. Whatever insight emerges in one session is lost in the next.

The above analysis exposes the jagged intelligence profile of LLMs. They are systems that are extraordinarily strong on a specific cluster of facets - prediction, compression and abstraction. But conspicuously weak on many others.


Why the jaggedness is structural


It would be easy to conclude that the weak areas are just engineering problems that will close over time. Some of them probably will. Others, we believe, are structural - they are baked in by the architecture. And closing them requires different architectures, not just bigger ones.


Three structural facts about transformer-based LLMs are worth keeping in mind.


  1. They are static after training. The learning happens elsewhere, on a different timescale, with different machinery. This is unlike any biological intelligence. And it's the root cause of the adaptation and learning weakness.

  2. They are trained to imitate the distribution of human-generated text. LLMs are not taught how to act in the world. Everything they ‘know’ about causality, embodiment, planning or social relationships is filtered through descriptions of those things in language. That's a remarkable amount of leverage, but it's also a fundamental limitation. They only learn what people say about causality. Which is a partial and often misleading guide to causality itself.

  3. They have no persistent state across interactions. Unless we bolt it on externally - using memory systems, vector databases or other agent frameworks and harnesses - the models themselves are single-session beasts. And the extensions are scaffolding around the model, not properties of it. They can compensate for the gap, but they don't close it.


These three facts together explain most of the jaggedness. They also explain why scaling, while it has produced extraordinary gains, doesn't close every gap. Some gaps are not about scale. And this leads to what may be the most important consequence: the missing facets cannot be bolted on. They have to be trained in.


This is exactly the baseline understanding that, in the rush to AGI and superintelligence, is most often glossed over. Debate and predictions of future states often continue as if causal reasoning - or planning or learning-from-experience - were features that could simply be appended to an LLM.


But each of those capabilities depends on the system having an internal representation suited to that capability. Representations which can only come from the model architecture and training regime. A system that has been trained to predict text tokens has a representation suited to predicting text tokens. If you want a system that predicts physical dynamics, you have to train it on physical dynamics. The resulting representation will look different. If you want a system that learns from its own experience, you have to train it in an environment where its actions have consequences it can learn from. And the resulting representation will, again, look different.


You can compose trained systems together, and people increasingly do. But you cannot retrofit a capability the model was never trained to develop. The intelligence is in the weights, and the weights were shaped by the objective.


A reasonable objection at this point is that the picture is changing - that what we now call ‘AI’ in commercial settings is rarely just a model. It's a system. The model sits inside a wrapper of tool use, retrieval, memory layers, agentic loops and human oversight. And that wrapper compensates for many of the gaps. 


To develop that argument: A base model may lack persistence, but the deployed product can have memory. A model may not act, but the system around it can act through tools. A model may not learn online, but the surrounding product can update retrieval stores, prompt templates, user profiles and fine-tuning loops. And all of this is genuinely true. But there is a more subtle point worth making here. In addition to the types of intelligence, there are three levels of intelligence we should bear in mind:


  1. Model intelligence: The capability held in the trained model's weights alone: what it can do as a static artefact, before tools, memory or scaffolding are bolted on.

  2. System intelligence: The capability of the deployed product, where the model is wrapped in tool use, retrieval, memory and agentic loops that compensate for gaps it has on its own.

  3. Socio-technical intelligence: The capability that emerges when the technical system and the people around it are optimised together: model and human experts fitted to each other and to the culture, norms and tacit knowledge of the work.


Each represents a meaningfully different scale at which intelligence emerges. Brightbeam’s ultimate mission is to furnish the world with better socio-technical intelligence, on the basis of what’s available at the model and system intelligence levels.


Which all goes to show, therefore, that you can introduce different forms of intelligence at these different scales. And if one form of intelligence does not exist at the scale of the model, provided it appears at the scale of the system, all is well. Right? Perhaps. But real-world experience keeps demonstrating that the capabilities trained into a model are reliably more effective than those engineered around one.


AlphaGo learned from a vast corpus of human games and beat the world champion. AlphaGo Zero was trained from a clean slate with no human games at all - and, after just three days of self-play, beat that version of AlphaGo decisively. Early systems bolted tool use onto language models via prompting; modern frontier models are trained to use tools natively, and the difference in reliability is substantial. Trust us on that one: everyone at Brightbeam benefits from it every working minute of every working day.


The lesson seems to be that scaffolding works as a transitional strategy - and an important one, often the right thing to ship today. But the durable advantage comes from training the capabilities in. Which suggests the structural gaps in the model matter precisely because they shape, over the long run, what the surrounding system can sustainably do. Scaffolding is provisional. Trained-in capability is durable.


What might be coming next


If the missing facets have to be trained in rather than bolted on, the interesting questions become: Who is training what? And against which objectives?


Recently the field has split. The frontier labs are doubling down on LLMs and the agentic systems built on top of them. Much of the original research community - including, often, the people who invented the modern deep-learning paradigm - has moved on, focusing on the facets where LLMs are conspicuously weak.


Several serious efforts are visible, none yet producing systems with the breadth and fluency of frontier LLMs. The next decade of AI will be shaped by how these threads develop and combine.


Yann LeCun, Meta's former Chief AI Scientist and a Turing Award winner for his foundational work on deep learning - now founder of Advanced Machine Intelligence Labs - has spent years arguing that next-token prediction over text is the wrong objective for building generally intelligent systems. 


His Joint Embedding Predictive Architecture programme - JEPA - trains models to predict abstract representations. The goal is to train in an intuitive understanding of physics from observation. This is ‘world modelling’: prediction of the actual world, rather than the text which describes it.


David Silver, the architect of AlphaGo and AlphaZero, recently left DeepMind to found Ineffable Intelligence. His thesis, the ‘Era of Experience’, is that systems which learn from their own experience via reinforcement learning - from a clean base, without human-generated training data - can develop capabilities that text-trained systems cannot. He's after adaptation, self-improvement and the discovery of strategies humans don't already know.


Ilya Sutskever, OpenAI's former chief scientist and one of the most cited researchers in modern deep learning, founded Safe Superintelligence Inc. after leaving OpenAI in 2024. His public position, articulated in interviews over the last year, is that we are moving from the ‘age of scaling’ to the ‘age of research’ - that the next breakthroughs will come from solving the problem of generalisation, particularly through systems that learn continually on the job rather than being trained once and frozen. SSI is, by his own account, looking for a new paradigm rather than scaling the existing one.


Karl Friston, the neuroscientist behind the free energy principle and one of the most influential figures in theoretical neuroscience, comes at it from a different direction again. His active inference programme treats intelligence as the minimisation of uncertainty about the world, with systems that build internal generative models, act to reduce uncertainty - and adapt continuously. 


The framing is biological, with the contrast to statistical pattern-matching over a frozen training set explicit.


There are others - causal modelling in the Pearl tradition, brain-inspired architectures of various kinds, embodied robotics, symbolic-neural hybrids. The point isn't to handicap any one of them. The point is that they exist in parallel, each targeting different parts of the intelligence map, each making a different bet about which facet of intelligence is the unlock.


What's striking about this taken as a whole isn't that any one is obviously the answer. It's that they aren't really rivals. LeCun's world models, Silver's experience-driven learners, Sutskever's continual generalists and Friston's active inference agents could, in principle, end up as components of the same eventual system - each contributing the facet it was designed for. 


We don't know yet whether they will, or which will turn out to be foundational and which will turn out to be dead ends. We don't know whether LLMs themselves will be central, peripheral, or - in the longest view - a stepping stone that historians of AI will eventually study as a remarkable but transitional artefact. The capabilities of current LLMs are too useful for them to be irrelevant. Whether they are still load-bearing in twenty years is genuinely an open question.


Several of these people are explicit, though, about where they think this is going. Sutskever named his company Safe Superintelligence. Silver's stated mission is to build a "superlearner". Demis Hassabis has long framed DeepMind's quest in similar terms, and now talks about AGI as a system that can exhibit all the cognitive capabilities humans can. The funding flowing into these efforts dwarfs anything the field has seen. And the public conversation has moved on from whether superintelligence is possible to when it arrives.


Not that it is without its internal contradictions. The original framing, popularised by Nick Bostrom, treats superintelligence as a system that exceeds human capability at virtually every cognitive task - a definition that assumes intelligence is one thing and superintelligence is more of it. The facet view doesn't fit that frame. Being ‘more intelligent’ is no more a single thing than being ‘as intelligent’ is in the first place. It depends on which facets, in which environment, for which task.


What that lets us see is something the headlines tend to miss: superhuman intelligence already exists, and has for years, on specific facets. 


AlphaFold predicts protein structure superhumanly. AlphaGo and its successors play Go superhumanly. Numerical weather models forecast the atmosphere superhumanly. Hassabis himself has used the word superhuman for these systems - and is candid that AlphaFold, despite winning a Nobel Prize, is in the bigger picture ‘hopelessly narrow’. They have built better internal models, for one domain, than any human can, and they act on those models with results no human can match. Under the definition this essay has been working with, that is intelligence - and on the facets they cover, it is already past us. As is, we could argue, a pocket calculator from the 1970s. All compute outperforms humans in given specialised tasks. What we don’t have yet is a ‘generalised’ intelligence covering all relevant facets.


Which is why the next move, which several labs are working on, is to compose facet-strong components into systems that exceed human capability across more axes at once. Hassabis is explicit that hybrid or neuro-symbolic architectures - the kind that produced AlphaGo and AlphaFold - are how the next breakthroughs are likely to come. None of this requires solving the whole map. It requires integrating enough of it, for the right kind of problem, in the right environment. That is closer than the public debate often suggests, and it is the direction most of the serious money is moving.


Which is the point that matters most for anyone deploying AI today. The destination people imagine when they say superintelligence - a system superhuman on every facet, including continual learning, causal reasoning, meta-cognition and embodied judgement - is still a long way off. 


We don't yet have working architectures for several of those facets. Hassabis, Sutskever and Silver have each said as much in their own language. 


But the destination people should be preparing for, because it is much closer, is contextual: systems that are genuinely superhuman at many particular tasks in many particular environments, assembled from components each trained for what they do, integrated alongside the humans whose tacit and procedural knowledge the system depends on. 


That isn't a distant prospect. That is the work of the next few years.


What this means in practice


The implication for anyone deploying AI today is straightforward. We all need to map our problems onto the intelligence facets - before we map them onto a solution.

Knowing which intelligence we need is half the battle.


If your problem is dominated by prediction, compression or fluent abstraction from human-written material - including language understanding, summarisation, code completion, classification and drafting - LLMs are extraordinary. Use them confidently.


If your problem requires causal reasoning, intervention, or counterfactual thinking, treat the LLM as a hypothesis generator, not a provider of conclusions. Pair it with explicit causal modelling, simulation - or simply the right expert human judgement.


If your problem requires learning from experience, the model won't do it for you. Retrieval augmentation, fine-tuning pipelines, and memory systems are prosthetics for an absent capability - useful, but not the same thing.


If your problem requires long-horizon planning, persistent goals or coordination across many actors, you need carefully designed agentic scaffolding. The model is one component, not the system.


And if your problem requires the answer to be defensible - auditable, traceable, calibrated and able to stand up under challenge - the model is the starting point, not the deliverable. You will need a verification layer that separates generation from defence: external evidence, adversarial checking, evaluator passes and the discipline to say ‘I don't know’ when the question is outside the evidence. 


The senior human expert does this internally. With an LLM, today, you have to do it externally. 
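As a rough illustration, the mapping above can be written down as a lookup from facet to approach. This is a minimal sketch, not a tool we are describing: the facet labels and the function names recommend and llm_alone_is_enough are assumptions made for this example, and the guidance strings simply paraphrase the paragraphs above.

```python
# Illustrative only: the facet labels below are hypothetical names for the
# categories discussed above, not a standard taxonomy.
GUIDANCE = {
    "fluent_abstraction": "Use the LLM directly.",
    "causal_reasoning": (
        "Treat the LLM as a hypothesis generator; pair it with causal "
        "modelling, simulation or expert human judgement."
    ),
    "learning_from_experience": (
        "Add retrieval, fine-tuning or memory, knowing these are "
        "prosthetics for an absent capability."
    ),
    "long_horizon_planning": (
        "Design agentic scaffolding; the model is one component, "
        "not the system."
    ),
    "defensibility": (
        "Add a verification layer that separates generation from defence."
    ),
}


def recommend(facets):
    """Return the guidance for each facet a problem needs, in the order given."""
    unknown = [f for f in facets if f not in GUIDANCE]
    if unknown:
        # Refuse to guess: an unmapped facet needs a human decision.
        raise ValueError(f"Unmapped facets: {unknown}")
    return [GUIDANCE[f] for f in facets]


def llm_alone_is_enough(facets):
    """True only if the problem needs nothing beyond fluent abstraction."""
    return bool(facets) and set(facets) <= {"fluent_abstraction"}


# Example: a contract review that must also survive challenge.
needs = ["fluent_abstraction", "defensibility"]
for line in recommend(needs):
    print(line)
print("LLM alone is enough:", llm_alone_is_enough(needs))
```

The point of the sketch is the order of operations: the facets are named first, and only then is a solution chosen. A problem that lists anything beyond fluent abstraction is flagged as needing more than the model.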


None of this is news to experienced practitioners. But it's worth saying explicitly - because the public conversation around AI tends to flatten all of these axes into a single dimension of how ‘good’ the model is. Which can make deployment decisions much harder than they need to be.


The problems waiting to be solved


If you want a sense of where the gap lies between what today's AI can actually deliver and what people really need, the evidence is all around us. Ask the experts how current LLMs are performing: the senior engineer who knows what a process should sound like when it's running well, the regulatory specialist who has internalised a thicket of rules into something that feels like judgement, or the operator on the shop floor who notices the thing nobody wrote down. 


Human expertise is full of tacit knowledge - what the philosopher and chemist Michael Polanyi famously described as the things we know but cannot fully tell. It is full of procedural knowledge that lives in action rather than in explanation. It is, by definition, the knowledge that didn't get written down clearly enough for an LLM to learn from text. It mostly goes uncaptured and remains reasonably informal in every organisation.


This is where the gap between current AI (and the enterprise data strategies that feed it) and useful intelligence shows up in commercial reality. 


Not in benchmark scores. In the messy, partially-articulated, situationally-dependent know-how of people who have done a thing for twenty years and can do it well in conditions nobody can quite describe. That kind of intelligence is built through experience, through embodied practice, through correction by people who know better, through the slow accumulation of pattern recognition. It is the kind of intelligence that the facets in our map - adaptation, embodiment, causal reasoning and learning from experience - are precisely about.


For anyone trying to deploy AI to support, augment, or eventually replicate that kind of expertise, the implications are direct. An LLM that has read everything ever written about welding is not a welder. It will struggle to explain, in any system you build around it, the parts of welding that are genuinely tacit. To get further, you need either the LLM working in collaboration with the human expert - letting the model handle what it's good at while the human supplies the tacit and procedural layers - or different kinds of systems trained against different objectives, of the kind LeCun, Silver, Sutskever, Friston and others are pursuing. 


This is, incidentally, why Brightbeam’s role as the integrator of digital intelligence becomes more important, not less, as the underlying technology matures. The future isn't one model that does everything. It's many specialised capabilities, each trained to do what it does, composed into systems that serve real human work - often alongside human experts whose tacit knowledge is, for now and probably for a long time yet, irreplaceable. 


The work of integration is figuring out the right type of intelligence for the right task at the right level of expertise, in the right environment, with the right level of human involvement - and getting all of that to fit the culture, norms and lived experience of the people who will actually use it. Choosing wisely between the components, and assembling them into something coherent - that respects both the strengths of the AI and the depth of human expertise - is itself a form of intelligence. 


And one that humans, for now, remain uniquely good at.


A definition worth holding


If we were to commit to a single working definition of intelligence after all this, it would be the one the literature keeps converging on:


‘Intelligence is the capacity to build internal models of the world that support successful action.’


That definition has the virtue of being capacious enough to accommodate all the facets, while being specific enough to be useful. It tells you that compression alone isn't enough, because compression without action is just storage. It tells you that prediction alone isn't enough, because prediction without intervention is just observation. It tells you that goal achievement alone isn't enough, because goals without good internal models lead to brittle behaviour.


It also holds across the full range of things we call intelligent. A single-celled organism navigating a chemical gradient is, in its own modest way, doing this. It has some internal representation of where the food is and where it isn't, and that representation supports successful action. It isn't social, it isn't reflective, it doesn't reason, it has nothing like a language. But it has a model and it acts on it. 


At the other end, a senior expert with decades of accumulated judgement is doing the same thing at vastly greater depth, across many more facets, with meta-cognition and defensibility and social coordination layered on top. 


So we can confidently conclude that the definition spans both ends of the intelligence spectrum, which is exactly what a good definition of intelligence should do. Anything narrower than this risks defining intelligence as human-style intelligence, which is a particular case, not the whole thing.


And the definition helps you, when you're looking at any system - biological, artificial, individual, collective - to ask the right questions. What models can it build? What was it trained to predict? What actions can it take? How does it learn when it's wrong? Can it say ‘I don't know’?


LLMs deliver certain outputs, such as computer code, extraordinarily well. They are not the whole of intelligence, but they are unmistakably a part of it, and the part they occupy is genuinely valuable. The work ahead - both technical and commercial - is to understand which part, to recognise honestly the parts they cannot reach, and to build the rest of the system around them, alongside them, or in some cases instead of them, with eyes open.


That's a much more interesting project than waiting for one model to do everything. And it's a much better foundation for the actual work of putting digital intelligence to use in service of the humans it's meant to help.


 
 