Why isn’t modern AI built around principles from cognitive science?
First post in a series on cognitive science and AI
I get asked a lot of questions about the relationship between AI & Cognitive Science, especially from early-career researchers wondering where their work might fit into these rapidly evolving fields. This is the first in a series of posts where I aim to lay out my current thoughts on the relationship between these fields (and the career options in them) in the present research environment. Everything here is my personal opinion; some researchers would stridently disagree with almost any claim I make. So take it with a grain of salt.
In particular, I want to lay out a few areas where I find cognitive science to provide consistently useful perspectives, both in the research I do and in the field more broadly. I also want to contrast that with some directions that are often pursued but that I personally believe may be less promising. This post begins with some history, and my perspective on why current AI models are not mainly designed around principles from cognitive science or neuroscience. In my next few posts, I’ll take a more optimistic perspective on where I find that my research in AI (and that of others) has gained a great deal from cognitive science.
A very brief history of AI & Cognitive Science
Historically, cognitive science and AI were tightly coupled fields, with insights in one quickly driving progress in the other. More recently, the exchange has seemed more one-sided. There has been an explosion of recent progress in AI, driven by increased computational power, growing datasets, architectural innovations motivated by machine-learning problems rather than cognitive inspirations, and the practical development of effective machine-learning frameworks with tools like automatic differentiation that make model development easy.
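(For readers coming from cognitive science, here is a minimal sketch of what that last point buys you — the tiny linear model, toy data, and learning rate are placeholders I made up for illustration, using JAX as one example framework. You write down a model and a loss, and the gradients needed to train it come for free, with no hand-derived update equations.)

```python
# Minimal sketch of automatic differentiation (JAX; model, data, and
# learning rate are placeholders chosen purely for illustration).
import jax
import jax.numpy as jnp

def predict(params, x):
    w, b = params
    return x @ w + b  # any differentiable model could go here

def loss(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

params = (jnp.zeros((3, 1)), jnp.zeros(1))
x, y = jnp.ones((8, 3)), jnp.ones((8, 1))

# grad() returns gradients with respect to all parameters automatically.
grads = jax.grad(loss)(params, x, y)
params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
```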
At least at a glance, most of these developments in AI do not seem to be driven by cognitive science. Instead, the recent flow has mostly run in the other direction. There has been a torrent of observations that AI vision models predict activity in visual cortex in animals (or humans), and even aspects of its spatial organization; likewise, language models can predict human imaging data remarkably well and reproduce known behavioral phenomena. Given these developments, cognitive scientists have suggested that AI progress refutes one of the dominant linguistic paradigms, suggests new perspectives on the constraints faced by the brain, and changes how cognitive scientists should approach research and theory-building. I’ll return to these themes of what AI might offer to cognitive science in another post. For now, though, I want to focus on the opposite direction: what cognitive science can offer to AI.
Why doesn’t modern AI build more often on what we understand about the brain?
We know a lot about the brain across many levels of analysis. We can describe the detailed biochemistry of neurons; we can identify how particular brain regions focus on different aspects of vision, language, memory, or cognitive control; we know how various aspects of intelligence develop from birth through adulthood; and we can explain computational principles that underlie aspects of subtle behaviors like pragmatic language inferences. Yet almost none of these insights has contributed anything major to AI. Why?
A snide answer might be that AI researchers just don’t know or care about cognitive science, and ignore it to their detriment. There may be some truth to this, but clearly it isn’t the whole picture — if for no other reason than that there are a *ton* of cognitive (neuro)scientists working in AI. The cofounders of DeepMind met in a computational neuroscience program. During my time in industry, there have easily been more people with PhDs in cognitive science or neuroscience working at DeepMind than there are faculty in any university cognitive science or neuroscience department in the world. There have been whole teams of such researchers. And that’s only in a single organization.
There have been plenty of papers over the years arguing that AI is missing core ingredients of human intelligence, and there are often works published at machine learning conferences that explicitly draw on such inspiration to propose new perspectives. For my own part, I’ve also spent plenty of time trying to engineer systems that incorporate principles of natural intelligence. I think there are plenty of things to be learned from such works.
However, where I’ve drawn on explicit ideas from cognitive science in designing architectures (and I think this often applies to the field more generally), I’ve often done so at a cartoonish level of abstraction that could in many cases be derived from first principles without needing a cognitive motivation (e.g., that a system should be adaptable to novel tasks based on their relationship to known ones, or that it might be useful to recall aspects of the past individually and in detail). Even so, the works in which I’ve built architectures around ideas from natural intelligence haven’t seemed to me to be the most successful or impactful things I’ve done.
And more generally, I’d argue that the current dominant paradigm in AI has not primarily grown out of approaches that design systems on the basis of cognitive science.
The bitter lesson
(This section will probably not be surprising to AI researchers, but I think it may be useful context for those coming from cognitive science who are less familiar with the theme.)
My own research journey has often reflected aspects of the bitter lesson that Rich Sutton (an RL pioneer) articulated about the spirit of modern machine learning vis-à-vis our own knowledge of thinking (emphasis mine):
“We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.”
Indeed, my most successful works have focused on ways of enriching training to help systems learn more effectively, or on understanding what it is that systems learn. I’ll discuss in a subsequent post how cognitive science can be (and has been) useful in those areas. But for now, I want to focus on one of the implications that I think AI has absorbed from this lesson — and why I think it is important for cognitive (neuro)science researchers who want to contribute to AI to understand.
I think the bitter lesson has driven AI to become a strongly empirical field. In particular, it has learned to focus on learning-based methods that scale to tackle large-scale, real-world problems.
Of course, AI researchers (such as Sutton) are often interested in theoretical arguments and toy demonstrations in simplified settings. These can be the seed of major changes in the field. However, a principled argument that some aspect of neural anatomy or computation would improve AI will rarely shift the focus of the field without a scaled-up demonstration: not just a toy model on a toy task. As Sutton highlighted, building in knowledge always helps in small-scale experiments — particularly if those experiments are designed precisely to demonstrate the benefits of the principle in question — but in the long run building in knowledge often holds models (and research) back. It’s important to demonstrate that an idea works beyond a narrow, simplified setting.
Yet in my experience, cognitive (neuro)science has mostly focused on studying intelligence within small sets of relatively simplified tasks.
Minimalistic task design
Indeed, it is often a deliberate principle of experiment design in cognitive (neuro)science to focus on as minimal an instantiation of a particular challenge as you possibly can. Rather than studying planning in how someone organizes an event (slow, messy, difficult to control), we might study it in how someone navigates a simple grid environment (fast, very few variables, controlled). Of course, this is a very reasonable choice. There are many benefits to studying behavior in minimalistic settings: it makes it easier to ensure that no other variables are at play, to understand the experiments, to analyze the data, and so on. The field has made these choices for a reason.
But in my view this approach poses a major obstacle to achieving full understanding. It means we often build tasks that test only the capability we’re interested in, in settings where exercising that capability is the only perfect solution. As a result, we don’t really understand how or when that capability gets invoked more generally, or whether that capability is even the right way to describe the larger picture of what the brain does.
To make that more concrete, imagine that we were testing a perfectly adaptive intelligent system. When we gave the system a set of math problems, it would respond just as a calculator would, and we might decide that it has a calculator-like component. When we gave it a navigation problem, perhaps it would take longer to plan longer paths (or paths in settings with a higher branching factor) but still succeed, and we might decide it is using some kind of search. If we tested it on a set of linear regression problems, we might decide that it was using least squares. An adaptive system is like a mirror that reflects the structure we put in front of it; if we engineer a test around some simple structure, we’ll see that structure reflected back. But are any of these the right description of what the system is doing? Without pushing into tasks that step outside these simplified, isolated paradigms, it’s unclear.
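Here is a toy sketch of that mirror effect in the regression case (my own illustration, with made-up data, not drawn from any particular study): a completely generic learner trained by gradient descent on a linear-regression problem ends up matching the closed-form least-squares solution, so a probe confined to that task family cannot distinguish “has a least-squares component” from “is generally adaptive.”

```python
# Toy illustration (made-up data): a generic gradient-descent learner on a
# linear-regression task converges to the closed-form least-squares answer,
# so a probe limited to this task family "sees" least squares either way.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

# The "component" we might infer the system has: closed-form least squares.
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# A generic learner: plain gradient descent on squared error, with no
# least-squares machinery built in.
w = np.zeros(3)
for _ in range(2000):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.05 * grad

print(np.allclose(w, w_ls, atol=1e-3))  # True: the mirror reflects the task's structure
```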
Because of this, we often build up deep understanding of isolated islands in the full space of natural intelligence — the way the brain works in specific regimes like navigating a grid, perceiving Gabor patches, or learning about the (drifting) value of risky gambles — and then try to build up our bigger-picture understanding by extrapolating from these relatively few points in the larger scope of intelligence.
Understanding natural intelligence is hard
I don’t want this to sound too critical of cognitive science and neuroscience. Researchers in those fields are doing amazing work to understand a very complex and difficult system that changes as you interact with it. That’s a very difficult job, and it only makes sense to try to restrict the scope of the problem. Despite the challenges, many scientists are pushing the boundaries by testing ever richer and more naturalistic paradigms, studying more complex interactions between different neural systems across settings, and covering broader swaths of task space. There are plenty of researchers arguing that current paradigms are inadequate for achieving full understanding of the brain, e.g., Nicole Rust’s great new book Elusive Cures: Why Neuroscience Hasn’t Solved Brain Disorders—and How We Can Change That. There have even been long-standing efforts to build cognitive architectures that more completely describe how the pieces fit together and allow simulating the full scope of human cognition. But so far, the progress made in these efforts does not yet seem to me to have the scope needed to really answer the practical challenges of AI.
Fragmentary, abstract understanding meets messy real-world challenges
Thus, I think that the fundamental reason that cognitive science has not contributed more to engineering modern AI systems is that our current understanding of cognition is often too fragmentary or abstract. There’s a lot we understand about the brain and mind in certain contexts, but we don’t understand how the pieces fit together to form the whole of adaptive intelligence at the level of detail that could actually help in practical AI development.
When I think about challenges of AI, I think they tend to occur at the messy places where many constraints intersect. For example, models may struggle to appropriately integrate visual and linguistic information on challenging problems, or may get confused when a domain uses different syntactic structures than usual, or when a problem seems like a classic brainteaser, but isn’t. These challenges seem to me to come from a more holistic failure of the many types of knowledge in the system to work together in just the right way to solve the task.
Of course, that may sound like a fuzzy and ambiguous statement — what couldn’t be interpreted as a failure of the system to work in “just the right way”? But that’s precisely my point: I think the kinds of issues that AI systems often face are the messy problems of how to appropriately represent and integrate many different types of information across many different tasks with very different structures, with as much generalization and as little interference between tasks as possible. Areas where many interacting factors intersect are necessarily hard to understand in simple terms. And because of that, these are precisely the areas where our understanding of how the brain actually works is weakest; certainly not strong enough to apply to building better AI.
A new hope
This might all sound a bit pessimistic. But in truth, I think it’s a great time for cognitive science to both contribute to, and learn from, AI. I think the problems the fields face are increasingly overlapping—as each tries to make sense of complex adaptive systems—and each field can learn from the methods, practices, and principles of the other. I’ll describe those directions in my next few posts.
Thanks to Wilka Carvalho for helpful feedback and comments on this post.



Great read - looking forward to the rest of the series!
This reminded me of this recent-ish essay: https://gershmanlab.com/pubs/NeuroAI_critique.pdf. The core argument here (as I understand it) is that the very notion of biological inspiration for AI is somewhat ill-defined, because characterizing biological principles in the first place requires coming to the problem with a computational framework to make sense of the data: it’s not like the principles are just “there”. The proposal here is that rather than attempting to use neural/cognitive plausibility as a source of design principles, we can instead use it as a tiebreaker between candidate algorithms, under the assumption that algorithms that are actually implemented by biological systems are generally better.
As a separate but related note: even if we did have detailed, integrative, computational theories of complex cognitive phenomena, it’s not obvious that we should strive to integrate those insights into AI systems, because the constraints imposed on biological intelligence and AI systems are often quite different. For example, if a particular cognitive phenomenon is a byproduct of the fact that we humans have limited working memory, it’s not clear why we would try to bake that into AI, which is not bound by those same limitations. For some domains, the core problem being solved by both humans and AI may be similar enough that the solutions may be transferable (and that doing so could be beneficial), but for others this seems less obvious.
I think an important problem with the field of NeuroAI is that strong claims are too often made based on extremely weak evidence. For instance, you write: "language models can predict human imaging data remarkably well", but it is important to be clear about what the authors you cite found. They reported that models could account for nearly 100% of the explainable variance, but the explainable variance was very low (in most cases between 4% and 10%), and a similar amount of variance was observed in non-language areas, raising questions about how to interpret the findings. And you write that cognitive scientists have suggested that AI progress refutes one of the dominant linguistic paradigms. Again, that is true, but the evidence for this claim is extremely weak, as reviewed here: https://psycnet.apa.org/record/2026-83323-001. Do you think the papers you cite make a strong case for ANN-human alignment, and challenge the role of innate priors? I think the field could use a bit more scepticism.