There's a word the AI industry uses constantly: learning.
We say models "learn" from data. We call it "machine learning." We talk about what the model "learned" during training. The language feels natural because it borrows from human experience. But here's the uncomfortable question: is anything actually being learned? Or are we using a human word to describe something fundamentally different?
This isn't pedantry. The distinction shapes how we trust, deploy, and regulate these systems.
The Machinery Is Math, Not Discovery
When a neural network trains, data flows through a mathematical architecture—attention mechanisms, matrix multiplications, activation functions. This architecture wasn't learned. It was designed by humans. The logic of how to process information is baked in from the first line of code.
What changes during training are the weights—billions of numerical parameters adjusted to minimise prediction errors. But those weights are essentially compressing patterns from data that already contains human reasoning.
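To make that concrete, here is a minimal sketch in plain Python/NumPy, using an invented toy regression task rather than a real model: the forward pass and the update rule are written by a human and never change; the only thing that moves is the weight vector.

```python
# Minimal sketch of "training": the architecture (a linear forward pass) and
# the update rule (gradient descent on squared error) are fixed by the
# programmer. Only the weights change. The toy data here is invented.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))                 # toy inputs
true_w = rng.normal(size=8)
y = X @ true_w + 0.1 * rng.normal(size=256)   # targets produced by a hidden rule

w = np.zeros(8)                               # the only thing that "learns"
lr = 0.01
for step in range(500):
    pred = X @ w                              # human-designed forward pass
    grad = 2 * X.T @ (pred - y) / len(y)      # human-derived calculus
    w -= lr * grad                            # statistical optimisation

print("final squared error:", float(np.mean((X @ w - y) ** 2)))
```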
Consider: an LLM trained on the internet didn't independently discover that Paris is in France. It absorbed that association from text written by humans who learned it through experience, literature and visual cues. But Paris, for humans, is not a vector embedding. The reasoning structures in the training data came from human minds; the model compressed and redistributed inference that had already been done.
Recent research from Wang et al. at UCSB examined this directly, tracing LLM capabilities back to their pre-training data. Their findings? Factual question answering shows the strongest memorisation effect, while tasks like translation and mathematical reasoning show greater generalisation—producing outputs that differ more from the training distribution. The implication: what looks like "learning" in knowledge-intensive tasks is often sophisticated pattern matching against what the model has seen.
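A crude way to build intuition for the memorisation end of that spectrum (a toy illustration only, not the tracing method Wang et al. actually use) is to ask how much of a model's output already appears verbatim in its training text, for example via n-gram overlap:

```python
# Toy n-gram overlap check: what fraction of the n-grams in a model's output
# also appear in the training corpus? High overlap suggests memorisation-like
# behaviour; low overlap means the output departs from the training text.
# The corpus and output strings below are invented examples.
def ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(output: str, corpus: str, n: int = 3) -> float:
    out = ngrams(output, n)
    return len(out & ngrams(corpus, n)) / len(out) if out else 0.0

corpus = "paris is the capital of france and lies on the seine"
print(overlap_ratio("paris is the capital of france", corpus))    # ~1.0: memorised
print(overlap_ratio("the louvre sits beside the river", corpus))  # 0.0: novel phrasing
```

The paper's actual analysis works at the level of distributions over the pre-training corpus rather than literal string overlap, but the intuition carries over: the closer an output sits to what was already written, the less "learning" there is to explain.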
So what we call "learning" is really: human-designed math + compressed human knowledge + statistical optimisation.
That's not nothing. But it's not what we mean when we say a child learns to speak or a scientist learns how the universe works.
The Missing Ingredient: A Learner
Here's what separates biological learning from machine training: consciousness.
When a child struggles with a puzzle, there's frustration. When they solve it, there's satisfaction. The learning is embedded in lived experience. Someone is there, experiencing the process of understanding.
Models have no experiencer. When weights update, nothing feels itself getting better. There's no "aha" moment. No felt sense of confusion resolving into clarity. The math changes. That's it.
Philosopher John Searle made this point forty-five years ago with his Chinese Room thought experiment. The argument holds that a computer executing a program cannot have understanding or consciousness, regardless of how intelligent or human-like its behaviour appears. A person in a room following rules to manipulate Chinese symbols can produce perfect responses without understanding a word. The processing is happening; the understanding isn't.
As Searle put it: genuine thought and understanding require something more than mere computation. In understanding a language we do not merely manipulate symbols based on their formal properties—we do something in addition in virtue of which we actually understand the meaning of the symbols.
Learning without a learner isn't learning. It's transformation. It's computation. Calling it "learning" smuggles in an assumption that there's someone home.
Why This Matters Beyond Philosophy
This isn't academic nitpicking. The language shapes how we build and deploy these systems.
If models "learn," we might trust them to reason in novel situations. If models compress and interpolate existing human reasoning, we'd be more cautious about where that reasoning breaks down.
Research on grokking—where a model, after training for far longer than seems necessary, suddenly flips from memorising its training data to correctly generalising on unseen inputs—shows the boundary is real but fragile. The phenomenon has sparked interest in pinning down when a model is parroting memorised information and when it is doing something richer.
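The classic grokking experiment is small enough to sketch: train a tiny network on modular addition with half the input pairs held out, keep training long after it has memorised the training split, and watch whether test accuracy eventually jumps. The PyTorch sketch below uses illustrative hyperparameters not taken from any particular paper; in published runs the flip typically takes tens of thousands of steps and is sensitive to weight decay, so treat this as the shape of the experiment, not a guaranteed reproduction.

```python
# Sketch of the grokking setup: modular addition, half the pairs held out,
# heavy weight decay. Train far past perfect training accuracy and log test
# accuracy; in published runs it eventually flips from chance to near-perfect.
import torch
import torch.nn as nn

P = 97
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))   # all (a, b) pairs
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train_idx, test_idx = perm[: len(perm) // 2], perm[len(perm) // 2:]

class AddMod(nn.Module):
    def __init__(self, p: int = P, d: int = 128):
        super().__init__()
        self.emb = nn.Embedding(p, d)
        self.mlp = nn.Sequential(nn.Linear(2 * d, 256), nn.ReLU(), nn.Linear(256, p))

    def forward(self, x):                      # x: (batch, 2) integer operands
        return self.mlp(self.emb(x).flatten(1))

model = AddMod()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        return (model(pairs[idx]).argmax(-1) == labels[idx]).float().mean().item()

for step in range(50_000):                     # grokking needs very long training
    opt.zero_grad()
    loss = loss_fn(model(pairs[train_idx]), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1_000 == 0:
        print(step, "train acc", accuracy(train_idx), "test acc", accuracy(test_idx))
```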
Blurring this line serves the industry well. "The model learned" sounds more impressive than "we optimised a statistical function over human-generated data." The anthropomorphism isn't accidental—it's marketing that shapes expectations, investment, and regulation.
The Honest Framing
We have built extraordinarily sophisticated mirrors. They reflect human reasoning back at us with remarkable fidelity. They can recombine patterns in ways that feel creative, sometimes even surprising.
But mirrors don't learn. They reflect.
Until there's a subject doing the learning—an experiencer, a consciousness, a life being lived—calling it "learning" is a metaphor we've mistaken for a fact.
Models compute. Living beings learn.
The distinction matters more than the industry wants to admit. And the sooner we're honest about it, the better we'll be at understanding what these systems can—and can't—actually do.
References
Wang, X. et al. (2024). "Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data." ICLR 2025. https://arxiv.org/abs/2407.14985
Searle, J. (1980). "Minds, Brains, and Programs." Behavioral and Brain Sciences, 3(3), 417-424.
Google PAIR. "Do Machine Learning Models Memorize or Generalize?" https://pair.withgoogle.com/explorables/grokking/
Cole, D. (2023). "The Chinese Room Argument." Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/chinese-room/
