What ‘Learning’ Actually Means for a Neural Network

When a neural network learns something, it doesn’t store a fact the way a database stores a row. It adjusts the numerical weights on millions (or billions) of connections between artificial neurons. Feed the network enough examples of cats, and it doesn’t build a folder labeled ‘cat.’ Instead, the statistical patterns of cat-ness get distributed across the network’s weights, subtly encoded in the relationships between nodes. There’s no single place you can point to and say ‘that’s where it keeps cats.’

This is why training is fast in a way that feels almost offensive. A model like GPT-4 was trained on a corpus representing a significant fraction of human written output, and the training run took weeks on thousands of specialized chips running in parallel. A human reading at a comfortable pace would need tens of thousands of years to process equivalent text. The parallelism is the key: the model updates all its weights simultaneously from each batch of data, whereas a human brain learns sequentially, building on prior experience in real time, consolidating memories during sleep, pruning and strengthening connections over weeks.

The speed comes at a structural cost that most people don’t think about until it bites them.

Catastrophic Forgetting Is the Actual Term Researchers Use

The problem has a name: catastrophic forgetting. It was formally described by researchers Michael McCloskey and Neal Cohen in 1989, and the name is not an exaggeration. When you take a trained neural network and continue training it on new data, it doesn’t gently update its knowledge. The gradient descent process (the mathematical method that adjusts weights to reduce errors) treats the old weights as just numbers to be overwritten. The new training signal competes directly with the patterns that encode everything the model previously learned, and in most naive training setups, the new signal wins.
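The overwrite dynamic is easy to reproduce at toy scale. The sketch below uses plain NumPy and two made-up linear-regression ‘tasks’ (a stand-in for real training objectives): it trains a small model on task A, then continues training naively on task B, and measures how task A performance collapses.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(true_w):
    """Generate a toy linear-regression task y = X @ true_w."""
    X = rng.normal(size=(200, 4))
    return X, X @ true_w

def train(w, X, y, lr=0.05, steps=300):
    """Plain gradient descent on mean squared error."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Task A and task B want *different* weights in the same shared parameter space.
Xa, ya = make_task(np.array([1.0, -2.0, 0.5, 3.0]))
Xb, yb = make_task(np.array([-3.0, 1.0, 2.0, -0.5]))

w = np.zeros(4)
w = train(w, Xa, ya)
loss_a_before = mse(w, Xa, ya)   # near zero: task A is learned

w = train(w, Xb, yb)             # naive continued training on task B
loss_a_after = mse(w, Xa, ya)    # task A performance collapses

print(f"task A loss before: {loss_a_before:.4f}, after: {loss_a_after:.4f}")
```

Nothing in the update rule knows that the old weights encoded anything; gradient descent simply drags every parameter toward whatever reduces the current batch’s error.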

Imagine if every time you learned a new language, you forgot your native one. That’s roughly the situation. A model fine-tuned on medical literature might perform worse on general reasoning questions it could answer before. A model updated with data from 2024 might lose some of its 2020-era knowledge unless the training is done very carefully. The weights that encode old knowledge and new knowledge exist in the same shared space with no separation.

This is fundamentally different from how biological memory works. Human brains appear to use a process called complementary learning systems (the hippocampus handles rapid new learning, the neocortex handles slow consolidation) that largely prevents new experiences from immediately overwriting older ones. Evolution spent a long time solving catastrophic forgetting. Neural networks, as typically trained, haven’t solved it.

[Figure: diagram of catastrophic forgetting, where new training data overwrites existing learned patterns in a neural network’s weights]
Catastrophic forgetting isn't a metaphor researchers use loosely. New gradient updates compete directly with the weights that encode prior knowledge, and in naive fine-tuning, the new signal wins.

Why You Can’t Just Append New Knowledge

The intuitive fix seems obvious: keep the old weights frozen and just add new ones on top. The research area devoted to this problem is called continual learning (or lifelong learning), and researchers have been working on it seriously for years. A few approaches have made real progress.

Elastic Weight Consolidation (EWC), proposed by researchers at DeepMind in 2017, tries to identify which weights are most important for old tasks and penalizes changes to those weights during new training. Progressive Neural Networks dedicate entirely separate columns of neurons to new tasks while keeping old columns frozen and allowing lateral connections between them. These work, but they come with real costs: more parameters, more complexity, and the thorny question of how you decide which old knowledge matters enough to protect.
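The EWC idea can be sketched with the same kind of toy linear-regression tasks. Everything here is illustrative, not the paper’s setup: the data is synthetic, the per-weight importance estimate is the Fisher information diagonal for a linear-Gaussian model, and the penalty strength `lam` is hand-picked. The key piece is the quadratic penalty (lam/2) · Σᵢ Fᵢ(wᵢ − w*ᵢ)² pulling important weights back toward their old values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy tasks with conflicting optimal weights (hypothetical stand-ins
# for an "old" task and a "new" task sharing one parameter space).
Xa = rng.normal(size=(200, 4)); ya = Xa @ np.array([1.0, -2.0, 0.5, 3.0])
Xb = rng.normal(size=(200, 4)); yb = Xb @ np.array([-3.0, 1.0, 2.0, -0.5])

def grad_mse(w, X, y):
    return 2 * X.T @ (X @ w - y) / len(y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Step 1: learn the old task and snapshot its weights.
w_star = np.zeros(4)
for _ in range(300):
    w_star -= 0.05 * grad_mse(w_star, Xa, ya)

# Step 2: estimate per-weight importance. For a linear-Gaussian model the
# Fisher information diagonal is proportional to the mean squared inputs.
fisher = np.mean(Xa ** 2, axis=0)

# Step 3: train on the new task, once naively, once with the EWC penalty
# gradient lam * F_i * (w_i - w*_i) anchoring the important weights.
lam = 20.0
w_naive, w_ewc = w_star.copy(), w_star.copy()
for _ in range(300):
    w_naive -= 0.05 * grad_mse(w_naive, Xb, yb)
    w_ewc -= 0.05 * (grad_mse(w_ewc, Xb, yb) + lam * fisher * (w_ewc - w_star))

print(f"old-task loss, naive: {mse(w_naive, Xa, ya):.2f}, "
      f"with EWC penalty: {mse(w_ewc, Xa, ya):.2f}")
```

The trade-off is visible even in this toy: the penalty preserves old-task performance at the cost of fitting the new task less tightly, which is exactly the "which old knowledge matters enough to protect" question in miniature.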

The approach most major AI labs use in practice is simpler and more brutal: when you want to update a model, you train a new one from scratch on a combined dataset that includes both old and new information. GPT-3 wasn’t updated to become GPT-4. GPT-4 was trained fresh. This is expensive (training large models costs tens of millions of dollars in compute) but it sidesteps catastrophic forgetting entirely by never asking old and new knowledge to coexist in the same training run.

The irony is that adding more training data can sometimes make models worse for related reasons: the composition and balance of a training dataset matter as much as its size, and naively mixing old and new data doesn’t guarantee the model will weight them appropriately.

The Human Comparison Is Instructive But Misleading in One Direction

It’s tempting to frame this as AI being inferior to human memory, but that framing misses what’s actually interesting. Human memory is not a gold standard. It’s reconstructive, error-prone, and subject to interference from similar memories. You misremember things your neural network counterpart would never misremember. You forget names, dates, and procedures constantly. The brain’s resistance to catastrophic forgetting comes bundled with a whole suite of other limitations.

What neural networks trade away is incremental updatability. What they gain is the ability to absorb patterns across enormous datasets simultaneously, with no fatigue, no attention lapses, and no sleep requirement. The architecture is genuinely suited to the problem of learning a fixed distribution of data very thoroughly. It’s poorly suited to the problem of continuously learning from a stream of new information without forgetting old information, which happens to be what real-world deployment usually requires.

This gap matters in practice. A customer service AI trained on your product documentation needs to be retrained or fine-tuned every time the product changes significantly. A medical AI needs retraining when treatment guidelines update. The retraining cycle is a real operational constraint, not a minor inconvenience.

The Gap Is Closing, But It’s Not Closed

Retrieval-augmented generation (RAG) sidesteps some of this by keeping a separate, updatable knowledge store that the model queries at inference time rather than baking all knowledge into the weights. Ask a RAG-based system a question and it searches a document database first, then uses the model’s reasoning capability to synthesize an answer from what it finds. The model’s weights stay static; the knowledge store gets updated independently. It’s a practical engineering workaround that works well for factual knowledge but doesn’t help with tasks that require the model to have truly internalized new patterns.
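Conceptually, the pipeline is retrieve-then-generate. The sketch below is a deliberately crude illustration: a hypothetical document store, keyword-overlap scoring standing in for a real vector index, and a string template standing in for the frozen model. The point it demonstrates is the division of labor: updating knowledge means editing the store, never the model.

```python
from collections import Counter

# Hypothetical knowledge store: updatable independently of the model.
DOCS = [
    "The Atlas widget ships with firmware 2.3 as of June 2024.",
    "Refunds are processed within 14 business days.",
    "The Atlas widget supports Bluetooth 5.0 and USB-C charging.",
]

def score(query, doc):
    """Crude relevance: count overlapping lowercase word tokens."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query, k=2):
    """Return the k highest-scoring documents for the query."""
    return sorted(DOCS, key=lambda doc: score(query, doc), reverse=True)[:k]

def generate(query, context):
    """Stand-in for the frozen model: just assembles the prompt it would see."""
    return f"Q: {query}\nContext: {' '.join(context)}\nA: ..."

query = "What firmware does the Atlas widget ship with?"
print(generate(query, retrieve(query)))
```

A real deployment swaps in embedding-based retrieval and an actual LLM call, but the shape is the same: the weights stay static, and fresh facts enter through `DOCS`.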

Adapter layers and low-rank adaptation (LoRA) techniques allow targeted fine-tuning that touches only a small subset of a model’s parameters, reducing the blast radius of catastrophic forgetting. These are genuinely useful advances. But they’re mitigations, not solutions to the underlying structural issue.
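The core of LoRA is compact enough to sketch: instead of updating a frozen weight matrix W, learn a low-rank correction B @ A and add it to the frozen path. The dimensions and the alpha scaling below are illustrative choices, not values from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 64, 4          # r << d is the low-rank bottleneck
alpha = 8.0           # LoRA scaling hyperparameter

W = rng.normal(size=(d, d))          # pretrained weight: stays frozen
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized
                                     # so the adapter starts as an exact no-op

def forward(x):
    """Adapted layer: frozen path plus the low-rank update (alpha/r) * B @ A."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d)
print(np.allclose(forward(x), W @ x))   # True: zero-init B means no change yet

# Fine-tuning updates only A and B: 2 * r * d = 512 trainable parameters
# instead of d * d = 4096 for the full matrix, and the gap widens with layer
# size. For inference the update can be merged: W + (alpha / r) * B @ A.
```

Because the original W is never touched, the damage a fine-tune can do is confined to a low-rank subspace, which is precisely the "reduced blast radius" described above.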

The honest position is that current AI systems are not general-purpose learning machines in the way the popular framing suggests. They are extremely capable pattern-matchers that learn extraordinarily well under specific controlled conditions, then get frozen. Updating them is less like a human learning something new and more like replacing a component. The speed of learning and the fragility of accumulated knowledge are two sides of the same architectural coin, and building systems that get both right remains one of the more interesting open problems in the field.