The term "neural network" conjures images of silicon mimicking gray matter, of machines learning the way children do. It is a seductive metaphor and a misleading one. While the mathematical structures underlying today's large language models and image generators do take loose inspiration from the architecture of biological neurons, the comparison flatters the technology far more than it illuminates it. Understanding where the analogy breaks down reveals why AI can translate a thousand languages yet stumble on physical common sense that a toddler handles effortlessly, why it can write sonnets but cannot want anything at all.
The biological inspiration, circa 1943
The foundational idea dates to a paper by Warren McCulloch and Walter Pitts, who proposed that simple computational units—artificial neurons—could be wired together to perform logical operations. Each unit receives inputs, applies a threshold, and fires or stays silent. The architecture mimicked, in the crudest sense, the all-or-nothing signaling of real neurons. By the late twentieth century, researchers had stacked these units into layers, added adjustable connection weights, and discovered that such networks could learn patterns from data by tuning those weights through repeated exposure. The brain, after all, also adjusts synaptic strengths through experience. The parallel seemed profound.
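To make the threshold idea concrete, here is a minimal sketch in Python. The function names and the toy AND task are illustrative assumptions, not anything taken from the McCulloch and Pitts paper, and the weight-tuning step uses the classic perceptron rule as a simple stand-in for the more elaborate training methods later networks relied on.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (1) if the weighted sum of inputs reaches the threshold, else stay silent (0)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Same unit, same weights, different thresholds: one behaves as AND, the other as OR.
def logical_and(a, b):
    return threshold_unit((a, b), (1, 1), threshold=2)


def logical_or(a, b):
    return threshold_unit((a, b), (1, 1), threshold=1)


def train_perceptron(examples, epochs=20, learning_rate=1):
    """Tune weights and a bias through repeated exposure to labeled examples.

    Assumption: this is the perceptron update rule, chosen for brevity.
    The unit fires when weighted_sum + bias >= 0, i.e. threshold = -bias.
    """
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            output = threshold_unit(inputs, weights, threshold=-bias)
            error = target - output  # +1, 0, or -1
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Toy task (hypothetical): learn AND from its four-row truth table.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
learned_weights, learned_bias = train_perceptron(AND_EXAMPLES)

for inputs, target in AND_EXAMPLES:
    prediction = threshold_unit(inputs, learned_weights, threshold=-learned_bias)
    print(inputs, "predicted:", prediction, "target:", target)
```

The hand-wired units show logic emerging from nothing more than a sum and a cutoff; the trained unit reaches the same behavior without anyone choosing its weights. Nothing in either version resembles a dendrite.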
But biological neurons are not simple threshold gates. They integrate chemical signals across dendrites of staggering complexity, modulate their firing rates dynamically, and participate in feedback loops involving glia, blood flow, and neuromodulators that have no counterpart in a GPU. A single cortical neuron may form synapses with some ten thousand others, and those synapses change with use. A unit in a large language model may have thousands or tens of thousands of incoming connections (the internals of GPT-4 have not been published), but each connection is a single number that stays fixed once training ends. The brain learns continuously, in context, largely without catastrophic forgetting. Artificial networks, by contrast, require carefully curated datasets, immense compute, and architectural tricks to avoid overwriting prior knowledge when exposed to new information.
What the metaphor obscures
The deeper divergence lies in how these systems represent the world. Biological intelligence builds rich, embodied models: a child learns "hot" by touching a stove, "heavy" by lifting a stone, "dog" by seeing fur and hearing barks and feeling wet noses. These experiences are grounded in sensory reality and bound together by causal understanding. An artificial network, by contrast, learns statistical associations between tokens in text or pixels in images. It has no body, no sensory experience, no causal model. It knows that "ice" and "melt" co-occur in certain linguistic contexts, but it does not know that ice is a phase of water or that melting is a phase transition driven by heat. It can produce fluent text about thermodynamics without possessing any mental model of temperature.
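A deliberately crude sketch shows what a purely statistical association looks like. The four-sentence corpus and the function name below are made up for illustration, and real language models learn far richer representations than a table of pair counts; the point is only that such a table records which words keep company, not why.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus; any resemblance to real training data is accidental.
corpus = [
    "ice will melt in the sun",
    "the sun made the ice melt",
    "melt the butter in a pan",
    "the stone sat in the sun",
]


def cooccurrence_counts(sentences):
    """Count how many sentences contain each unordered pair of distinct words."""
    counts = Counter()
    for sentence in sentences:
        # Sorting gives every pair one canonical (alphabetical) key.
        words = sorted(set(sentence.split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


counts = cooccurrence_counts(corpus)
print("ice / melt:  ", counts[("ice", "melt")])
print("melt / stone:", counts[("melt", "stone")])
print("stone / sun: ", counts[("stone", "sun")])
```

The table links "ice" to "melt" more strongly than "stone" to "melt", which is enough to complete many sentences correctly. It holds no entry for heat, water, or cause and effect.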
This is why AI can pass bar exams yet fail at common-sense reasoning tasks that a toddler handles effortlessly. The statistical patterns in legal text are dense and predictable; the implicit physics and social reasoning that underlie everyday life are not well-captured by word co-occurrence alone. The system has learned a high-dimensional map of language, not a map of reality. When the two align—when linguistic patterns reliably encode real-world structure—the AI appears intelligent. When they diverge, it produces confident nonsense.
Our take
The neural network metaphor has been commercially and scientifically productive, but it has also bred confusion about what these systems are and what they might become. They are not digital brains; they are sophisticated pattern-matching engines trained on human cultural output. Recognizing this does not diminish their utility—pattern matching at sufficient scale is extraordinarily powerful—but it does clarify their limits. The path to artificial general intelligence, if it exists, likely requires more than scaling up today's architectures. It may require something closer to the embodied, causal, continually learning systems that biology spent half a billion years refining. Until then, we are left with tools that can mimic understanding without possessing it, a useful trick but not the same thing as thought.




