The most impressive demonstrations of artificial intelligence share a curious pattern: they all happen on screens. ChatGPT writes poetry, Midjourney conjures photorealistic images, AlphaFold predicts protein structures. But ask the most advanced AI system to fold a fitted sheet, and you will wait a very long time.
This is not a trivial observation. The gap between linguistic sophistication and physical competence reveals something fundamental about what today's AI actually is — and what it decidedly is not. We have built machines that can discuss thermodynamics eloquently but cannot reliably distinguish a coffee mug from a soup bowl in an unfamiliar kitchen.
The knowledge that cannot be written down
Humans possess what philosophers call tacit knowledge — the kind of understanding that resists explicit articulation. You know how to catch a ball, but try writing instructions precise enough for someone who has never seen one thrown. You know that eggs are fragile and pillows are not, that ice is slippery and sandpaper is rough, that a child running toward a street demands immediate attention.
This knowledge comes from years of embodied experience: dropping things, stubbing toes, burning fingers, misjudging distances. Large language models, by contrast, learn almost entirely from text. They know that "glass breaks" because that phrase appears in their training data, but they have never heard the sound, never felt the sudden absence of weight as a cup slips, never swept up the shards.
The result is a peculiar kind of intelligence: vast in breadth but shallow in grounding. These systems can explain the physics of projectile motion but struggle with questions a five-year-old handles effortlessly. Which weighs more: a pound of feathers or a pound of steel? They know the answer. But ask which is harder to carry up a flight of stairs, where the answer turns on bulk and grip rather than weight, and the reasoning becomes uncertain.
Why robotics lags so far behind
The contrast with language is instructive. Text is abundant, structured, and cheap to process. The internet contains trillions of words, and each one is already digitized. Physical experience is none of these things. A robot learning to grasp objects must actually grasp them — slowly, expensively, one at a time. There is no shortcut through Wikipedia.
This explains why autonomous vehicles, despite billions in investment, still struggle with edge cases that human drivers handle instinctively. A plastic bag blowing across the road, a child's ball rolling from between parked cars, the subtle body language of a pedestrian about to step off a curb — these situations require the kind of intuitive physics and social reasoning that emerges from living in a body, among other bodies, for years.
The companies building household robots face an even harder problem. Kitchens are not standardized. Drawers stick. Handles vary. The same task — making a sandwich — looks completely different in every home. Humans adapt without thinking; machines must be taught each variation explicitly, or learn to generalize from principles they do not truly possess.
The hype gap
None of this diminishes what language models have achieved. They are remarkable tools for writing, summarizing, translating, and brainstorming. They can accelerate research, democratize expertise, and handle routine cognitive tasks with impressive fluency. But the breathless predictions of imminent artificial general intelligence tend to ignore this embodiment problem entirely.
True general intelligence — the kind that could navigate a novel environment, improvise with unfamiliar tools, or care for a child — requires understanding the world, not just describing it. That understanding comes from interaction, from consequence, from the slow accumulation of physical intuition that begins in infancy and never really stops.
Our take
The current AI moment is genuinely transformative, but the transformation is narrower than the marketing suggests. We have built extraordinary language machines, and we should use them for what they are good at. But the next time someone promises that artificial general intelligence is five years away, ask them a simple question: can it pack a suitcase for a weekend trip? The answer will be revealing.




