Ground Truth: Geotypical vs. geospecific world models: How they differ and why they matter
Much of today’s AI progress has been driven by large language models trained on text. But as research expands beyond words, a new class of systems is emerging — ones that can understand and interact with the physical world itself. These are known as world models, and they’re opening new possibilities for how both people and robots perceive and navigate reality.
At Niantic Spatial, Google DeepMind, NVIDIA, and others, world models are being developed to bridge the gap between digital intelligence and the real world.
Within this growing field, two main approaches have taken shape: geotypical and geospecific models. Though their names sound alike, they represent fundamentally different ways of modeling reality. Geotypical models learn the rules of how the world behaves, while geospecific models capture the details of where things are. Together, they form the foundation for the next generation of intelligent, embodied systems.
Geotypical and geospecific models solve different problems
The same kind of raw imagery, such as videos, photos, or 3D scans, can be used to build two very different models:
- Geotypical models capture the rules of physics and cause-and-effect in the world. They’re trained on video and other sensor data, learning not only what objects look like, but also how they behave over time. For example, they learn how laundry folds, how liquids pour, or how one object blocks another from moving. In that sense, they are like observers of the laws of nature, and they can then be used to simulate those rules in new scenarios. They don’t focus on where something is in the real world, but rather on what is possible, what happens next, and how actions ripple forward in time. World Labs and Google Genie 3 are examples of this approach.
- Geospecific models, by contrast, capture the specifics of a real place. Think of them as precise maps that describe the physical layout of streets, buildings, or interiors. They ground a robot or application in the actual world as it is. Examples of geospecific models include Niantic Spatial and Google Maps.
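The distinction above can be sketched as two tiny interfaces. Everything here is illustrative: the class names, the toy "drop" dynamics, and the landmark coordinates are assumptions for the sketch, not any real library. A geotypical model answers "what happens next?", while a geospecific model answers "where is it?"

```python
from dataclasses import dataclass

@dataclass
class State:
    """A snapshot of a scene: object names mapped to (x, y, z) positions."""
    objects: dict[str, tuple[float, float, float]]

class GeotypicalModel:
    """Learns dynamics: given a state and an action, predict the next state."""
    def predict(self, state: State, action: str) -> State:
        # A trivial stand-in for a learned dynamics model: "drop <name>"
        # moves that object down to the floor (z = 0), as gravity would.
        nxt = dict(state.objects)
        if action.startswith("drop "):
            name = action.removeprefix("drop ")
            x, y, _ = nxt[name]
            nxt[name] = (x, y, 0.0)
        return State(objects=nxt)

class GeospecificModel:
    """Stores where things are in one real place; answers location queries."""
    def __init__(self, landmarks: dict[str, tuple[float, float, float]]):
        self.landmarks = landmarks
    def locate(self, name: str) -> tuple[float, float, float]:
        return self.landmarks[name]

# The geotypical model simulates what happens next...
dynamics = GeotypicalModel()
after = dynamics.predict(State({"cup": (1.0, 2.0, 0.9)}), "drop cup")
print(after.objects["cup"])  # (1.0, 2.0, 0.0)

# ...while the geospecific model grounds the robot in one real place.
factory = GeospecificModel({"charging dock": (12.5, 3.0, 0.0)})
print(factory.locate("charging dock"))  # (12.5, 3.0, 0.0)
```

The point of the sketch is the shape of the two APIs: one takes a state and an action and rolls time forward; the other takes a name and returns a position in a specific place.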
It’s tempting to think of one as a stepping stone to the other, but in practice they’re separate beasts. A geotypical model doesn’t naturally become geospecific, and vice versa; they follow very different development paths. It’s technically possible that future research could bridge the gap, but for now my guess is that each will keep evolving along its own trajectory.
How robots use world models
Robots provide a practical way to demonstrate how these models complement each other. The type of model a robot needs depends on the tasks it’s expected to perform.
1. Simple robots (fixed, repetitive tasks)
A single-purpose robot folding the same garment repeatedly doesn’t need either kind of model. It just executes its programmed routine.
2. Moderately complex robots (variation without travel)
A stationary laundry-folding robot that handles different fabrics and sizes, and works around humans, benefits from a geotypical model. By observing and simulating how clothes fold, how objects support each other, and how its limbs interact with the environment, it can generalize to new but similar tasks. Because it doesn’t travel, it doesn’t need a geospecific map.
3. Advanced robots (humanoids in structured spaces)
Now imagine a humanoid robot that performs many tasks on a factory floor, including jobs it hasn’t done before. To keep costs low, it doesn’t carry top-of-the-line sensors or massive onboard computing. Here things get more interesting. Such a robot would definitely need a geotypical model to learn the rules of physics (how things move, how they collide, how they can be manipulated) and to simulate new tasks. It also benefits from a geospecific model of the factory itself. Why? Because the geospecific map provides a baseline of the environment: the fixed layout of the floor, walls, and equipment, often called a prior map.
When the robot takes in live sensor data, it can compare it against this prior map. The differences highlight what it really needs to pay attention to: moving humans, machines in operation, or other robots. The geospecific model tells it where things are supposed to be, while the geotypical model helps it simulate how things might behave.
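One way to picture that comparison is a minimal sketch in which both the prior map and the live scan are 2D occupancy grids, and the robot flags cells that are occupied now but free in the prior map. The grids, the function name, and the scenario are all illustrative assumptions, not a real robotics pipeline:

```python
# 1 = occupied (wall, fixed equipment, or, in the live scan, anything sensed).
PriorMap = list[list[int]]

def novel_cells(prior: PriorMap, live: PriorMap) -> list[tuple[int, int]]:
    """Return cells occupied in the live scan but free in the prior map.

    These differences are what the robot should attend to: moving humans,
    machines in operation, or other robots.
    """
    return [
        (r, c)
        for r, row in enumerate(live)
        for c, occupied in enumerate(row)
        if occupied and not prior[r][c]
    ]

# Prior map: a small room with fixed walls around the edge.
prior = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
# Live scan: same walls, plus something new at (1, 2), perhaps a person.
live = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(novel_cells(prior, live))  # [(1, 2)]
```

The walls match the prior map, so they are ignored; only the unexpected occupied cell is surfaced, which is exactly the "what really needs attention" filtering described above.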
Drawing an analogy from self-driving vehicles, such a robot could be designed in one of two modes:
- “Waymo mode” relies on a prior geospecific map of the roads.
- “Tesla mode” relies on real-time perception, skipping detailed prior maps.
A factory robot could be designed in either mode, but as complexity grows, the combination of geotypical and geospecific becomes more powerful.
4. General-purpose robots (navigating the real world)
At the most complex end, a humanoid robot that travels outdoors and handles almost any request will almost certainly need both. Geotypical models give it an understanding of how the world works under the rules of physics, while geospecific models ground it in where things are in the real world.
Robots and maps exist on spectrums
It’s important to remember that these categories aren’t rigid boxes. Robots exist on a continuous spectrum of complexity, and the “right” combination of geotypical and geospecific modeling depends heavily on the use case, available hardware, and the philosophy of the robot’s creator.
- Some designs lean heavily on perception (like Tesla’s cars), relying on real-time sensor input and skipping detailed prior maps.
- Others lean on prior maps (like Waymo’s approach), using detailed geospecific data to simplify the perception problem.
- Most fall somewhere in between, balancing the two depending on the tradeoff between cost, performance, and safety.
And just as robots sit on a spectrum, so do the maps themselves. Not every geospecific model is a full 3D “digital twin” of the world; they range from:
- Simple vector maps for route planning.
- Basic SLAM maps (Simultaneous Localization and Mapping), built from a robot’s own sensors for orientation.
- Rich geospecific models capturing both geometry and semantics of the real world.
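To make the simplest end of that spectrum concrete: a “simple vector map” can be as little as a graph of intersections, and breadth-first search over it already yields a route. All the place names below are made up for illustration:

```python
from collections import deque

# A minimal vector map: intersections and the streets connecting them.
street_graph = {
    "depot":     ["main&1st"],
    "main&1st":  ["depot", "main&2nd", "oak&1st"],
    "main&2nd":  ["main&1st", "oak&2nd"],
    "oak&1st":   ["main&1st", "oak&2nd"],
    "oak&2nd":   ["main&2nd", "oak&1st", "warehouse"],
    "warehouse": ["oak&2nd"],
}

def shortest_route(graph, start, goal):
    """Breadth-first search: the route crossing the fewest intersections."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # goal unreachable from start

print(shortest_route(street_graph, "depot", "warehouse"))
```

A SLAM map or a rich geometric-and-semantic model carries far more detail, but for pure route planning this graph-level abstraction is often all a system needs.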
Conclusion
World models are evolving along two distinct but complementary paths. One explores the universal principles that govern how things move and interact; the other captures the specificity of real places and objects.
Both approaches are powerful. As robots and intelligent systems become more capable, these approaches will increasingly work in concert — combining understanding with grounding, simulation with situational awareness. Together, they’re shaping the bridge between digital intelligence and the physical world.
Bobby Parikh is the Senior Vice President of Engineering at Niantic Spatial.