Genie 3
DeepMind's new world model turns a sentence into a place you can walk through — and the breakthrough isn't the picture, it's the persistence.
Type a description, such as 'a snow-covered village at dusk', and Genie 3 renders not an image but an environment: a navigable scene you can move through in real time. Earlier systems could generate a frame; this generates a world, and keeps it consistent as you explore it.
That consistency is the hard part. Walk behind a building and come back, and the building is still there. Objects obey rough physics. The model has to keep track of a place it invented on the fly, which is a much harder problem than producing one plausible frame after another.
The reason it matters beyond the spectacle: a model that can simulate a consistent world is a model an agent can practise in. Robotics and game-playing systems are bottlenecked on data and on safe places to fail. A good-enough world model is an infinite, controllable training ground.
World models are a missing substrate for embodied AI. If they become good enough to train agents inside, the data bottleneck that has held back robotics starts to loosen. That prospect, more than the visuals, is what makes Genie 3 a real advance rather than just an impressive demo.