AI Agents
AI Agents
Models that take actions, not just answer — tool use, long-horizon tasks, agent swarms.
State of the world · updated June 2026
Right now: agents reliably handle bounded, well-scoped tasks — coding, browsing, data wrangling — but still wobble on long, open-ended horizons. The frontier is reliability, not raw capability.
Watch: error-recovery loops, multi-agent coordination, and honest evals. The leaderboards are saturating, so the live question is what "reliable" actually means.
Start here · the primer
Agents are language models wired to do things — call tools, browse, write and run code, then check their own work. The leap worth watching is reliability over long horizons: an agent that stays coherent across hundreds of steps.