Mentat curated
AI Agents

Models that take actions rather than just answer: tool use, long-horizon tasks, agent swarms.

State of the world · updated June 2026

Right now: agents reliably handle bounded, well-scoped tasks — coding, browsing, data wrangling — but still wobble on long, open-ended horizons. The frontier is reliability, not raw capability.

Watch: error-recovery loops, multi-agent coordination, and honest evals. The leaderboards are saturating, so the live question is what "reliable" actually means.

Start here · the primer

Agents are language models wired to do things — call tools, browse, write and run code, then check their own work. The leap worth watching is reliability over long horizons: an agent that stays coherent across hundreds of steps.
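To make the loop concrete, here is a minimal sketch of the act, observe, recover cycle described above. Everything in it is an illustrative assumption rather than any particular framework's API: `scripted_policy` is a hard-coded stand-in for a language model, `divide` is a hypothetical tool, and the action format (a dict with either `tool` and `args` or `final`) is invented for this example.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical tool; a real agent would expose search, code execution, etc.
def divide(a: float, b: float) -> float:
    return a / b

TOOLS: Dict[str, Callable[..., Any]] = {"divide": divide}

def scripted_policy(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stand-in for a language model: picks the next action from the history."""
    if not history:
        # First attempt is deliberately wrong so the loop has an error to recover from.
        return {"tool": "divide", "args": {"a": 10, "b": 0}}
    last = history[-1]
    if "error" in last:
        # The previous step failed: retry with corrected arguments.
        return {"tool": "divide", "args": {"a": 10, "b": 4}}
    # The previous step succeeded: report its result as the final answer.
    return {"final": last["result"]}

def run_agent(
    policy: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 5,
) -> Tuple[Any, List[Dict[str, Any]]]:
    """Run the loop: ask the policy for an action, execute it, record the outcome."""
    history: List[Dict[str, Any]] = []
    for _ in range(max_steps):
        action = policy(history)
        if "final" in action:
            return action["final"], history
        try:
            result = tools[action["tool"]](**action["args"])
            history.append({"action": action, "result": result})
        except Exception as exc:
            # Feed the failure back as an observation instead of crashing.
            history.append({"action": action, "error": repr(exc)})
    raise RuntimeError("step budget exhausted without a final answer")

answer, trace = run_agent(scripted_policy, TOOLS)
print(answer)      # 2.5
print(len(trace))  # 2 (one failed call, one successful call)
```

The step budget (`max_steps`) and the error-as-observation pattern are the two places where reliability gets decided in a loop like this: a real agent has to choose, across hundreds of steps, when to retry, when to change course, and when to stop.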

Latest in this theme

Granola
granola.ai