The harness is the moat
Blitzy raised $200M at a $1.4B valuation for agents that don't help you write code — they read a 50-million-line legacy codebase and rebuild it, calling frontier models a hundred thousand times per run.
Most AI coding tools sit beside one engineer and finish the line they're typing. Blitzy aims at the opposite job: ingest an entire enterprise codebase — anywhere from a million to a hundred million lines — map every dependency into a knowledge graph, then turn thousands of agents loose to modernize or migrate it over days of uninterrupted work, calling models from Anthropic, OpenAI and Google more than 100,000 times in a single run. The $200M round, led by Northzone, values the company at $1.4B.
The pitch behind the raise is a wager about where the value sits. Frontier models alone, Blitzy argues, can't ship production code at this scale; the orchestration around them, the harness, is the hard part and the defensible one. The evidence is a benchmark of real GitHub bug fixes, where Blitzy posted a record 66.5%, nearly ten points ahead of the next agent and well ahead of raw GPT-5.4 and Claude. An outside firm, Quesma, re-ran the submission and combed the agent logs for cheating before declaring the score clean.
Quesma also explained why ungoverned coding agents look good and fail in production: asked whether they actually ran the tests they committed, the models admit they didn't — but insist the code 'should work.' Closing that gap, not topping a leaderboard, is what enterprises in finance and manufacturing are buying. The number worth a second glance is the one that isn't in the press release: the founder told a podcast — hosted by one of Blitzy's own backers — that the round valued the company north of $3B, more than double what the company itself announced.