First Impressions Of Agent Teams
The Team
I spent the weekend frantically hacking away at FreedomRPG (as per usual), but this time with Claude Code agent teams at my disposal.
I was hesitant to try it because “experimental” features are usually pretty dire in CC. When subagents were released they were less than useless: it was like unleashing a gang of toddlers on your codebase.
Nevertheless, a coworker reminded me that I pay a hefty subscription fee so I should at least try it out, and here we are.
If I understand the difference right:
- Subagents: One co-ordinator agent comes up with N tasks and hands them off to N new agents, who run until either the task is completed or it becomes clear the task was impossible.
- Teams: One co-ordinator agent comes up with N dependency-ordered tasks. It then creates a ‘team’ of at most N agents and lets the agents pick from the task list at their own discretion.
There are two other big differentiators:
The first is that agents in a team have a global workspace of some kind in which they can communicate. Supposedly this means that they can inform each other of key decisions and workarounds that affect other parts of the codebase. Honestly, from watching the logs it’s not exactly clear to me what this communication looks like. I wish we had more observability into CC internals.
The other is that agents can be specialized into roles. They can use a generic agent prompt, or pre-specified prompts. Anthropic suggests specialists such as ‘architect’, ‘security reviewer’, and ‘devil’s advocate’. Those seem pretty esoteric to me, so I mostly stuck to roles like ‘frontend’, ‘backend’, etc.
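To make the team model concrete, here is a minimal sketch of how I understand it: a shared, dependency-ordered task list that agents claim work from. Everything in it (`Task`, `TaskList`, `claim`, `run_team`, the task names) is made up for illustration. It is an assumption about the shape of the mechanism, not Claude Code’s actual implementation, and it leaves out inter-agent messaging entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    depends_on: set = field(default_factory=set)  # names of prerequisite tasks
    owner: Optional[str] = None                   # which teammate claimed it
    done: bool = False


class TaskList:
    """Hypothetical shared task list; not Claude Code's real data structure."""

    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}

    def claim(self, agent):
        """Give the agent the first unclaimed task whose dependencies are all done."""
        for task in self.tasks.values():
            ready = all(self.tasks[d].done for d in task.depends_on)
            if task.owner is None and ready:
                task.owner = agent
                return task
        return None

    def complete(self, name):
        self.tasks[name].done = True


def run_team(task_list, agents):
    """Simulate rounds: every agent tries to claim a ready task, then claimed tasks finish."""
    log = []
    while not all(t.done for t in task_list.tasks.values()):
        claimed = []
        for agent in agents:
            task = task_list.claim(agent)
            if task is not None:
                claimed.append((agent, task.name))
        if not claimed:
            raise RuntimeError("no task is ready: dependency cycle?")
        for _, name in claimed:
            task_list.complete(name)
        log.append(claimed)
    return log


plan = TaskList([
    Task("schema"),
    Task("backend", {"schema"}),
    Task("frontend", {"schema"}),
    Task("e2e-tests", {"backend", "frontend"}),
])
rounds = run_team(plan, ["agent-1", "agent-2"])
```

With two agents, `rounds` has three entries: `schema` runs alone, then `backend` and `frontend` run side by side, then `e2e-tests`. The subagent model would be the special case where no task depends on any other and each task gets its own agent.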
The Wins
The biggest win of the weekend was a 10,000 line change replacing the UUID-based entity system with a filesystem-inspired architecture. It took around 45 minutes to draft the plan, and an agent team then one-shotted the implementation in 30 minutes with only a few minor issues, which were caught by end-to-end tests and promptly fixed by the co-ordinator. From experience, a single agent taking on a task of this size is very likely to make serious errors (forgetting about a design change from before a context compaction, for example).
Dividing tasks by domain provides an obvious benefit: an agent’s context isn’t polluted by a bunch of content that is ‘unrelated’ to its task (even though it is part of the changeset). You can’t just use subagents for this because sometimes there are dependencies between domains that need communication, which is (supposedly) where the agent teams communication feature comes in.
I think (and plan to test) that agent teams currently let you fit a lot more into a single plan and still get a solid execution out of it. The current functional plan size limit for a Claude Opus 4.6 agent is whatever it can complete within one context window. Larger plans are not impossible, but above that size you start to lose the reliability you need to work effectively. Agent teams let you multiply that out to one context per team member.
And that’s about it. Out of 32 commits I made over the weekend (maybe 10 of which used agent teams), only one of them was noticeably improved.
The ‘Meh’
Agent teams are bloody slow. Again, I have little observability into what the agents are actually spending all their time doing, but for anything smaller than 10k lines of code the administration overhead just eats any time savings (and your usage quotas). This is particularly annoying because, as a human, most of the things I do aren’t 10k LOC changes. That just isn’t the scale at which I think about individual software development tasks. If I could trust the agents to ask me questions about all the key decisions then maybe I could start thinking on that scale, but I don’t want 10k line Git commits either! Maybe Opus 5.
I added a section instructing the agent how to best make use of subagents and agent teams to my global CLAUDE.md.
CLAUDE.md parallelization section
<task_parallelization>
Choose the right execution strategy based on the nature and scope of the changes:
**Sequential (no parallelization):** For quick, simple changes — a single commit, editing one file, a small refactor. The overhead of coordination isn't worth it.
**Subagents:** Use when there are 3+ isolated, independent changes that don't interact with each other. Each subagent handles one self-contained change and reports back. Examples: updating config in several unrelated files, fixing the same type of issue across multiple modules, adding logging to independent services.
**Agent teams:** Use when changes have interacting components that need coordination. Examples: implementing a feature that spans multiple functions calling each other, adding an API endpoint with corresponding client code and tests, refactoring a system where changes in one module affect another. Teammates can discuss approach, share findings, and coordinate on interfaces.
</task_parallelization>
This is the direction I see us going. Humans shouldn’t need to think about “oh I really want a devil’s advocate agent for this task”. I should say to my co-ordinator agent, “hey, gimme this thing” and it’s the co-ordinator agent’s job to figure out how to delegate that. Unfortunately, the section only came into play maybe 3 times over the course of the weekend: Claude proactively used a team once, and subagents twice. To be fair, most of the tasks were genuinely too small for a team or even subagents to be useful.
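Read as code, the rules in that CLAUDE.md section boil down to roughly the following. This is a sketch only: the function name, the two inputs, and the exact ordering of the checks are my own simplification, and the real decision is made by the model reading prose, not by a function like this.

```python
def choose_strategy(num_changes: int, changes_interact: bool) -> str:
    """Hypothetical two-input version of the CLAUDE.md parallelization rules."""
    # Interacting components need coordination, so they go to a team.
    if changes_interact and num_changes > 1:
        return "agent team"
    # 3+ isolated, independent changes can be farmed out to subagents.
    if num_changes >= 3:
        return "subagents"
    # Anything smaller is not worth the coordination overhead.
    return "sequential"
```

For example, `choose_strategy(1, False)` gives `"sequential"`, `choose_strategy(5, False)` gives `"subagents"`, and `choose_strategy(4, True)` gives `"agent team"`.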
I didn’t notice any more mistakes in work done by agent teams than by an individual agent. Maybe this is a win when compared to the subagents release.
So What?
We will see more of agent teams. Parallelization is an obvious optimization for executing large plans. Dynamically allocating tasks and providing communication infrastructure are prerequisites to that.
Agent teams kind of suck for now: they are slow, expensive, and provide little overall uplift. Harnesses will get better, models will get faster, and models will be trained specifically to make the most use of agent teams (as both a co-ordinator and a team member). Then they won’t kind of suck.
- omegastick