A single Claude agent can do a lot. But when a task has genuinely distinct phases — research, implementation, review — a team of specialized agents beats one generalist every time.
Here's how I build agent teams in TypeScript, what patterns work in production, and when to stay with a single agent.
## What Agent Teams Actually Are
An agent team is an orchestrator that delegates subtasks to specialist agents. Each specialist has a focused system prompt, a scoped set of tools, and a single responsibility.
The orchestrator doesn't do the work — it breaks the problem down, routes work to the right specialist, and assembles the final result.
```
User Request
      │
      ▼
┌──────────────┐
│ Orchestrator │  ← decides what work goes where
└──────┬───────┘
       │
   ┌───┴───┐
   ▼       ▼
Research  Code     ← specialists with focused system prompts
 Agent    Agent
            │
            ▼
         Review    ← can chain into another specialist
          Agent
```
This isn't just an architectural preference. Specialists perform better because their context window is filled with relevant information, not noise from unrelated phases.
## The Pattern in TypeScript
The core abstraction is simple: an agent is a function that takes a task string and returns a result string. The orchestrator calls agents like tools.
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

type Agent = (task: string, context?: string) => Promise<string>;

function createAgent(
  name: string,
  systemPrompt: string,
  model: string = 'claude-sonnet-4-5'
): Agent {
  return async (task: string, context?: string) => {
    const messages: Anthropic.MessageParam[] = [
      {
        role: 'user',
        content: context ? `Context:\n${context}\n\nTask:\n${task}` : task,
      },
    ];
    const response = await client.messages.create({
      model,
      max_tokens: 4096,
      system: systemPrompt,
      messages,
    });
    return response.content[0].type === 'text' ? response.content[0].text : '';
  };
}
```
Now define the specialists:
```typescript
const researchAgent = createAgent(
  'researcher',
  `You are a technical research specialist. Given a topic or question, you:
- Identify the key concepts and constraints
- Surface relevant patterns and prior art
- Flag potential pitfalls and edge cases
- Summarize findings in structured markdown

Return only your research findings. Do not write code.`,
  'claude-haiku-4-5' // cheaper model for research synthesis
);

const codeAgent = createAgent(
  'coder',
  `You are a senior TypeScript engineer. Given a specification and research context, you:
- Write production-quality TypeScript
- Include proper error handling
- Add JSDoc comments on exported functions
- Follow the principle: make it work, then make it readable

Return only the code. No explanation unless requested.`
);

const reviewAgent = createAgent(
  'reviewer',
  `You are a code reviewer focused on production readiness. You check for:
- Security issues (injection, secrets in code, missing validation)
- Error handling gaps
- Performance problems at scale
- Missing edge cases

Return a structured review: APPROVED or CHANGES REQUESTED, then bullet points.`
);
```
## The Orchestrator
The orchestrator ties the team together. It doesn't need to be clever — just sequential and explicit:
```typescript
async function orchestrate(userRequest: string): Promise<{
  research: string;
  code: string;
  review: string;
  approved: boolean;
}> {
  console.log('Phase 1: Research');
  const research = await researchAgent(`Research the technical requirements for: ${userRequest}`);

  console.log('Phase 2: Implementation');
  const code = await codeAgent(
    `Implement the following: ${userRequest}`,
    research // research becomes context for the coder
  );

  console.log('Phase 3: Review');
  const review = await reviewAgent(
    `Review this implementation for production readiness`,
    `Original request: ${userRequest}\n\nResearch:\n${research}\n\nCode:\n${code}`
  );

  // Crude but effective: relies on the reviewer's prescribed verdict format
  // (APPROVED vs. CHANGES REQUESTED) from its system prompt
  const approved = review.toUpperCase().includes('APPROVED');

  return { research, code, review, approved };
}
```
Usage is a single function call:
```typescript
const result = await orchestrate(
  'Build a rate limiter middleware for an Express API that limits to 100 requests per IP per minute, stores state in Redis, and returns RFC 7807 error responses'
);

if (result.approved) {
  console.log('Ready for production:\n', result.code);
} else {
  console.log('Needs changes:\n', result.review);
}
```
## Adding Tool Use to Specialists
Specialists become dramatically more useful with tools. A research agent with web search, a code agent with a file system — these change what's possible.
Here's how to add tool use to a specialist:
```typescript
async function codeAgentWithTools(task: string, context: string): Promise<string> {
  const tools: Anthropic.Tool[] = [
    {
      name: 'read_file',
      description: 'Read an existing file from the project',
      input_schema: {
        type: 'object' as const,
        properties: {
          path: { type: 'string', description: 'Relative file path' },
        },
        required: ['path'],
      },
    },
    {
      name: 'write_file',
      description: 'Write or overwrite a file',
      input_schema: {
        type: 'object' as const,
        properties: {
          path: { type: 'string' },
          content: { type: 'string' },
        },
        required: ['path', 'content'],
      },
    },
  ];

  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: `Context:\n${context}\n\nTask:\n${task}` },
  ];

  // Agentic loop — runs until no more tool calls
  while (true) {
    const response = await client.messages.create({
      model: 'claude-sonnet-4-5',
      max_tokens: 8192,
      system: 'You are a senior TypeScript engineer...',
      tools,
      messages,
    });

    // Any stop reason other than tool_use ends the loop. Checking for
    // 'end_turn' alone would spin forever on a max_tokens stop.
    if (response.stop_reason !== 'tool_use') {
      const textBlock = response.content.find((b) => b.type === 'text');
      return textBlock?.type === 'text' ? textBlock.text : '';
    }

    // Process tool calls
    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type === 'tool_use') {
        const result = await executeTool(block.name, block.input as Record<string, string>);
        toolResults.push({
          type: 'tool_result',
          tool_use_id: block.id,
          content: result,
        });
      }
    }

    // Feed results back into the conversation
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user', content: toolResults });
  }
}
```
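The loop above calls an `executeTool` helper it never defines. A minimal sketch backed by Node's `fs` module might look like this; the path-escape guard and the exact return strings are my assumptions, not part of the original pipeline:

```typescript
import { promises as fs } from 'fs';
import * as path from 'path';

// Minimal dispatcher for the two tools declared above. Resolves paths
// relative to the working directory and refuses to escape it.
async function executeTool(name: string, input: Record<string, string>): Promise<string> {
  const root = process.cwd();
  const resolved = path.resolve(root, input.path);
  if (!resolved.startsWith(root)) return 'Error: path escapes project root';

  switch (name) {
    case 'read_file':
      return fs.readFile(resolved, 'utf-8');
    case 'write_file':
      await fs.writeFile(resolved, input.content, 'utf-8');
      return `Wrote ${input.content.length} characters to ${input.path}`;
    default:
      // Returning the error as a tool result lets the model self-correct
      return `Error: unknown tool "${name}"`;
  }
}
```

Returning errors as strings rather than throwing matters here: a thrown exception kills the agentic loop, while an error string goes back to the model as a tool result it can react to.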
## Real-World Example: Automated PR Review Pipeline
I use an agent team to review pull requests before human review. It runs on every PR via GitHub Actions:
```
PR Opened
    │
    ▼
Diff Fetcher          ← fetches the raw diff from GitHub API
    │
    ▼
Context Agent         ← reads related files, understands the change
    │
    ├──────────────┐
    ▼              ▼
Security Agent   Logic Agent
 (secrets,        (correctness,
  injections,      edge cases,
  auth gaps)       test coverage)
    │              │
    └──────┬───────┘
           ▼
Summary Agent         ← writes the final PR comment in human voice
           │
           ▼
GitHub API            ← posts the comment to the PR
```
The whole pipeline takes about 45 seconds and catches real issues — hardcoded credentials, missing null checks, untested error paths. Human reviewers focus on architecture and intent, not the mechanical stuff.
```typescript
async function reviewPullRequest(prNumber: number, repo: string) {
  const diff = await fetchGitHubDiff(prNumber, repo);

  // Run security and logic checks in parallel — no dependency between them
  const [securityFindings, logicFindings] = await Promise.all([
    securityAgent(`Review this diff for security issues:\n${diff}`),
    logicAgent(`Review this diff for logic errors and missing edge cases:\n${diff}`),
  ]);

  const summary = await summaryAgent(
    'Write a concise PR review comment',
    `Security findings:\n${securityFindings}\n\nLogic findings:\n${logicFindings}`
  );

  await postGitHubComment(prNumber, repo, summary);
}
```
Parallel execution is key. Security and logic checks are independent — running them concurrently cuts wall-clock time in half.
## Cost Breakdown: Single Agent vs. Team
The cost difference is real and worth understanding before committing to a multi-agent architecture.
| Setup                     | Model  | Input tokens | Output tokens | Cost per run |
| ------------------------- | ------ | ------------ | ------------- | ------------ |
| Single agent (all-in-one) | Sonnet | ~4,000       | ~2,000        | ~$0.030      |
| Team: Haiku researcher    | Haiku  | ~1,500       | ~800          | ~$0.001      |
| Team: Sonnet coder        | Sonnet | ~3,000       | ~2,000        | ~$0.024      |
| Team: Sonnet reviewer     | Sonnet | ~5,000       | ~500          | ~$0.017      |
| **Team total**            |        |              |               | **~$0.042**  |
Agent teams cost more. That's the honest answer. You pay for the additional context and model calls.
The trade-off: quality and maintainability. A specialist coder with research context produces better output than one agent trying to do everything in a single pass. For a one-off script, use a single agent. For a production automation running thousands of times a week, the quality improvement justifies the cost.
## When to Use Agent Teams (and When Not To)
**Use a team when:**

- The task has clearly distinct phases with different skill requirements
- You need parallel workstreams (security check AND logic check simultaneously)
- Different phases need different models — cheap model for research, powerful model for implementation
- The task is complex enough that a single context window gets cluttered
- You need auditability — separate agent logs make it easy to see where a failure happened
**Stay with a single agent when:**

- The task is self-contained and well-defined
- Latency matters — each agent hop adds 5-30 seconds
- You're prototyping and coordination overhead isn't worth it yet
- The task costs $0.01 and adding a team would make it $0.05 with no quality gain
The rule I use: if I could hand the task to a single skilled person and they could do it well, one agent. If I'd naturally assign it to a team with a project manager, architect, and reviewer — use a team.
## Patterns Worth Stealing
**Pattern 1: Context compression between agents.** Don't pass raw output from agent A into agent B's full context. Add a summary step: "Summarize your findings in 200 words for the next agent." Cuts costs and reduces noise.
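One way to sketch that wrapper, assuming the `Agent` type from earlier in the post; `withCompression` and the prompt wording are illustrative, not a fixed API (in practice `summarize` would be a cheap Haiku specialist):

```typescript
// Agent type as defined earlier in the post
type Agent = (task: string, context?: string) => Promise<string>;

// Wrap an agent so its raw output passes through a summarizing agent
// before reaching the next stage of the pipeline.
function withCompression(agent: Agent, summarize: Agent): Agent {
  return async (task, context) => {
    const raw = await agent(task, context);
    return summarize(`Summarize in at most 200 words for the next agent:\n${raw}`);
  };
}
```

The orchestrator then calls `withCompression(researchAgent, summarizer)` in place of `researchAgent` and nothing else changes.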
**Pattern 2: Confidence scoring.** Have the reviewer return a score from 1-10, not just APPROVED/REJECTED. Use the score to decide whether to retry with more capable models or escalate to a human.
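One possible shape for this, assuming the reviewer is prompted to end its reply with a line like `SCORE: 7`; the helper names and thresholds below are illustrative choices, not part of the original pipeline:

```typescript
// Pull a 1-10 confidence score out of the reviewer's free-text reply.
function parseScore(review: string): number | null {
  const match = review.match(/SCORE:\s*(\d{1,2})/i);
  if (!match) return null;
  const score = Number(match[1]);
  return score >= 1 && score <= 10 ? score : null;
}

// Route on the score: ship it, retry with feedback, or escalate to a human.
// A missing or unparseable score is treated as an escalation, not a pass.
function routeOnScore(score: number | null): 'ship' | 'retry' | 'human' {
  if (score === null || score <= 4) return 'human';
  return score >= 8 ? 'ship' : 'retry';
}
```

Failing closed on a missing score is the important design choice: a reviewer that forgets the format should never silently approve.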
**Pattern 3: Specialist model routing.** Research and summarization work fine on Haiku. Code generation and review need Sonnet. Wire the right model to each specialist and you can cut team cost by 30-40% without quality loss.
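A minimal way to wire this is a role-to-model map that feeds `createAgent`; the roles and model IDs below just mirror this post's examples:

```typescript
// Map each specialist role to the cheapest model that handles it well.
const MODEL_FOR_ROLE: Record<string, string> = {
  researcher: 'claude-haiku-4-5',
  summarizer: 'claude-haiku-4-5',
  coder: 'claude-sonnet-4-5',
  reviewer: 'claude-sonnet-4-5',
};

function modelFor(role: string): string {
  // Default unknown roles to the cheap model and upgrade only when needed
  return MODEL_FOR_ROLE[role] ?? 'claude-haiku-4-5';
}
```

Then `createAgent('researcher', researchPrompt, modelFor('researcher'))` keeps the routing decision in one place instead of scattered across specialist definitions.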
**Pattern 4: Retry on specific failure modes.** If the reviewer returns CHANGES REQUESTED, re-run the coder with the review feedback as additional context. One retry loop catches most issues before they reach a human.
```typescript
async function orchestrateWithRetry(request: string, maxRetries = 2) {
  let context = await researchAgent(request);

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const code = await codeAgent(request, context);
    const review = await reviewAgent(request, `${context}\n\nCode:\n${code}`);

    if (review.toUpperCase().includes('APPROVED')) {
      return { code, review, attempts: attempt + 1 };
    }

    // Add review feedback to context for next attempt
    context = `${context}\n\nPrevious attempt failed review:\n${review}`;
  }

  throw new Error(`Did not pass review after ${maxRetries + 1} attempts`);
}
```
## What I'd Build Next
The pattern I haven't fully explored: long-running agent teams with persistent memory. Each specialist writes structured notes to a shared KV store that persists across runs. The research agent builds a knowledge base over time. The code agent learns patterns from past implementations.
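A hypothetical sketch of that shared store, using a JSON file as a stand-in for the KV store; the class and field names here are mine, not an existing API:

```typescript
import { promises as fs } from 'fs';

// One structured note a specialist leaves behind for future runs
interface Note {
  agent: string;
  topic: string;
  body: string;
  at: string;
}

// File-backed note store; a real deployment would swap in Redis or similar
class NoteStore {
  constructor(private file: string) {}

  // Append a note, timestamping it on write
  async append(note: Omit<Note, 'at'>): Promise<void> {
    const notes = await this.load();
    notes.push({ ...note, at: new Date().toISOString() });
    await fs.writeFile(this.file, JSON.stringify(notes, null, 2));
  }

  // Fetch prior notes on a topic to inject as context for a specialist
  async byTopic(topic: string): Promise<Note[]> {
    return (await this.load()).filter((n) => n.topic === topic);
  }

  private async load(): Promise<Note[]> {
    try {
      return JSON.parse(await fs.readFile(this.file, 'utf-8'));
    } catch {
      return []; // first run: no notes yet
    }
  }
}
```

The orchestrator would call `store.byTopic(...)` before each phase and pass the hits in as context, so a specialist starts every run with what past runs already learned.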
That's the jump from stateless pipelines to agents that actually improve with use.
For now, stateless teams are already useful enough to ship.
This is template #2 in the PowerAI weekly series. Every week I publish a new production-ready automation with architecture diagrams, cost breakdowns, and working code.