
The Parallel Agent Pattern


The sequential agent pattern chains agents into a pipeline where each step feeds the next. That works when there’s a natural order. But what if the steps don’t depend on each other?

If Agent A doesn’t need anything from Agent B, making one sit idle while the other finishes is just wasted clock cycles. Fire them all at once, collect the results, move on. That’s the parallel agent pattern.

What is the parallel agent pattern?

Two phases:

  • Fan out. Launch all agents at the same time. Each one works on its own slice of the problem, with its own tools and prompt.
  • Fan in. Wait for every agent to finish, then hand their combined outputs to an aggregator that synthesises a final response.

The agents don’t talk to each other. No shared state during execution. Coordination only kicks in at the start (distributing the input) and at the end (collecting the outputs).

When does this beat sequential?

Sequential is the right call when Step B needs the output of Step A. But when steps are independent, running them one after another just burns time. If each translation takes 1 second, 3 sequential translations take 3 seconds. 3 parallel translations still take 1 second.
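That arithmetic is easy to see in miniature. Here's a sketch with a fake translator (`translateFake` is an invented stand-in where a delay plays the role of a real API call, not anything from ADK):

```typescript
// Hypothetical stand-in for a translation call: resolves after `ms` milliseconds.
const translateFake = (lang: string, ms: number): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve(`${lang} done`), ms));

// Run the three "translations" one after another: total ≈ sum of the delays.
async function sequential(): Promise<number> {
  const start = Date.now();
  await translateFake("fr", 100);
  await translateFake("ja", 100);
  await translateFake("es", 100);
  return Date.now() - start;
}

// Fire all three at once and wait for the slowest: total ≈ max of the delays.
async function parallel(): Promise<number> {
  const start = Date.now();
  await Promise.all([
    translateFake("fr", 100),
    translateFake("ja", 100),
    translateFake("es", 100),
  ]);
  return Date.now() - start;
}
```

Run both and the sequential version takes roughly three times as long, which is the whole argument for fanning out.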

Good fit when:

  • Tasks are independent. No agent needs another agent’s output.
  • Latency matters. Wall-clock time equals the slowest agent, not the sum of all agents.
  • The final result needs multiple perspectives. Translations, alternative analyses, or votes that get stitched together at the end.
[Diagram: User Input → ParallelAgent → Agent 1 (French), Agent 2 (Japanese), Agent 3 (Spanish), each calling the MyMemory API → Aggregator (Gemini 2.5 Flash Lite). All three agents run simultaneously via ParallelAgent. Fan out, translate, fan in, summarise.]

The project

We’ll build a translation pipeline with Google’s Agent Development Kit (ADK). You type a phrase, 3 translation agents fire in parallel (French, Japanese, Spanish) using the MyMemory API, then an aggregator powered by Gemini presents the results with a linguistic note.

ADK’s ParallelAgent handles the fan-out. A SequentialAgent bolts the parallel step to the aggregator.


Setup

The package.json:

{
  "name": "parallel-agent",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "npx adk run --log_level ERROR agent.ts",
    "web": "npx adk web agent.ts"
  },
  "dependencies": {
    "@google/adk": "^0.5.0",
    "zod": "^4.3.6"
  }
}

Two dependencies: @google/adk (bundles the GenAI SDK under the hood) and zod for tool parameter schemas. adk run executes the agent from the command line; adk web spins up ADK’s built-in web UI for testing.

MyMemory is free, no API key needed. Only the Gemini-powered agents need GOOGLE_API_KEY.

The code

agent.ts in full:

import { FunctionTool, LlmAgent, ParallelAgent, SequentialAgent } from "@google/adk";
import { z } from "zod";

// --- Translation Tool ---

const translate = new FunctionTool({
  name: "translate_text",
  description: "Translate text from English to a target language using the MyMemory translation API",
  parameters: z.object({
    text: z.string().describe("The English text to translate"),
    target_lang: z.string().describe("The target language code (e.g. fr, ja, es)"),
  }),
  execute: async ({ text, target_lang }) => {
    const url = `https://api.mymemory.translated.net/get?q=${encodeURIComponent(text)}&langpair=en|${target_lang}`;
    const res = await fetch(url);
    if (!res.ok) {
      throw new Error(`MyMemory request failed: ${res.status}`);
    }
    const data = (await res.json()) as any;
    return { translation: data.responseData.translatedText };
  },
});
});

// --- Translation Sub-Agents (run in parallel) ---

const frenchAgent = new LlmAgent({
  name: "FrenchTranslator",
  model: "gemini-2.5-flash-lite",
  description: "Translates the user's text into French.",
  instruction: `You are a French translator. Whatever the user says, treat their entire message as text to translate. Immediately call the translate_text tool with the user's exact message as the text and target_lang "fr". Return only the translated text, nothing else.`,
  tools: [translate],
  outputKey: "french_translation",
});

const japaneseAgent = new LlmAgent({
  name: "JapaneseTranslator",
  model: "gemini-2.5-flash-lite",
  description: "Translates the user's text into Japanese.",
  instruction: `You are a Japanese translator. Whatever the user says, treat their entire message as text to translate. Immediately call the translate_text tool with the user's exact message as the text and target_lang "ja". Return only the translated text, nothing else.`,
  tools: [translate],
  outputKey: "japanese_translation",
});

const spanishAgent = new LlmAgent({
  name: "SpanishTranslator",
  model: "gemini-2.5-flash-lite",
  description: "Translates the user's text into Spanish.",
  instruction: `You are a Spanish translator. Whatever the user says, treat their entire message as text to translate. Immediately call the translate_text tool with the user's exact message as the text and target_lang "es". Return only the translated text, nothing else.`,
  tools: [translate],
  outputKey: "spanish_translation",
});

// --- Parallel Agent ---

const parallelTranslator = new ParallelAgent({
  name: "ParallelTranslator",
  description: "Runs French, Japanese, and Spanish translators in parallel.",
  subAgents: [frenchAgent, japaneseAgent, spanishAgent],
});

// --- Aggregator Agent ---

const aggregatorAgent = new LlmAgent({
  name: "TranslationAggregator",
  model: "gemini-2.5-flash-lite",
  description: "Aggregates parallel translation results and provides linguistic insights.",
  instruction: `You are an aggregator in a parallel agent pipeline. You receive three translations of the same word or phrase. Present them clearly, then add a one-sentence note on any interesting linguistic differences between them.

French: {french_translation}
Japanese: {japanese_translation}
Spanish: {spanish_translation}`,
  tools: [],
  outputKey: "aggregated_result",
});

// --- Root Agent: Sequential(Parallel → Aggregator) ---

export const rootAgent = new SequentialAgent({
  name: "ParallelTranslationPipeline",
  description: "Translates text into 3 languages in parallel, then aggregates the results.",
  subAgents: [parallelTranslator, aggregatorAgent],
});

Let’s break it down.

The translation tool

const translate = new FunctionTool({
  name: "translate_text",
  description: "Translate text from English to a target language using the MyMemory translation API",
  parameters: z.object({
    text: z.string().describe("The English text to translate"),
    target_lang: z.string().describe("The target language code (e.g. fr, ja, es)"),
  }),
  execute: async ({ text, target_lang }) => {
    const url = `https://api.mymemory.translated.net/get?q=${encodeURIComponent(text)}&langpair=en|${target_lang}`;
    const res = await fetch(url);
    if (!res.ok) {
      throw new Error(`MyMemory request failed: ${res.status}`);
    }
    const data = (await res.json()) as any;
    return { translation: data.responseData.translatedText };
  },
});

FunctionTool wraps a plain function so agents can call it. Parameters are defined with zod, giving you runtime validation and type inference in one shot.

Notice this tool doesn’t touch Gemini. A direct API call is faster, cheaper, and more deterministic than routing through an LLM. Keep the model out of it when you don’t need the model.
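The only fiddly part of the tool is the URL: the query text has to be percent-encoded before it goes into the string. A quick sketch of the same construction in isolation (`buildUrl` is a hypothetical helper mirroring the tool's code, not part of the project):

```typescript
// Build the MyMemory request URL the same way the tool does.
// encodeURIComponent protects spaces and punctuation in the query text.
function buildUrl(text: string, targetLang: string): string {
  return `https://api.mymemory.translated.net/get?q=${encodeURIComponent(
    text
  )}&langpair=en|${targetLang}`;
}

const url = buildUrl("good morning, world", "fr");
// The space and comma are percent-encoded; the langpair parameter stays readable.
```

Skip the encoding and a phrase with an ampersand or question mark would silently corrupt the query string.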

The translation sub-agents

const frenchAgent = new LlmAgent({
  name: "FrenchTranslator",
  model: "gemini-2.5-flash-lite",
  description: "Translates the user's text into French.",
  instruction: `You are a French translator. Whatever the user says, treat their entire message as text to translate. Immediately call the translate_text tool with the user's exact message as the text and target_lang "fr". Return only the translated text, nothing else.`,
  tools: [translate],
  outputKey: "french_translation",
});

Each translator is an LlmAgent with its own instruction and a reference to the shared translate tool.

The outputKey is the piece that wires everything together. It stashes the agent’s final response in a state variable (e.g. french_translation) that downstream agents can read. All 3 translators share the same tool; they only differ in instruction and target language.
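Conceptually (and this is only a sketch of the idea, not ADK's actual internals), outputKey behaves like a write into a shared state map that later instruction templates read back out of:

```typescript
// Conceptual model only — not ADK's real implementation.
type State = Record<string, string>;

// Each agent's final response lands in state under its outputKey.
function storeOutput(state: State, outputKey: string, response: string): State {
  return { ...state, [outputKey]: response };
}

// Downstream instructions read state via {placeholder} substitution.
function renderTemplate(template: string, state: State): string {
  return template.replace(/\{(\w+)\}/g, (_, key: string) => state[key] ?? "");
}

let state: State = {};
state = storeOutput(state, "french_translation", "bonjour le monde");
const rendered = renderTemplate("French: {french_translation}", state);
// rendered === "French: bonjour le monde"
```

That mental model is enough to follow everything else in this post: agents write under their keys, templates read by key.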

The parallel dispatch

const parallelTranslator = new ParallelAgent({
  name: "ParallelTranslator",
  description: "Runs French, Japanese, and Spanish translators in parallel.",
  subAgents: [frenchAgent, japaneseAgent, spanishAgent],
});

That’s the entire fan-out. ParallelAgent fires all 3 sub-agents at the same time and waits for every one to finish. No Promise.all, no manual concurrency plumbing. Wall-clock time equals the slowest agent, not the sum.
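For comparison, here's roughly the plumbing ParallelAgent saves you from writing, sketched with invented `run*` helpers standing in for the three sub-agents:

```typescript
// Hypothetical stand-ins for the three sub-agents.
const runFrench = async (input: string) => ({ key: "french_translation", value: `fr(${input})` });
const runJapanese = async (input: string) => ({ key: "japanese_translation", value: `ja(${input})` });
const runSpanish = async (input: string) => ({ key: "spanish_translation", value: `es(${input})` });

// Manual fan-out/fan-in: launch everything at once, wait for all of it,
// then fold the results into a single state object keyed by outputKey.
async function fanOutFanIn(input: string): Promise<Record<string, string>> {
  const results = await Promise.all([runFrench(input), runJapanese(input), runSpanish(input)]);
  return Object.fromEntries(results.map((r) => [r.key, r.value]));
}
```

ParallelAgent does the same dance declaratively: you list the sub-agents and it owns the launching, waiting, and collecting.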

The aggregator

const aggregatorAgent = new LlmAgent({
  name: "TranslationAggregator",
  model: "gemini-2.5-flash-lite",
  description: "Aggregates parallel translation results and provides linguistic insights.",
  instruction: `You are an aggregator in a parallel agent pipeline. You receive three translations of the same word or phrase. Present them clearly, then add a one-sentence note on any interesting linguistic differences between them.

French: {french_translation}
Japanese: {japanese_translation}
Spanish: {spanish_translation}`,
  tools: [],
  outputKey: "aggregated_result",
});

The fan-in step. See the {french_translation}, {japanese_translation}, and {spanish_translation} template variables in the instruction? Those map directly to the outputKey values from the parallel sub-agents. ADK swaps them for the actual results at runtime.

The LLM earns its keep here: synthesising 3 translations and spotting linguistic quirks between them.

The root agent

export const rootAgent = new SequentialAgent({
  name: "ParallelTranslationPipeline",
  description: "Translates text into 3 languages in parallel, then aggregates the results.",
  subAgents: [parallelTranslator, aggregatorAgent],
});

This stitches the 2 stages together. Why a SequentialAgent and not a ParallelAgent or an LlmAgent?

If you made the root a ParallelAgent, the aggregator would run alongside the translators, and the {french_translation} variables would be empty because the translations wouldn't have landed yet. If you used an LlmAgent, you'd burn an extra LLM call just to decide "run the translations, then aggregate", a decision that comes out the same every single time.

SequentialAgent gives you a deterministic “run A then B” with zero LLM overhead. First the ParallelAgent fans out to 3 translators, then the aggregator fans in and synthesises.

Exporting rootAgent is how ADK knows which agent to run when you hit npx adk run.

Why not just call the API 3 times?

Fair question. You could skip the agents entirely and hammer the translation API with three fetch calls inside a Promise.all. Speed would be a wash: all 3 translations run at the same time either way. Here's what you'd actually lose:

  • Composability. Each agent is self-contained with its own instruction, tools, and output key. Want a 4th language? Drop one more LlmAgent into the subAgents array. No rewiring.
  • Synthesis. The aggregator spots structural differences across languages that a raw API call can't.
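The composability point is easy to make concrete. `makeTranslator` below is a hypothetical factory over plain config objects, not the real LlmAgent constructor, but it shows why adding a language is one line:

```typescript
// Hypothetical translator config — illustrates the shape of the sub-agents,
// not the actual LlmAgent API.
interface TranslatorConfig {
  name: string;
  targetLang: string;
  outputKey: string;
}

// Each translator differs only in language name and code, so a factory
// stamps them out uniformly.
function makeTranslator(language: string, code: string): TranslatorConfig {
  return {
    name: `${language}Translator`,
    targetLang: code,
    outputKey: `${language.toLowerCase()}_translation`,
  };
}

const subAgents = [
  makeTranslator("French", "fr"),
  makeTranslator("Japanese", "ja"),
  makeTranslator("Spanish", "es"),
  makeTranslator("German", "de"), // the "4th language" drop-in
];
```

Three raw fetch calls can be wrapped the same way, but the instruction, tool wiring, and output key would all live in your glue code instead of in the agent definitions.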

When to reach for this pattern

  • Sub-tasks are independent. No agent needs another’s output. If they do, use sequential.
  • Latency matters. Wall-clock time is the slowest agent, not the total.
  • Results benefit from aggregation. Multiple perspectives, translations, or analyses folded into one output.

Wrong fit when tasks have dependencies (sequential), when the next step hinges on a condition (routing), or when you need one agent to critique another (reflection). We’ll cover those in later posts.

Wrapping up

Fan out, do the work, fan in. ParallelAgent handles the concurrent execution, SequentialAgent chains it to the aggregator, and outputKey passes results between agents without manual wiring.

The hard part isn’t the concurrency. It’s designing an aggregator that pulls the results together in a way that’s actually useful.

With single, sequential, and parallel agents you’ve got 3 building blocks that cover a lot of ground. Next up: patterns where agents make decisions about which path to take.