The Loop Agent Pattern

Sequential pipelines run once. Parallel pipelines fan out and gather. Both assume the first attempt is good enough. Often, it isn’t.

LLMs hallucinate. They’ll invent songs that never existed, repeat years you told them to avoid, quietly ignore constraints buried three paragraphs deep. The loop agent pattern bolts a critique step onto your pipeline and lets it circle back, fixing mistakes until the output actually holds up (or you hit a ceiling and walk away).

What is the loop agent pattern?

A loop runs a set of sub-agents repeatedly until one of them pulls the handbrake. Each iteration does three things:

  • Evaluate. A critic agent checks the current output against a set of criteria.
  • Decide. Pass? Exit. Fail? Keep going.
  • Refine. A refiner agent patches whatever the critic flagged, then the loop spins again.
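Stripped of ADK specifics, the evaluate → decide → refine cycle is just a short control loop. A minimal sketch, independent of ADK — the `critic` and `refiner` here are hypothetical stand-ins for the agents built below:

```typescript
// Generic evaluate → decide → refine loop, capped at maxIterations.
type Critique = { verdict: "PASS" | "FAIL"; issues: string[] };

async function critiqueLoop<T>(
  draft: T,
  critic: (d: T) => Promise<Critique>,
  refiner: (d: T, c: Critique) => Promise<T>,
  maxIterations = 3
): Promise<T> {
  let current = draft;
  for (let i = 0; i < maxIterations; i++) {
    const critique = await critic(current);          // Evaluate
    if (critique.verdict === "PASS") return current; // Decide: exit
    current = await refiner(current, critique);      // Refine, then go again
  }
  return current; // Hit the cap; return the best effort so far
}
```

ADK's LoopAgent plays the role of this `for` loop, with EXIT_LOOP standing in for the early `return`.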

ADK’s LoopAgent handles the iteration mechanics. A built-in EXIT_LOOP tool gives agents the power to break out when the work is done. You set a maxIterations cap so the thing can’t run forever.

User input (mood) -> GeneratorAgent (generate playlist) -> LoopAgent (max 3 iterations): CriticAgent (evaluate + verify against the MusicBrainz API) -> RefinerAgent (fix or EXIT_LOOP). PASS: exit the loop with a verified playlist. FAIL: refine and retry.

When does this beat a single pass?

Whenever correctness matters more than speed. A single LLM call might land the right answer 70% of the time. A critique loop catches the other 30%. You pay for it in latency: each iteration burns another round of API calls. But for tasks where “close enough” falls short, the loop earns its keep.
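As a back-of-envelope sanity check: if one pass succeeds with probability p, and each extra round independently fixes a failure at the same rate (a simplifying assumption — real refinement rounds are correlated), the chance of a correct result after k rounds is 1 − (1 − p)^k:

```typescript
// Illustrative only: assumes each round succeeds independently with probability p.
function successAfterRounds(p: number, k: number): number {
  return 1 - Math.pow(1 - p, k);
}

console.log(successAfterRounds(0.7, 1).toFixed(2)); // one-shot pipeline
console.log(successAfterRounds(0.7, 3).toFixed(2)); // loop with maxIterations = 3
```

A 70% one-shot rate climbs to roughly 97% with three rounds under this (generous) independence assumption — which is the intuition behind paying the extra latency.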

Good fit when:

  • Output has verifiable criteria. You can check whether a song exists, whether a year is duplicated, whether a list has exactly 10 items.
  • The LLM’s first attempt is close but not clean. The refiner only needs to swap a few tracks, not rebuild from nothing.
  • You have an external source of truth. An API, a database, a schema. Something the critic can measure against that isn’t just another LLM’s opinion.
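The first bullet is the crux: the count, artist, and year rules are deterministic enough that plain code could check them. A hypothetical `checkRules` helper, sketched for illustration (the critic agent below does this check by inspection instead):

```typescript
type Song = { title: string; artist: string; year: number; why: string };

// Deterministic checks for the count, artist, and year rules.
function checkRules(songs: Song[]): string[] {
  const issues: string[] = [];
  if (songs.length !== 10) issues.push(`Expected 10 songs, got ${songs.length}`);
  const byArtist = new Map<string, number>();
  const years = new Set<number>();
  for (const s of songs) {
    byArtist.set(s.artist, (byArtist.get(s.artist) ?? 0) + 1);
    if (years.has(s.year)) issues.push(`Duplicate year: ${s.year}`);
    years.add(s.year);
  }
  byArtist.forEach((n, artist) => {
    if (n > 2) issues.push(`${artist} appears ${n} times (max 2)`);
  });
  return issues;
}
```

Mood fit and "does this song exist" still need an LLM plus an external source of truth, which is why the critic is an agent with a tool rather than a pure function.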

The project

We’re building a Playlist Curator. Type a mood, get a 10-song playlist. The generator picks songs. A critic verifies every track against the MusicBrainz database and checks all the rules. A refiner patches any failures. The loop runs until the critic says PASS (or 3 iterations, whichever lands first).


Setup

The package.json:

{
  "name": "loop-agent",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "npx adk run --log_level ERROR agent.ts",
    "web": "npx adk web agent.ts"
  },
  "dependencies": {
    "@google/adk": "^0.6.0",
    "zod": "^4.3.6"
  }
}

Two dependencies: @google/adk and zod. Same as every post in this series. We’re using ADK 0.6+, which adds support for combining outputSchema with tools on the same agent (more on that below).

The code

agent.ts in full:

import {
  FunctionTool,
  LlmAgent,
  SequentialAgent,
  LoopAgent,
  EXIT_LOOP,
} from "@google/adk";
import { z } from "zod";

// --- MusicBrainz Verify Song Tool ---

const verifySong = new FunctionTool({
  name: "verify_song",
  description:
    "Verify that a song exists by searching the MusicBrainz database. Returns whether the song was found and its track info.",
  parameters: z.object({
    song: z.string().describe("The song title to search for"),
    artist: z.string().describe("The artist name"),
  }),
  execute: async ({ song, artist }) => {
    try {
      const query = encodeURIComponent(`recording:"${song}" AND artist:"${artist}"`);
      const res = await fetch(
        `https://musicbrainz.org/ws/2/recording?query=${query}&limit=1&fmt=json`,
        {
          headers: {
            "User-Agent": "PlaylistCuratorAgent/1.0 (playlist-curator-agent)",
          },
        }
      );

      if (!res.ok) {
        return { found: false, error: `MusicBrainz API error: ${res.status}` };
      }

      const data = (await res.json()) as any;
      const recordings = data?.recordings;

      if (!recordings || recordings.length === 0) {
        return { found: false, track: null };
      }

      const recording = recordings[0];
      const score = recording.score ?? 0;

      if (score < 80) {
        return { found: false, track: null, reason: `Low match score: ${score}` };
      }

      return {
        found: true,
        track: {
          title: recording.title,
          artist: recording["artist-credit"]?.[0]?.name,
          album: recording.releases?.[0]?.title,
          date: recording["first-release-date"],
          score,
        },
      };
    } catch (err: any) {
      return { found: false, error: err.message };
    }
  },
});

// --- Schemas ---

const SongSchema = z.object({
  title: z.string().describe("The song title"),
  artist: z.string().describe("The artist name"),
  year: z.number().describe("The release year"),
  why: z.string().describe("One-line explanation of why this song fits the mood"),
});

const PlaylistSchema = z.object({
  songs: z.array(SongSchema).describe("The list of songs in the playlist"),
});

const CritiqueSchema = z.object({
  verdict: z.enum(["PASS", "FAIL"]).describe("Whether the playlist passes all criteria"),
  issues: z.array(z.string()).describe("Description of each issue found (empty array if PASS)"),
  songVerifications: z.array(
    z.object({
      title: z.string().describe("The song title"),
      artist: z.string().describe("The artist name"),
      verified: z.boolean().describe("Whether the song was verified on MusicBrainz"),
    })
  ).describe("Verification results for each song"),
});

// --- Agents ---

const generatorAgent = new LlmAgent({
  name: "GeneratorAgent",
  model: "gemini-3-flash-preview",
  description: "Generates an initial 10-song playlist for a given mood.",
  instruction: `You are a music expert and playlist curator. The user will provide a mood or vibe.

Generate a playlist of exactly 10 songs that match that mood. Choose real, well-known songs that actually exist.

Rules:
- Exactly 10 songs
- No artist should appear more than twice
- No two songs should be from the same year
- Each song must genuinely fit the mood/vibe
- Include a variety of genres and eras when possible`,
  tools: [],
  outputSchema: PlaylistSchema,
  outputKey: "playlist",
});

const criticAgent = new LlmAgent({
  name: "CriticAgent",
  model: "gemini-3-flash-preview",
  description:
    "Evaluates the playlist against all criteria and verifies songs via MusicBrainz.",
  instruction: (context) => {
    const playlist = context.state["playlist"] ?? "No playlist yet.";
    return `You are a strict playlist critic. Your job is to evaluate the current playlist against ALL of the following criteria:

1. Exactly 10 songs
2. No artist appears more than twice
3. No two songs share the same year
4. Mood/vibe coherence — every song should genuinely fit the original mood
5. Each song must be verified as real via the verify_song tool (MusicBrainz) — call it for EVERY song
6. Each entry must include: song title, artist, year, and a one-line "why it fits"

Here is the current playlist:
${JSON.stringify(playlist)}

Steps:
1. Parse the playlist
2. Check criteria 1-4 and 6 by inspecting the data
3. Call verify_song for each song to check criterion 5
4. Compile your findings

Be thorough and strict. Only output PASS if ALL criteria are met.`;
  },
  tools: [verifySong],
  outputSchema: CritiqueSchema,
  outputKey: "critique",
});

const refinerAgent = new LlmAgent({
  name: "RefinerAgent",
  model: "gemini-3-flash-preview",
  description:
    "If critique passes, exits the loop. If it fails, refines the playlist based on feedback.",
  instruction: (context) => {
    const playlist = context.state["playlist"] ?? "No playlist yet.";
    const critique = context.state["critique"] ?? "No critique yet.";
    return `You are a playlist refiner. You receive a playlist and its critique.

Current playlist:
${JSON.stringify(playlist)}

Critique:
${JSON.stringify(critique)}

If the critique verdict is "PASS", call the exit_loop tool immediately to end the review cycle. Do not output a new playlist in that case.

If the critique verdict is "FAIL", fix ALL issues mentioned:
- Replace any songs that could not be verified on MusicBrainz with real, well-known songs
- Fix duplicate years by swapping songs for ones from different years
- Fix artist over-representation by replacing excess songs from the same artist
- Replace any songs that don't fit the mood
- Ensure all 6 criteria will be satisfied`;
  },
  tools: [EXIT_LOOP],
  outputSchema: PlaylistSchema,
  outputKey: "playlist",
});

// --- Loop & Root ---

const critiqueLoop = new LoopAgent({
  name: "critique_loop",
  subAgents: [criticAgent, refinerAgent],
  maxIterations: 3,
});

export const rootAgent = new SequentialAgent({
  name: "playlist_curator",
  description:
    "Generates a mood-based playlist and iteratively refines it through critique loops with MusicBrainz verification.",
  subAgents: [generatorAgent, critiqueLoop],
});

Let’s pull it apart.

The verification tool

const verifySong = new FunctionTool({
  name: "verify_song",
  description:
    "Verify that a song exists by searching the MusicBrainz database.",
  parameters: z.object({
    song: z.string().describe("The song title to search for"),
    artist: z.string().describe("The artist name"),
  }),
  execute: async ({ song, artist }) => {
    const query = encodeURIComponent(`recording:"${song}" AND artist:"${artist}"`);
    const res = await fetch(
      `https://musicbrainz.org/ws/2/recording?query=${query}&limit=1&fmt=json`,
      { headers: { "User-Agent": "PlaylistCuratorAgent/1.0" } }
    );
    // ... parse and return { found: true/false, track: {...} }
  },
});

This is what keeps the LLM honest. MusicBrainz is a free, open music database. The tool searches for a recording by title and artist, returning found: true only when the match score clears 80. Low-confidence matches get thrown out.

The critic calls this for every single song in the playlist. If the LLM invented a track, this catches it.
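One practical caveat: MusicBrainz asks anonymous clients to keep to roughly one request per second, so firing 10 verifications in a parallel burst risks getting throttled. A sketch of a sequential, rate-limited check — `verify` here is a stand-in for the verify_song fetch above:

```typescript
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Verify songs one at a time, pausing between requests to respect the rate limit.
async function verifyAll<T>(
  songs: T[],
  verify: (song: T) => Promise<boolean>,
  delayMs = 1100
): Promise<boolean[]> {
  const results: boolean[] = [];
  for (const song of songs) {
    results.push(await verify(song)); // one request at a time
    await sleep(delayMs);             // stay under ~1 req/s
  }
  return results;
}
```

In this project the LLM drives the tool calls itself, which tends to serialise them anyway, but it's worth knowing the limit exists if you ever batch verifications in code.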

Structured output with Zod schemas

const SongSchema = z.object({
  title: z.string().describe("The song title"),
  artist: z.string().describe("The artist name"),
  year: z.number().describe("The release year"),
  why: z.string().describe("One-line explanation of why this song fits the mood"),
});

const PlaylistSchema = z.object({
  songs: z.array(SongSchema).describe("The list of songs in the playlist"),
});

const CritiqueSchema = z.object({
  verdict: z.enum(["PASS", "FAIL"]),
  issues: z.array(z.string()),
  songVerifications: z.array(
    z.object({
      title: z.string(),
      artist: z.string(),
      verified: z.boolean(),
    })
  ),
});

Instead of embedding JSON format instructions in each prompt and hoping the LLM complies, we define Zod schemas and pass them as outputSchema on each agent. ADK converts these into Gemini’s structured output constraints, so the model is forced to produce valid JSON matching the schema at the decoding level. No more “output raw JSON, no markdown fences, nothing else” gymnastics.

One thing to note: Gemini’s structured output requires a top-level object, so the playlist is wrapped in { songs: [...] } rather than a bare array.

This requires Gemini 3 and ADK 0.6+. Earlier Gemini versions (2.5 and below) couldn’t combine structured output with function calling in the same request. Using outputSchema on an agent with tools would silently break. Gemini 3 supports both natively, so the model can call tools (like verify_song) and then deliver its final answer as structured JSON matching your schema, all in one flow.

The generator

const generatorAgent = new LlmAgent({
  name: "GeneratorAgent",
  model: "gemini-3-flash-preview",
  instruction: `You are a music expert and playlist curator...
  Rules:
  - Exactly 10 songs
  - No artist should appear more than twice
  - No two songs should be from the same year
  ...`,
  tools: [],
  outputSchema: PlaylistSchema,
  outputKey: "playlist",
});

The generator fires once, before the loop kicks in. It produces a 10-song playlist as structured JSON. No tools needed; it’s pure LLM generation. The rules are spelled out in the prompt, but (as we’ll see) the LLM doesn’t always follow them on the first attempt. That’s exactly why the loop exists. The outputSchema guarantees the shape of the output (every song has a title, artist, year, and why), while the critique loop enforces the content rules (no duplicate years, songs actually exist, etc.).

The critic

const criticAgent = new LlmAgent({
  name: "CriticAgent",
  model: "gemini-3-flash-preview",
  instruction: (context) => {
    const playlist = context.state["playlist"] ?? "No playlist yet.";
    return `You are a strict playlist critic...
    Here is the current playlist:
    ${JSON.stringify(playlist)}
    ...`;
  },
  tools: [verifySong],
  outputSchema: CritiqueSchema,
  outputKey: "critique",
});

The critic pulls playlist from session state and inspects every rule: count, duplicate artists, duplicate years, mood fit, and whether each song actually exists in MusicBrainz. The CritiqueSchema guarantees the output shape: a verdict (constrained to "PASS" or "FAIL" via z.enum), a list of issues, and songVerifications for every track.

Notice the instruction is a function, not a template string. With outputSchema, ADK stores the playlist as a parsed JSON object in session state, not a raw string. The {playlist} template syntax doesn’t handle objects well, so we use a callback that reads from context.state directly and serialises with JSON.stringify. This also sidesteps errors when a state key hasn’t been set yet (the fallback value handles that gracefully).

This is where outputSchema + tools working together matters. The critic needs to call verify_song for each song and produce structured output. Gemini 3 handles both natively: the model calls the verification tool 10 times, gets the results, then delivers its critique as structured JSON matching the schema.

The verifySong tool is doing the heavy lifting. Without it, the critic would just be another LLM guessing whether songs are real. With it, you get ground truth from an external database. That’s the difference between a rubber stamp and an actual audit.

The refiner

const refinerAgent = new LlmAgent({
  name: "RefinerAgent",
  model: "gemini-3-flash-preview",
  instruction: (context) => {
    const playlist = context.state["playlist"] ?? "No playlist yet.";
    const critique = context.state["critique"] ?? "No critique yet.";
    return `...
    If the critique verdict is "PASS", call the exit_loop tool immediately.
    If the critique verdict is "FAIL", fix ALL issues mentioned...`;
  },
  tools: [EXIT_LOOP],
  outputSchema: PlaylistSchema,
  outputKey: "playlist",
});

Three things worth noticing.

EXIT_LOOP is an ADK built-in tool. When the refiner calls it, the LoopAgent stops iterating. The refiner decides when to exit based on the critic’s verdict. PASS means call the tool and stop. FAIL means output a revised playlist (overwriting the playlist key in session state) and let the loop go around again.

The refiner writes back to the same outputKey as the generator. Both use "playlist". So the revised playlist overwrites the original. Next time the critic runs, context.state["playlist"] contains the refined version, not the first draft.

outputSchema and EXIT_LOOP coexist. When the verdict is PASS, the refiner calls EXIT_LOOP and produces no structured output. The schema only constrains the response when the model actually generates one. When the verdict is FAIL, the revised playlist is guaranteed to match PlaylistSchema.

The loop

const critiqueLoop = new LoopAgent({
  name: "critique_loop",
  subAgents: [criticAgent, refinerAgent],
  maxIterations: 3,
});

LoopAgent runs its sub-agents in sequence, repeatedly, until either EXIT_LOOP fires or maxIterations is hit. Three iterations is a sensible cap: if the playlist can’t be fixed in 3 rounds, something is fundamentally wrong and throwing more API calls at it won’t help.

Each iteration runs critic, then refiner. The critic evaluates. The refiner fixes (or exits).

The root

export const rootAgent = new SequentialAgent({
  name: "playlist_curator",
  subAgents: [generatorAgent, critiqueLoop],
});

Same shape as the parallel post: a SequentialAgent at the root. The generator produces the initial playlist. Then the LoopAgent takes over and grinds until it’s satisfied.

How the loop state evolves

The outputKey reuse is what makes the whole loop tick. Here’s what happens to session state across iterations:

  1. Generator runs -> playlist = { songs: [...] } (parsed object, thanks to outputSchema)
  2. Critic (iteration 1) -> reads playlist from state, writes critique = { verdict: "FAIL", issues: [...], ... }
  3. Refiner (iteration 1) -> reads playlist + critique from state, overwrites playlist with fixed version
  4. Critic (iteration 2) -> reads updated playlist, writes new critique with verdict: "PASS"
  5. Refiner (iteration 2) -> reads critique, sees PASS, calls EXIT_LOOP

Because every agent uses outputSchema, the values stored in session state are parsed JSON objects, not raw strings. The instruction callbacks read them with context.state["playlist"] and serialise with JSON.stringify before injecting into the prompt.

The refiner writing back to "playlist" is the trick. It closes the feedback loop without any manual state wiring.
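The trace above boils down to a handful of writes against a shared key-value store. A toy replay — ADK manages the real session state, and the values here are placeholders:

```typescript
// Simulated session state: same "playlist" key written by generator and refiner.
const state: Record<string, unknown> = {};

state["playlist"] = { songs: ["draft v1"] };                  // Generator
state["critique"] = { verdict: "FAIL", issues: ["fake song"] }; // Critic, iteration 1
state["playlist"] = { songs: ["draft v2"] };                  // Refiner overwrites the draft
state["critique"] = { verdict: "PASS", issues: [] };          // Critic, iteration 2
// Refiner sees PASS -> calls EXIT_LOOP; "playlist" now holds the final version.
```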

Why verification matters

Without the MusicBrainz tool, you’d just have one LLM generating songs and another LLM nodding along. Both could happily hallucinate together. The external verification tool breaks that echo chamber.

In practice, Gemini 3 Flash gets roughly 8 out of 10 songs right on the first try. The other 2 tend to be real artists with invented track titles, or real songs pinned to the wrong year. The critic flags them, the refiner swaps them out, and the second pass usually lands clean.

When to reach for this pattern

  • Output must meet strict, checkable criteria. If you can write a function (or hand an agent a tool) that returns pass/fail, the loop can enforce it.
  • First-attempt quality is good but not spotless. The loop is for polishing, not for rebuilding from scratch every iteration.
  • You have an external ground truth. APIs, databases, schemas. Something objective the critic can measure against.

Skip it when there’s no objective way to evaluate output (subjective writing quality), when the task is simple enough that one pass reliably works, or when latency is critical and you can’t afford multiple rounds.

Wrapping up

The loop agent pattern bolts self-correction onto your pipeline. LoopAgent handles the iteration, EXIT_LOOP gives agents an escape hatch, and maxIterations prevents runaway loops. The real power comes from pairing the critic with an external verification tool so it’s checking facts, not feelings.

With single, sequential, parallel, and loop agents you’ve now got 4 patterns that cover most of what an agentic system needs to do. Next up: routing, where agents decide which path to take.