Agentic AI: Multi-Agent Systems and Task Handoff

The final piece of the [Agentic AI](/blog/tag/Agentic%20AI) series. We’ve pulled apart reflection loops, prompt chaining, and orchestration-worker patterns. One thing left: multi-agent systems.

Why Multi-Agent Systems Matter

Most AI demos you’ll find online showcase a single LLM tackling a broad task end-to-end. Impressive, sure. Scalable or reliable? Not really.

In the real world, no single agent should try to do everything. Specialisation is the whole point. Think of a hotel: you wouldn’t expect your front-desk concierge to also cook your dinner and make your bed. Same principle applies here. Dedicated agents, each scoped to a specific domain (hotels, restaurants, flights).

Multi-agent systems are about collaboration, context-passing, and controlled handoffs.

Core Principles of Multi-Agent Design

  1. Domain-specific roles: Each agent should have a clearly scoped responsibility.
  2. Explicit handoff mechanisms: Agents must know when to delegate and how to format the context for the next agent.
  3. Shared memory or minimal context passing: Context that persists across agents (e.g., user location, preferences) must be passed or shared carefully.
  4. Autonomy within limits: While agents are autonomous, they operate under fixed schema constraints to ensure reliability.
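Taken together, these principles amount to a small routing contract. Here’s a minimal TypeScript sketch of that contract (our illustration; the names and shapes are assumptions, not the article’s actual code):

```typescript
// Principle 2 & 4: every agent reply follows the same fixed envelope.
interface AgentResponse {
  message: string; // reply to the user (or context for the next agent)
  handoff: string; // '' to stay put, or the name of the next agent
}

// Principle 1: a registry of domain-scoped agents.
const agentNames = ['Hotel Agent', 'Restaurant Agent'] as const;
type AgentName = (typeof agentNames)[number];

// Swap agents only on a valid, explicit handoff; otherwise stay put.
function route(current: AgentName, response: AgentResponse): AgentName {
  if (
    response.handoff &&
    (agentNames as readonly string[]).includes(response.handoff)
  ) {
    return response.handoff as AgentName;
  }
  return current;
}
```

Because the envelope is fixed, the router never has to understand the conversation itself; it only inspects one field.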

A Realistic Scenario: Book Me Something

Here’s the user goal:

“Can you book me a hotel for tonight for 2 people?”

Then they bolt on a second request:

“Oh and I’d love a nearby Italian place to eat.”

A generic model might scramble to handle both, possibly hallucinating a restaurant inside the hotel (even if one doesn’t exist). A better design? Route hotel booking to one agent, restaurant booking to another.

The crucial bit: knowing when to switch.

The Handoff Pattern

At the core of this approach is an AI-native version of a call centre operator saying (often dreaded in real life, though hopefully your AI agent does a better job of it):

“Let me transfer you to someone who can better help with that.”

Structured delegation: one agent recognises a task outside its remit, packages context cleanly, and passes it along to another.

In our implementation, this is done via a schema with two keys:

  • message: The agent’s reply to the user.
  • handoff: If needed, the name of the next agent (or "" to stay put).

That keeps both user communication and system routing logic clean.
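Concretely, a stay-put reply and a handoff reply differ only in that second key (the values below are illustrative):

```typescript
// The agent answers directly and keeps the conversation.
const stayPut = {
  message: "I've found three hotels in central London for tonight for 2 people.",
  handoff: '',
};

// The agent packages context for the next agent instead of answering itself.
const handOff = {
  message: 'User wants a nearby Italian restaurant for 2 people tonight in London.',
  handoff: 'Restaurant Agent',
};

// Routing only ever needs to inspect one field.
const nextAgent = handOff.handoff || 'stay with current agent';
```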

Design Decisions

We’re using the Gemini Flash model for speed, and running two chat instances:

  • One for the Hotel Agent
  • One for the Restaurant Agent

Each has its own system prompt, its own memory, and adheres to the same schema. If the Hotel Agent detects a restaurant query, it doesn’t fumble. It offloads.

The orchestration logic is stripped back to the essentials:

  • Maintain current agent.
  • If handoff occurs, swap agent chat instance.
  • Loop until the conversation ends.

You can simulate complex interaction paths without needing a heavy orchestrator.

Code Walkthrough

Let’s walk through the code for the Hotel Agent and Restaurant Agent. Bear in mind this is a sample project; in production you’d wire in function calling to hit hotel and restaurant APIs, check calendars, and so on.

Initial Setup

import { FunctionCallingConfigMode, GoogleGenAI, Type } from '@google/genai';
import readline from 'readline';

We use:

  • @google/genai: The Gemini SDK to create AI chat agents.
  • readline: Node.js module for terminal-based user input.

Client & Schema Definition

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const model = 'gemini-2.0-flash'; // any Gemini Flash variant will do here

const responseSchema = {
  type: Type.OBJECT,
  properties: {
    handoff: {
      type: Type.STRING,
      description: "The name/role of the agent to hand off to. Available agents: 'Restaurant Agent', 'Hotel Agent'",
      default: '',
    },
    message: {
      type: Type.STRING,
      description: 'The response message to the user or context for the next agent',
    },
  },
  required: ['message'],
};

This schema locks down a consistent response structure from all agents, enabling structured delegation logic. handoff can be blank or contain the next agent’s name.
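Note that the schema constrains what the model produces, but the response still arrives as a string, so it’s worth validating the parsed JSON before trusting it. A minimal guard (our addition, not part of the original code) might look like:

```typescript
interface AgentResponse {
  message: string;
  handoff: string;
}

// Narrow raw model output to the schema shape. Only `message` is
// required by the schema, so default a missing `handoff` to ''.
function toAgentResponse(raw: string): AgentResponse {
  try {
    const parsed = JSON.parse(raw) as Partial<AgentResponse>;
    if (typeof parsed.message !== 'string') throw new Error('missing message');
    return {
      message: parsed.message,
      handoff: typeof parsed.handoff === 'string' ? parsed.handoff : '',
    };
  } catch {
    return { message: '[Malformed response]', handoff: '' };
  }
}
```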

Agent Prompts

const hotelSystemPrompt = `You are a Hotel Booking Agent...`;
const restaurantSystemPrompt = `You are a Restaurant Booking Agent...`;

Each prompt sets strict role boundaries:

  • The Hotel Agent only handles hotel-related requests.
  • The Restaurant Agent handles food bookings and recommendations.
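The post elides the full prompt text, but a fleshed-out Hotel Agent prompt might read something like this (our illustrative wording, not the original):

```typescript
const hotelSystemPrompt = `You are a Hotel Booking Agent.
- You ONLY handle hotel searches and bookings (location, dates, guests, budget).
- If the user asks about restaurants, food, or anything outside hotels,
  do NOT answer it yourself. Set "handoff" to "Restaurant Agent" and put
  a concise summary of the user's request in "message" for that agent.
- Otherwise, set "handoff" to "" and answer the user directly in "message".
Always respond as JSON matching the provided schema.`;
```

The key design point is that the prompt spells out both the role boundary and the exact mechanics of delegating across it.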

User Input Helper

function getUserInput(question: string): Promise<string> {
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });
  return new Promise((resolve) =>
    rl.question(question, (input) => {
      rl.close();
      resolve(input);
    })
  );
}

Main Flow

const hotelChat = ai.chats.create({
  model,
  history: [],
  config: {
    responseMimeType: 'application/json',
    responseSchema,
    systemInstruction: `You are Hotel Agent. ${hotelSystemPrompt}
    Current user context: ${JSON.stringify(sharedContext)}`,
  },
});

let chat = hotelChat;
let currentAgent = 'Hotel Agent'; // track which agent is active
let userInput = await getUserInput('\n👤 You: '); // seed the loop with the first message

function getAgentChat(agent: string) {
  return ai.chats.create({
    model,
    history: [],
    config: {
      responseMimeType: 'application/json',
      responseSchema,
      systemInstruction:
        agent === 'Hotel Agent'
          ? `You are Hotel Agent. ${hotelSystemPrompt}
             Current user context: ${JSON.stringify(sharedContext)}`
          : `You are Restaurant Agent. ${restaurantSystemPrompt}
             Current user context: ${JSON.stringify(sharedContext)}`,
    },
  });
}

while (true) {
  await extractContextFromInput(userInput);

  const result = await chat.sendMessage({ message: userInput });

  const raw = result.text;
  let output: { message: string; handoff: string };
  try {
    output = JSON.parse(raw!);
  } catch (e) {
    console.error('❌ Failed to parse structured response:', raw);
    output = { message: '[Malformed response]', handoff: '' };
  }

  console.log(`\n🤖 ${currentAgent}:\n${output.message}`);

  if (output.handoff && output.handoff !== currentAgent) {
    console.log(`🔁 Handoff Triggered: ${currentAgent} ➡️ ${output.handoff}`);
    currentAgent = output.handoff;
    chat = getAgentChat(currentAgent);

    continue;
  }

  const followUp = await getUserInput('\n👤 You: ');
  if (followUp.toLowerCase() === 'exit') {
    console.log('👋 Conversation ended.');
    break;
  }

  userInput = followUp;
}

A few things worth noting: each chat instance gets its own system prompt, and they all share the same response schema.

The real work happens inside the while loop:

  1. Sends userInput to the active agent.
  2. Parses JSON output based on the schema.
  3. Prints the agent’s reply.
  4. Checks if a handoff was triggered:
    • If so, swaps to the appropriate agent and shares the context.
  5. Prompts user for next input.
  6. Ends if user types exit.

We’re also extracting arguments to pass between agents via tool calling:

const extractorSchema = {
  type: Type.OBJECT,
  properties: {
    city: {
      type: Type.STRING,
      description: 'Destination city or town (e.g., London, Tokyo, Rome)',
    },
    people: {
      type: Type.NUMBER,
      description: 'Number of people in the request (e.g., 2)',
    },
    hotelBudget: {
      type: Type.STRING,
      description: 'Hotel price or budget mentioned (e.g., "$200", "under $500")',
    },
    cuisine: {
      type: Type.STRING,
      description:
        'Preferred cuisine type (e.g., Italian, Indian, Japanese, Vegan). Do not use generic terms like "restaurant" or "food". Leave blank if not mentioned.',
    },
    date: {
      type: Type.STRING,
      description: 'When the booking is for (e.g., "tonight", "tomorrow", "2025-06-05")',
    },
  },
};
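For the example conversation earlier (“book me a hotel for tonight for 2 people”, then the Italian restaurant request), we’d expect the extractor to have accumulated roughly this (illustrative values):

```typescript
// What sharedContext might hold after both user messages.
const expectedContext = {
  people: 2,          // from the first message
  date: 'tonight',    // from the first message
  cuisine: 'Italian', // from the second message
};
// city and hotelBudget stay unset: the user never mentioned them.
```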


async function extractContextFromInput(input: string): Promise<void> {
  const result = await ai.models.generateContent({
    model,
    contents: [{ role: 'user', parts: [{ text: input }] }],
    config: {
      tools: [
        {
          functionDeclarations: [
            {
              name: 'extractContext',
              description: 'Extract travel-related context fields',
              parameters: extractorSchema,
            },
          ],
        },
      ],
      toolConfig: {
        functionCallingConfig: {
          mode: FunctionCallingConfigMode.ANY
        }
      }
    },
  });

  try {
    const call = result.candidates?.[0]?.content?.parts?.[0]?.functionCall;
    if (call?.name === 'extractContext' && call.args) {
      // call.args is already a parsed object, so no JSON.parse (or try/catch) is needed here.
      const extractedArgs = call.args as Record<string, any>;

      for (const [key, value] of Object.entries(extractedArgs)) {
        if (key in sharedContext) continue; // skip fields an earlier turn already set
        sharedContext[key] = value;
      }

    } else {
      console.warn('⚠️ No function call was made or it was the wrong one:', call?.name);
    }
  } catch (e) {
    console.error('❌ Failed to extract context from result:', e);
  }
}

One last piece: the shared context itself is just a plain module-scoped object, declared before any of the code above runs:

const sharedContext: Record<string, any> = {};
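The first-write-wins merge inside extractContextFromInput can also be factored into a small helper, which makes the skip-if-already-set behaviour easy to unit test (our refactor, not in the original):

```typescript
// Merge newly extracted fields into the shared context without
// overwriting values an earlier turn already established.
function mergeContext(
  shared: Record<string, unknown>,
  extracted: Record<string, unknown>
): Record<string, unknown> {
  for (const [key, value] of Object.entries(extracted)) {
    if (key in shared) continue; // first value wins
    shared[key] = value;
  }
  return shared;
}
```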

In practice the session plays out exactly as designed: the Hotel Agent handles the booking questions, and the moment the Italian restaurant request arrives you see 🔁 Handoff Triggered: Hotel Agent ➡️ Restaurant Agent, after which the Restaurant Agent picks up the thread with the shared context intact.

Benefits of Multi-Agent Architectures

Specialisation leads to fewer hallucinations. Testing and debugging agent behaviour in isolation becomes far easier. Schema enforcement gives you standardised handoffs. As your system grows, you can bolt on more agents: “Flight Agent”, “Event Agent”, “Transit Planner”. A future orchestration layer could dynamically discover which agents are needed.

The handoff itself is the critical piece. Sharing the right context at the right moment is what makes or breaks the whole thing.

Conclusion

Multi-agent systems represent a genuine shift in how we architect AI workflows. Instead of asking a single model to do everything, we delegate tasks with purpose: composable AI ecosystems where each agent plays a well-defined role.

In this example, a hotel agent knew its limits and passed the baton to a restaurant agent. In production, these patterns scale to complex tasks: planning software launches, coordinating logistics, even simulating multi-stakeholder negotiations. Structured schemas and specialised roles give you predictability, scalability, and transparency.

When designing your own agentic systems, remember: the goal isn’t one perfect prompt. It’s composing a system that can gracefully handle complexity, delegate with clarity, and adapt as your use case grows.

Thanks for following this series on Agentic AI. Whether you’re experimenting with single-agent reflection loops or wiring up multi-agent orchestration networks, you’re helping shape the future of how AI works with us (not just for us).