The 'Aha!' Moment - Engineering the Perfect Prompt for Truly Contextual AI

Welcome back! So far, we’ve meticulously crafted two powerful memory systems: our lean, token-efficient Short-Term Conversational Memory (STCM) for the immediate chat, and our vast, semantically searchable Long-Term Semantic Memory (LTSM) for enduring knowledge across all interactions. We’ve equipped our AI with both a quick notepad and a sophisticated library of knowledge.

But here’s the crucial juncture: merely having these memories isn’t enough. The true genius, the “Aha!” moment in our Context Engineering journey, lies in how we intelligently combine and present the right pieces of information from these memories to the LLM at precisely the right time. This is the domain of dynamic context/prompt engineering, and it’s the heart of our AI’s ability to maintain genuinely contextual and personalised conversations.

Today, we are going to look at the SessionManager, the central orchestrator that ensures our LLM receives the perfect briefing for every single turn.

The Art of the Prompt: Beyond Just Asking

Remember from Part 1 that LLMs are stateless. Every single API call is a fresh start. For the AI to appear “smart” and “conversational,” we have to painstakingly provide all relevant context within each prompt. This isn’t just about asking a question; it’s about crafting an entire narrative and informational scaffold around that question.
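Statelessness is easiest to see in code. Here is a minimal, illustrative sketch (the `createConversation` helper and `buildRequest` shape are hypothetical stand-ins, not part of our actual system): because the model retains nothing between calls, the client must resend the entire history with every single request.

```javascript
// The LLM keeps no state between API calls, so the client must
// resend the whole conversation with every request.
function createConversation() {
  const history = [];
  return {
    // Each turn appends the user message, then sends the *entire*
    // history — not just the latest message — to the model.
    buildRequest(userMessage) {
      history.push({ role: 'user', content: userMessage });
      return { messages: [...history] }; // full context, every time
    },
    recordReply(content) {
      history.push({ role: 'assistant', content });
    },
  };
}

const convo = createConversation();
const first = convo.buildRequest('Hi, I am Alice.');
convo.recordReply('Hello Alice!');
const second = convo.buildRequest('What is my name?');

// The second request carries all three earlier messages — without
// them, the model has no way to answer "Alice".
console.log(second.messages.length); // 3
```

Every byte of context the model "remembers" is context we chose to put in that request. That is the job the SessionManager automates.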

The SessionManager is the dedicated prompt engineer. For every user message, before it even touches the main LLM, the SessionManager gets to work, meticulously assembling a rich, multi-layered prompt. Its goal: to give the LLM not just the user’s latest query, but also a carefully curated collection of background information, past interactions, and enduring knowledge.

The SessionManager’s Recipe: Building the Contextual Prompt

Let’s dissect the SessionManager’s getSession method. This is where the magic happens, where various ingredients are brought together to form the ultimate contextual prompt.

// session/sessionManager.js

const SYSTEM_PROMPT = `You are a helpful and friendly AI assistant. Your goal is to provide insightful and accurate information while maintaining a pleasant and engaging tone.`;

export class SessionManager {
  constructor(shortTermMemory, longTermMemory) {
    this.shortTermMemory = shortTermMemory;
    this.longTermMemory = longTermMemory;
  }

  /**
   * Assembles the full conversational context for the LLM.
   * This includes the system prompt, short-term history (messages + summaries),
   * and relevant long-term memories.
   * @param {string} conversationId - The ID of the current conversation.
   * @param {string} latestUserMessage - The user's most recent input.
   * @returns {Promise<{session: Session, isNewConversation: boolean}>}
   */
  async getSession(conversationId, latestUserMessage) {
    const session = new Session(conversationId); // Helper class to manage prompt history array

    // 1. Initialise with the foundational system prompt.
    session.addSystemMessage(SYSTEM_PROMPT);

    // 2. Load short-term history (summaries and recent messages).
    const summaries = await this.shortTermMemory.getSummaries(conversationId);
    summaries.forEach(s => session.addSummary(s.summary)); // Summaries are added as 'system' or 'assistant' roles for context

    const recentMessages = await this.shortTermMemory.getRecentMessages(conversationId);
    recentMessages.forEach(m => session.addMessage(m.role, m.content));

    // 3. Critically: Enhance the session with cross-conversation memories.
    await this.enhanceWithCrossConversationMemories(session, latestUserMessage);

    return { session, isNewConversation: recentMessages.length === 0 && summaries.length === 0 };
  }

  /**
   * Fetches and injects semantically relevant memories from the LTSM into the prompt.
   * @param {Session} session - The current session object.
   * @param {string} latestUserMessage - The user's most recent input, used for semantic search.
   */
  async enhanceWithCrossConversationMemories(session, latestUserMessage) {
    if (!latestUserMessage) return; // Only search if there's a new message to use as a query.

    // Use the latest user input as the query for semantic search in long-term memory.
    const relevantContext = await this.longTermMemory.getRelevantContext(latestUserMessage, 5);

    if (relevantContext.length > 0) {
      // Structure the retrieved context clearly for the LLM.
      const contextHeader = 'For additional context, here is some relevant information from our past conversations:';
      const contextContent = relevantContext.map(c => `- ${c.text}`).join('\n');
      
      // Inject this relevant long-term context *early* in the prompt, after the system prompt.
      // This is a strategic placement to ensure the LLM prioritises this crucial information.
      session.addCrossConversationContext(contextHeader, contextContent);
    }
  }
}
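The Session helper is referenced above but not shown. Here is a minimal sketch of what it might look like, assuming an OpenAI-style messages array. The method names match those used in getSession, but the internals, in particular the insertion-index bookkeeping behind addCrossConversationContext, are an illustration rather than the original implementation:

```javascript
// session/session.js — illustrative sketch of the Session helper.
export class Session {
  constructor(conversationId) {
    this.conversationId = conversationId;
    this.messages = [];
    // Index just past the system prompt(s). Cross-conversation
    // context gets spliced in here, ahead of the chat history.
    this.contextInsertionIndex = 0;
  }

  addSystemMessage(content) {
    this.messages.push({ role: 'system', content });
    this.contextInsertionIndex = this.messages.length;
  }

  // Summaries of older turns are presented as assistant-role context.
  addSummary(summary) {
    this.messages.push({
      role: 'assistant',
      content: `Summary of previous discussion: ${summary}`,
    });
  }

  addMessage(role, content) {
    this.messages.push({ role, content });
  }

  // Inject long-term context *after* the system prompt but *before*
  // the short-term history, so the LLM reads it early.
  addCrossConversationContext(header, content) {
    this.messages.splice(
      this.contextInsertionIndex,
      0,
      { role: 'system', content: header },
      { role: 'system', content }
    );
  }

  getMessages() {
    return this.messages;
  }
}
```

Note how the splice reproduces the ordering shown in the final prompt later in this post: system prompt first, then long-term context, then the running conversation.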

Deep Dive: The Layers of Contextual Prompting

Let’s break down the SessionManager’s approach to prompt construction, which is all about layered context engineering:

  1. The Foundation: System Prompt (session.addSystemMessage(SYSTEM_PROMPT))

    • This is the AI’s identity and core instructions. It’s the immutable “north star” that guides the AI’s persona and behaviour. Placing it first is critical because LLMs often give more weight to instructions at the beginning of the prompt. It sets the stage for every interaction.
  2. The Recent Past: Short-Term History (summaries & recentMessages)

    • This layer provides the immediate conversational flow. Importantly, we mix summaries (the token-efficient abstractions from older parts of the current conversation) with raw recentMessages. This maintains coherence without hitting the context window wall. The role field from our SQLite database is crucial here, ensuring the LLM correctly interprets who said what.
  3. The Enduring Past: Cross-Conversational Relevance (enhanceWithCrossConversationMemories)

    • This is the “Aha!” moment. The latestUserMessage isn’t just passed to the LLM; it’s first used as a semantic query into our LTSM. The SessionManager asks: “Given what the user just said, what permanent knowledge from any past conversation is most relevant right now?”
    • The longTermMemory.getRelevantContext() method, which we explored in Part 3, uses vector embeddings to retrieve semantically similar summaries and critical facts.
    • Strategic Injection: Notice session.addCrossConversationContext(contextHeader, contextContent);. This isn’t just appending to the end. The Session object is designed to inject this crucial information after the initial system prompt but before the immediate conversation history. This ensures the LLM processes this highly relevant background context early, allowing it to frame its understanding of the current dialogue with enduring knowledge. This prioritisation is a key aspect of advanced prompt engineering.
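As a refresher on the retrieval step from Part 3, the core of getRelevantContext is a similarity ranking over embeddings. The sketch below is illustrative only: the toy three-dimensional vectors stand in for real embedding-model output, which has hundreds of dimensions, and the function signature is simplified from the actual LTSM.

```javascript
// Score two embedding vectors by cosine similarity (1 = identical direction).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored memories against the query embedding and keep the top K.
function getRelevantContext(queryEmbedding, memories, topK) {
  return memories
    .map(m => ({ ...m, score: cosineSimilarity(queryEmbedding, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional embeddings for demonstration.
const memories = [
  { text: 'Project Chimera deadline is October 27th', embedding: [0.9, 0.1, 0.0] },
  { text: 'User prefers dark mode',                   embedding: [0.0, 0.2, 0.9] },
];
const query = [0.8, 0.2, 0.1]; // pretend embedding of "How's that project going?"
console.log(getRelevantContext(query, memories, 1)[0].text);
// → 'Project Chimera deadline is October 27th'
```

The project-related memory wins because its vector points in nearly the same direction as the query vector, which is exactly why a question that never mentions "Chimera" can still surface Chimera-related facts.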

The Final Prompt

Consider the rich prompt the LLM receives for a simple user query like "How's that project going?"

[
  { "role": "system", "content": "You are a helpful and friendly AI assistant. Your goal is to provide insightful and accurate information while maintaining a pleasant and engaging tone." },
  { "role": "system", "content": "For additional context, here is some relevant information from our past conversations:" },
  { "role": "system", "content": "- You previously discussed a 'Project Chimera' with user, which is a software development initiative." },
  { "role": "system", "content": "- User mentioned the 'Project Chimera' faces a deadline on October 27th." },
  { "role": "assistant", "content": "Summary of previous discussion: User expressed concerns about team resources for 'Project Chimera'." },
  { "role": "user", "content": "Last week: Any progress on that? My team is stretched." },
  { "role": "assistant", "content": "Last week: We've allocated more resources. The new deadline is tight but achievable." },
  { "role": "user", "content": "How's that project going?" }
]

Without the SessionManager’s intelligent construction, the LLM might only see the last few messages, leading to a generic “Which project?” response. With the SessionManager’s orchestrating prowess, the AI immediately knows the user is referring to “Project Chimera,” even if “Chimera” isn’t in the immediate chat history, and can respond with specific, highly relevant updates. This is the difference between a simple chatbot and a truly intelligent conversational agent.

This dynamic layering of context - system instructions, enduring knowledge, and immediate conversational flow - is the pinnacle of our Context Engineering efforts, allowing our AI to maintain deep understanding without breaking the token bank or its own concentration.

What’s Next? The Autonomous Evolution

Our AI now uses its memories. But how does that vast library of long-term knowledge get filled with those crucial summaries and critical facts in the first place? And how does the AI learn about the user without explicit instruction?

In our next instalment, we’ll peel back the curtain on the AI’s internal scholars: the autonomous background processors. These intelligent agents work tirelessly behind the scenes, continually enriching the AI’s knowledge base, turning raw conversation into structured, reusable wisdom. Prepare to see how the system continuously evolves itself to become even smarter and more personalised.