The 'Aha!' Moment - Engineering the Perfect Prompt for Truly Contextual AI

Welcome back! So far, we’ve meticulously crafted two powerful memory systems: our lean, token-efficient Short-Term Conversational Memory (STCM) for the immediate chat, and our vast, semantically searchable Long-Term Semantic Memory (LTSM) for enduring knowledge across all interactions. We’ve equipped our AI with both a quick notepad and a sophisticated library of knowledge.

But here’s the crucial juncture: merely having these memories isn’t enough. The true genius, the “Aha!” moment in our Context Engineering journey, lies in how we intelligently combine and present the right pieces of information from these memories to the LLM at precisely the right time. This is the domain of dynamic context/prompt engineering, and it’s the heart of our AI’s ability to maintain genuinely contextual and personalised conversations.

Today, we are going to look at the SessionManager, the central orchestrator that ensures our LLM receives the perfect briefing for every single turn.

The Art of the Prompt: Beyond Just Asking

Remember from Part 1 that LLMs are stateless. Every single API call is a fresh start. For the AI to appear “smart” and “conversational,” we have to painstakingly provide all relevant context within each prompt. This isn’t just about asking a question; it’s about crafting an entire narrative and informational scaffold around that question.
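Statelessness is easiest to see in code. Here is a minimal, illustrative sketch (the `createConversation` helper and `buildRequest` shape are hypothetical stand-ins, not part of our actual system): because the model retains nothing between calls, the client must resend the entire history with every single request.

```javascript
// The LLM keeps no state between API calls, so the client must
// resend the whole conversation with every request.
function createConversation() {
  const history = [];
  return {
    // Each turn appends the user message, then sends the *entire*
    // history — not just the latest message — to the model.
    buildRequest(userMessage) {
      history.push({ role: 'user', content: userMessage });
      return { messages: [...history] }; // full context, every time
    },
    recordReply(content) {
      history.push({ role: 'assistant', content });
    },
  };
}

const convo = createConversation();
const first = convo.buildRequest('Hi, I am Alice.');
convo.recordReply('Hello Alice!');
const second = convo.buildRequest('What is my name?');

// The second request carries all three earlier messages — without
// them, the model has no way to answer "Alice".
console.log(second.messages.length); // 3
```

Every byte of context the model "remembers" is context we chose to put in that request. That is the job the SessionManager automates.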

The SessionManager is the dedicated prompt engineer. For every user message, before it even touches the main LLM, the SessionManager gets to work, meticulously assembling a rich, multi-layered prompt. Its goal: to give the LLM not just the user’s latest query, but also a carefully curated collection of background information, past interactions, and enduring knowledge.

The SessionManager’s Recipe: Building the Contextual Prompt

Let’s dissect the SessionManager’s getSession method. This is where the magic happens, where various ingredients are brought together to form the ultimate contextual prompt.

// session/sessionManager.js

const SYSTEM_PROMPT = `You are a helpful and friendly AI assistant. Your goal is to provide insightful and accurate information while maintaining a pleasant and engaging tone.`;

export class SessionManager {
  constructor(shortTermMemory, longTermMemory) {
    this.shortTermMemory = shortTermMemory;
    this.longTermMemory = longTermMemory;
  }

  /**
   * Assembles the full conversational context for the LLM.
   * This includes the system prompt, short-term history (messages + summaries),
   * and relevant long-term memories.
   * @param {string} conversationId - The ID of the current conversation.
   * @param {string} latestUserMessage - The user's most recent input.
   * @returns {Promise<{session: Session, isNewConversation: boolean}>}
   */
  async getSession(conversationId, latestUserMessage) {
    const session = new Session(conversationId); // Helper class to manage prompt history array

    // 1. Initialise with the foundational system prompt.
    session.addSystemMessage(SYSTEM_PROMPT);

    // 2. Load short-term history (summaries and recent messages).
    const summaries = await this.shortTermMemory.getSummaries(conversationId);
    summaries.forEach(s => session.addSummary(s.summary)); // Summaries are added as 'system' or 'assistant' roles for context

    const recentMessages = await this.shortTermMemory.getRecentMessages(conversationId);
    recentMessages.forEach(m => session.addMessage(m.role, m.content));

    // 3. Critically: Enhance the session with cross-conversation memories.
    await this.enhanceWithCrossConversationMemories(session, latestUserMessage);

    return { session, isNewConversation: recentMessages.length === 0 && summaries.length === 0 };
  }

  /**
   * Fetches and injects semantically relevant memories from the LTSM into the prompt.
   * @param {Session} session - The current session object.
   * @param {string} latestUserMessage - The user's most recent input, used for semantic search.
   */
  async enhanceWithCrossConversationMemories(session, latestUserMessage) {
    if (!latestUserMessage) return; // Only search if there's a new message to use as a query.

    // Use the latest user input as the query for semantic search in long-term memory.
    const relevantContext = await this.longTermMemory.getRelevantContext(latestUserMessage, 5);

    if (relevantContext.length > 0) {
      // Structure the retrieved context clearly for the LLM.
      const contextHeader = 'For additional context, here is some relevant information from our past conversations:';
      const contextContent = relevantContext.map(c => `- ${c.text}`).join('\n');
      
      // Inject this relevant long-term context *early* in the prompt, after the system prompt.
      // This is a strategic placement to ensure the LLM prioritises this crucial information.
      session.addCrossConversationContext(contextHeader, contextContent);
    }
  }
}
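The Session helper is referenced above but not shown. Here is a minimal sketch of what it might look like, assuming an OpenAI-style messages array. The method names match those used in getSession, but the internals, in particular the insertion-index bookkeeping behind addCrossConversationContext, are an illustration rather than the original implementation:

```javascript
// session/session.js — illustrative sketch of the Session helper.
export class Session {
  constructor(conversationId) {
    this.conversationId = conversationId;
    this.messages = [];
    // Index just past the system prompt(s). Cross-conversation
    // context gets spliced in here, ahead of the chat history.
    this.contextInsertionIndex = 0;
  }

  addSystemMessage(content) {
    this.messages.push({ role: 'system', content });
    this.contextInsertionIndex = this.messages.length;
  }

  // Summaries of older turns are presented as assistant-role context.
  addSummary(summary) {
    this.messages.push({
      role: 'assistant',
      content: `Summary of previous discussion: ${summary}`,
    });
  }

  addMessage(role, content) {
    this.messages.push({ role, content });
  }

  // Inject long-term context *after* the system prompt but *before*
  // the short-term history, so the LLM reads it early.
  addCrossConversationContext(header, content) {
    this.messages.splice(
      this.contextInsertionIndex,
      0,
      { role: 'system', content: header },
      { role: 'system', content }
    );
  }

  getMessages() {
    return this.messages;
  }
}
```

Note how the splice reproduces the ordering shown in the final prompt later in this post: system prompt first, then long-term context, then the running conversation.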

Deep Dive: The Layers of Contextual Prompting

Let’s break down the SessionManager’s approach to prompt construction, which is all about layered context engineering:

  1. The Foundation: System Prompt (session.addSystemMessage(SYSTEM_PROMPT))

    • This is the AI’s identity and core instructions. It’s the immutable “north star” that guides the AI’s persona and behaviour. Placing it first is critical because LLMs often give more weight to instructions at the beginning of the prompt. It sets the stage for every interaction.
  2. The Recent Past: Short-Term History (summaries & recentMessages)

    • This layer provides the immediate conversational flow. Importantly, we mix summaries (the token-efficient abstractions from older parts of the current conversation) with raw recentMessages. This maintains coherence without hitting the context window wall. The role field from our SQLite database is crucial here, ensuring the LLM correctly interprets who said what.
  3. The Enduring Past: Cross-Conversational Relevance (enhanceWithCrossConversationMemories)

    • This is the “Aha!” moment. The latestUserMessage isn’t just passed to the LLM; it’s first used as a semantic query into our LTSM. The SessionManager asks: “Given what the user just said, what permanent knowledge from any past conversation is most relevant right now?”
    • The longTermMemory.getRelevantContext() method, which we explored in Part 3, uses vector embeddings to retrieve semantically similar summaries and critical facts.
    • Strategic Injection: Notice session.addCrossConversationContext(contextHeader, contextContent);. This isn’t just appending to the end. The Session object is designed to inject this crucial information after the initial system prompt but before the immediate conversation history. This ensures the LLM processes this highly relevant background context early, allowing it to frame its understanding of the current dialogue with enduring knowledge. This prioritisation is a key aspect of advanced prompt engineering.
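As a refresher on the retrieval step from Part 3, the core of getRelevantContext is a similarity ranking over embeddings. The sketch below is illustrative only: the toy three-dimensional vectors stand in for real embedding-model output, which has hundreds of dimensions, and the function signature is simplified from the actual LTSM.

```javascript
// Score two embedding vectors by cosine similarity (1 = identical direction).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored memories against the query embedding and keep the top K.
function getRelevantContext(queryEmbedding, memories, topK) {
  return memories
    .map(m => ({ ...m, score: cosineSimilarity(queryEmbedding, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional embeddings for demonstration.
const memories = [
  { text: 'Project Chimera deadline is October 27th', embedding: [0.9, 0.1, 0.0] },
  { text: 'User prefers dark mode',                   embedding: [0.0, 0.2, 0.9] },
];
const query = [0.8, 0.2, 0.1]; // pretend embedding of "How's that project going?"
console.log(getRelevantContext(query, memories, 1)[0].text);
// → 'Project Chimera deadline is October 27th'
```

The project-related memory wins because its vector points in nearly the same direction as the query vector, which is exactly why a question that never mentions "Chimera" can still surface Chimera-related facts.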

The Final Prompt

Consider the rich prompt the LLM receives for a simple user query like "How's that project going?"

[
  { "role": "system", "content": "You are a helpful and friendly AI assistant. Your goal is to provide insightful and accurate information while maintaining a pleasant and engaging tone." },
  { "role": "system", "content": "For additional context, here is some relevant information from our past conversations:" },
  { "role": "system", "content": "- You previously discussed a 'Project Chimera' with user, which is a software development initiative." },
  { "role": "system", "content": "- User mentioned the 'Project Chimera' faces a deadline on October 27th." },
  { "role": "assistant", "content": "Summary of previous discussion: User expressed concerns about team resources for 'Project Chimera'." },
  { "role": "user", "content": "Last week: Any progress on that? My team is stretched." },
  { "role": "assistant", "content": "Last week: We've allocated more resources. The new deadline is tight but achievable." },
  { "role": "user", "content": "How's that project going?" }
]

Without the SessionManager’s intelligent construction, the LLM might only see the last few messages, leading to a generic “Which project?” response. With the SessionManager’s orchestrating prowess, the AI immediately knows the user is referring to “Project Chimera,” even if “Chimera” isn’t in the immediate chat history, and can respond with specific, highly relevant updates. This is the difference between a simple chatbot and a truly intelligent conversational agent.

This dynamic layering of context - system instructions, enduring knowledge, and immediate conversational flow - is the pinnacle of our Context Engineering efforts, allowing our AI to maintain deep understanding without breaking the token bank or its own concentration.

What’s Next? The Autonomous Evolution

Our AI now uses its memories. But how does that vast library of long-term knowledge get filled with those crucial summaries and critical facts in the first place? And how does the AI learn about the user without explicit instruction?

In our next instalment, we’ll peel back the curtain on the AI’s internal scholars: the autonomous background processors. These intelligent agents work tirelessly behind the scenes, continually enriching the AI’s knowledge base, turning raw conversation into structured, reusable wisdom. Prepare to see how the system continuously evolves itself to become even smarter and more personalised.