Writing

Notes on AI, agents, web technologies, and building with LLMs

Technical articles on AI integration, agents, agent harnesses, MCP, web technologies, and emerging tools. Working notes from a Google Developer Expert with 25+ years of building and teaching.

  1. Why How You Split Your Documents Matters More Than You Think

    Before you reach for a more powerful embedding model or a larger context window, look at what you're actually feeding into a RAG pipeline. Sometimes the highest-leverage improvement isn't a better model; it's a better split.

  2. Filesystem as Context: Building an AI Detective with bash-tool

    Instead of stuffing documents into prompts, give your AI agent a filesystem and let it retrieve its own context. Here's how, using a murder mystery detective as the demo.

  3. The Mind's Eye - Engineering a CLI for Intelligent AI Interaction

    This article concludes the series by showing how a deliberately designed CLI becomes a powerful interaction layer, giving users precise control over an AI system's conversational context, short-term memory, and long-term semantic knowledge.

  4. The Autonomous Brain - Engineering AI for Continuous Learning and Memory Enrichment

    This piece introduces background processors as autonomous AI agents that summarise conversations and extract critical facts to continuously enrich Long-Term Semantic Memory. By running asynchronously and optimising token usage, these processors enable a self-improving, increasingly personalised AI system that learns from every interaction.

  5. The 'Aha!' Moment - Engineering the Perfect Prompt for Truly Contextual AI

    In this article, we show how dynamic prompt engineering, via a `SessionManager` that intelligently layers short-term context, long-term semantic memory, and system instructions, turns stateless LLM calls into genuinely contextual and personalised conversations.

  6. Semantic Horizons - Engineering an AI's Enduring Long-Term Memory

    In this article, we explain how Long-Term Semantic Memory uses vector embeddings and semantic search to give AI meaningful, persistent memory across conversations.

  7. Building AI Agents with Google ADK: A Practical Guide

    Learn how to build multi-agent systems with vector search, tool orchestration, and semantic understanding using Google's Agent Development Kit (JS/TS version).

  8. The Token Economy - Engineering an AI's Working Memory

    In this article, we explore how Short-Term Conversational Memory creates the illusion of memory in otherwise stateless LLMs through careful context persistence and structured prompt reconstruction. We also show how token limits, cost, and context degradation are managed using asynchronous, AI-driven summarisation that preserves meaning while keeping conversations efficient and coherent.

  9. The Grand Orchestration - Engineering a Dual-Memory AI for Enduring Conversations

    This article explains why LLMs often forget earlier messages and how naive full-history prompting is costly and inefficient. It introduces a dual-memory architecture: a short-term store for immediate conversation flow and a long-term semantic store for durable knowledge across sessions. Together, these systems let an AI maintain coherent dialogue without overloading the context window or budget.

  10. MCP Workshop Answers for DevFest Taipei 2025

    These are the answers to the questions asked via Slido during my workshop on MCP at DevFest Taipei 2025.