Filesystem as Context: Building an AI Detective with bash-tool
Instead of stuffing documents into prompts, give your AI agent a filesystem and let it retrieve its own context. Here's how, using a murder mystery detective as the demo.
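For a flavour of the approach, here is a minimal sketch of the kind of read-only bash tool such an agent might be given. The `case-files` directory layout, the command whitelist, and the `bashTool` name are assumptions for illustration, not the article's actual implementation.

```ts
// Minimal sketch: a read-only "bash" tool an agent loop can call to pull
// its own context from disk instead of receiving it all in the prompt.
// The case-file layout (./case-files) and the tool name are assumptions.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const exec = promisify(execFile);

// Whitelist of commands the detective agent may run against the case files.
const ALLOWED = new Set(["ls", "cat", "grep", "head", "wc"]);

export async function bashTool(command: string, args: string[]): Promise<string> {
  if (!ALLOWED.has(command)) {
    return `Error: command "${command}" is not allowed.`;
  }
  try {
    const { stdout } = await exec(command, args, {
      cwd: "./case-files",   // the agent only ever sees this directory
      timeout: 5_000,        // keep runaway commands from stalling the loop
      maxBuffer: 64 * 1024,  // cap output so a huge file cannot blow the context
    });
    return stdout || "(no output)";
  } catch (err) {
    return `Error: ${(err as Error).message}`;
  }
}

// Example: the model asks to inspect a witness statement.
// await bashTool("cat", ["witnesses/groundskeeper.txt"]);
```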
This article concludes the series by showing how a deliberately designed CLI becomes a powerful interaction layer, giving users precise control over an AI system’s conversational context, short-term memory, and long-term semantic knowledge.
This piece introduces background processors as autonomous AI agents that summarise conversations and extract critical facts to continuously enrich Long-Term Semantic Memory. By running asynchronously and optimising token usage, these processors enable a self-improving, increasingly personalised AI system that learns from every interaction.
In this article, we show how dynamic prompt engineering, via a `SessionManager` that intelligently layers short-term context, long-term semantic memory, and system instructions, turns stateless LLM calls into genuinely contextual and personalised conversations.
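As a rough illustration of that layering, here is a sketch of a prompt builder that stacks system instructions, retrieved long-term facts, and as many recent turns as a token budget allows. The function name and the chars-per-token estimate are assumptions, not the article's actual `SessionManager`.

```ts
// Minimal sketch: system instructions first, then retrieved long-term facts,
// then as many recent turns as the budget allows, newest turns kept first.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Rough token estimate (about 4 characters per token); an assumption.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

export function buildPrompt(
  systemInstructions: string,
  longTermFacts: string[],
  recentTurns: Message[],
  budget = 4000
): Message[] {
  const messages: Message[] = [
    { role: "system", content: systemInstructions },
    { role: "system", content: `Known facts about the user:\n${longTermFacts.join("\n")}` },
  ];
  let used = messages.reduce((n, m) => n + estimateTokens(m.content), 0);

  // Walk the history backwards so the newest turns always make the cut.
  const kept: Message[] = [];
  for (let i = recentTurns.length - 1; i >= 0; i--) {
    const cost = estimateTokens(recentTurns[i].content);
    if (used + cost > budget) break;
    kept.unshift(recentTurns[i]);
    used += cost;
  }
  return [...messages, ...kept];
}
```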
In this article, we explain how Long-Term Semantic Memory uses vector embeddings and semantic search to give AI meaningful, persistent memory across conversations.
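The retrieval step boils down to embedding the query and ranking stored memories by cosine similarity. A minimal sketch, assuming an `embed()` function supplied by whatever embedding API is in use and an in-memory store:

```ts
// Minimal sketch of semantic recall over an in-memory vector store.
type MemoryEntry = { text: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

export async function recall(
  query: string,
  store: MemoryEntry[],
  embed: (text: string) => Promise<number[]>, // assumed embedding function
  topK = 3
): Promise<string[]> {
  const queryVec = await embed(query);
  return store
    .map((entry) => ({ entry, score: cosineSimilarity(queryVec, entry.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ entry }) => entry.text);
}
```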
Learn how to build multi-agent systems with vector search, tool orchestration, and semantic understanding using Google's Agent Development Kit (JS/TS version).
In this article, we explore how Short-Term Conversational Memory creates the illusion of memory in otherwise stateless LLMs through careful context persistence and structured prompt reconstruction. We also show how token limits, cost, and context degradation are managed using asynchronous, AI-driven summarisation that preserves meaning while keeping conversations efficient and coherent.
This article explains why LLMs often “forget” earlier messages and why naive full-history prompting is costly and inefficient. It introduces a dual-memory architecture: a short-term store for immediate conversation flow and a long-term semantic store for durable knowledge across sessions. Together, these systems let an AI maintain coherent dialogue without overloading the model’s context window or budget.
A practical look at why hallucinations occur in modern language models, why current evaluation methods make them persist, and how to detect them using Natural Language Inference in JavaScript.
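The detection idea can be sketched structurally: treat the source passage as the premise and each generated claim as a hypothesis, then flag claims with a low entailment score. The `scoreEntailment` function below is a hypothetical stand-in for any NLI model, not a specific library call.

```ts
// Structural sketch only: flag model claims the source text does not entail.
// scoreEntailment is a hypothetical interface over an NLI model that returns
// an entailment probability between 0 and 1.
type EntailmentScorer = (premise: string, hypothesis: string) => Promise<number>;

export async function findUnsupportedClaims(
  source: string,
  claims: string[],
  scoreEntailment: EntailmentScorer,
  threshold = 0.5
): Promise<string[]> {
  const unsupported: string[] = [];
  for (const claim of claims) {
    const score = await scoreEntailment(source, claim);
    if (score < threshold) unsupported.push(claim); // likely unsupported by the source
  }
  return unsupported;
}
```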
This article explores how clearly defined functions enable large language models to make accurate tool calls, emphasising the importance of precision and developer intent in the function calling process.
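To make the point concrete, here is an invented tool definition in the JSON-Schema style most function-calling APIs accept; the name, description, and parameters exist only for illustration.

```ts
// Sketch of a precisely described tool definition. The tighter the
// description and parameter schema, the less the model has to guess.
const getSuspectAlibi = {
  type: "function",
  function: {
    name: "get_suspect_alibi",
    description:
      "Return the recorded alibi for a named suspect between two timestamps. " +
      "Use only when the user asks where a suspect was at a specific time.",
    parameters: {
      type: "object",
      properties: {
        suspect: { type: "string", description: "Full name of the suspect" },
        from: { type: "string", description: "Start of the window, ISO 8601" },
        to: { type: "string", description: "End of the window, ISO 8601" },
      },
      required: ["suspect", "from", "to"],
    },
  },
} as const;
```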
In this article, we walk through the process of building a Model Context Protocol (MCP) client. Learn how to connect to servers, discover tools, and invoke them from your own app or LLM integration.
Model Context Protocol (MCP) servers expose tools, resources, and prompts to LLMs in a unified, structured way. This post explores how they work, how to build one, and why they are a critical part of the future AI stack.
A hands-on guide to training a simple AI model with TensorFlow.js to inpaint missing parts of images, without needing large datasets or prior machine learning experience.
A hands-on walkthrough for web developers to demystify large language models by actually building a mini Transformer from scratch.
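At the heart of that build is scaled dot-product attention; a minimal single-head version over plain arrays might look like this.

```ts
// Minimal single-head scaled dot-product attention over plain arrays,
// the core operation a mini Transformer is built around.
type Matrix = number[][]; // rows = tokens, cols = feature dimensions

function matMul(a: Matrix, b: Matrix): Matrix {
  return a.map((row) =>
    b[0].map((_, j) => row.reduce((sum, v, k) => sum + v * b[k][j], 0))
  );
}

function transpose(m: Matrix): Matrix {
  return m[0].map((_, j) => m.map((row) => row[j]));
}

function softmax(row: number[]): number[] {
  const max = Math.max(...row);
  const exps = row.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

export function attention(Q: Matrix, K: Matrix, V: Matrix): Matrix {
  const dK = K[0].length;
  // scores[i][j] = how much token i attends to token j
  const scores = matMul(Q, transpose(K)).map((row) =>
    softmax(row.map((v) => v / Math.sqrt(dK)))
  );
  return matMul(scores, V);
}
```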