Difficulty: 3/5
Published: 1/17/2025
By: UnlockMCP Team

Ditching RAG: Building a Local MCP Server for Your Docs

Learn how to build a lightweight MCP server that provides direct access to local documentation files, offering a simpler alternative to complex RAG pipelines for project-specific context.

What You'll Learn

  • How to create a local MCP server using FastMCP
  • Debugging common MCP server deployment issues

Time & Difficulty

Time: 45 minutes

Level: Intermediate

What You'll Need

  • Python 3.11+ installed
  • A local markdown documentation file

Prerequisites

  • Basic Python knowledge
  • Understanding of command line operations
  • Familiarity with JSON and configuration files

Tags: mcp, documentation, local-server, rag-alternative, python, fastmcp

In the world of LLMs, Retrieval-Augmented Generation (RAG) is king. It’s the standard way to provide external knowledge to a model, usually involving vector databases, embedding models, and complex data pipelines. But what if you just want to give an LLM access to a single, local markdown file—like your project’s documentation?

Setting up a full RAG pipeline for this feels like using a sledgehammer to crack a nut. It’s complex, can be slow, and might involve API costs for embeddings.

This is where the Model Context Protocol (MCP) shines. We can build a simple, lightning-fast MCP server that directly reads a local file and exposes its contents as tools. It’s a powerful, lightweight alternative for project-specific context.

This post chronicles the journey of building such a server. We’ll walk through the initial idea, the errors we hit, the “aha!” moments, and the final, robust solution that works for both desktop clients like Claude Desktop and editor-based agents like cline.

The Goal: A Simple Document Server

Our mission is to create an MCP server that can:

  1. Read a single markdown file, context.md.
  2. Parse it into sections based on top-level headers (# Title).
  3. Expose two tools to an LLM:
    • get_context_overview(): Lists all section titles.
    • search_context(query): Searches the content of all sections for a query.

This server will live inside an existing project (in my case, an AstroJS website), but the principles apply anywhere.

The First Attempt: The Low-Level Approach

My first instinct was to handle the MCP protocol manually. I wrote a Python script that listened on stdin, parsed JSON-RPC messages, and constructed JSON-RPC responses. It looked something like this (this is a simplified version of my first attempt):

# mcp_context_server_fixed.py - A flawed first attempt

import asyncio
import json
import sys
from typing import Optional

# ... a bunch of manual parsing and logic ...

class MCPContextServer:
    # ... lots of code to handle "initialize", "tools/list", "tools/call" ...
    async def handle_message(self, message: dict) -> Optional[dict]:
        # ... more manual JSON handling ...
        pass

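    async def run(self):
        while True:
            line = sys.stdin.readline()
            if not line:  # EOF: the client closed the pipe
                break
            message = json.loads(line)
            response = await self.handle_message(message)
            if response is not None:  # notifications get no reply
                print(json.dumps(response), flush=True)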

# ... main execution logic ...
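
To give a sense of what those elided handlers involve, here is a minimal sketch of answering a single tools/list request by hand. The handle_message function and the TOOLS table below are illustrative stand-ins, not the original code; the response shape follows the JSON-RPC 2.0 envelope that MCP uses.

```python
import json
from typing import Any, Optional

# Hypothetical tool table for illustration; one entry is enough to show the shape.
TOOLS = [
    {
        "name": "get_context_overview",
        "description": "List all section titles.",
        "inputSchema": {"type": "object", "properties": {}},
    }
]


def handle_message(message: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Answer one JSON-RPC message; return None for notifications."""
    if "id" not in message:
        return None  # notifications carry no id and expect no reply
    if message.get("method") == "tools/list":
        return {"jsonrpc": "2.0", "id": message["id"], "result": {"tools": TOOLS}}
    # -32601 is the JSON-RPC "Method not found" error code.
    return {
        "jsonrpc": "2.0",
        "id": message["id"],
        "error": {"code": -32601, "message": "Method not found"},
    }


request = json.loads('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}')
print(json.dumps(handle_message(request)))
```

The full first attempt had to repeat this kind of bookkeeping for every method, which is exactly the boilerplate that made it brittle.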

I tried to install this in Claude Desktop and immediately hit my first wall.

Roadblock #1: The Dreaded spawn python ENOENT

After trying to install the server, the tool logs showed a cryptic but critical error:

[error] spawn python ENOENT {
  metadata: {
    context: 'connection',
    stack: 'Error: spawn python ENOENT...'
  }
}

Let’s break this down:

  • spawn: The MCP client (Claude Desktop) was trying to start my script as a new process.
  • python: The command it was trying to run was literally python.
  • ENOENT: A standard POSIX error code (short for “Error NO ENTry”) whose message is “No such file or directory.”

The problem was that my system (macOS) doesn’t have a python command. The command is python3 or, even more specifically, python3.11. The MCP client was using the wrong command.

My first misconception was thinking I could fix this with a manual mcp.json file. But mcp install doesn’t use that file; it generates its own configuration.

The Solution for Claude Desktop: The Shell Wrapper

The most robust way to solve this for GUI clients like Claude Desktop is to tell them exactly how to run the script using a shell wrapper.

  1. Create run_context_server.sh: A simple bash script that calls our Python script with the correct interpreter. The path to your Python version might vary. You can find it with which python3.11.

#!/bin/bash
# Executes the MCP server script with the correct Homebrew Python interpreter.

# This finds the directory the script itself is in, and then
# runs the python script from that same directory.
/opt/homebrew/bin/python3.11 "$(dirname "$0")/mcp_context_server.py"

  2. Make it executable:

chmod +x scripts/run_context_server.sh

  3. Install the wrapper script: Instead of installing the .py file, we install the .sh file.

mcp install scripts/run_context_server.sh --name "context-server"

This tells Claude Desktop, “Don’t guess. To run this tool, just execute this script.” Problem solved.

Refining the Server with FastMCP

While the wrapper fixed the execution, my manual server code was brittle and complex. The MCP Python SDK offers a much better way: FastMCP. It handles all the protocol boilerplate, letting you focus on your tool’s logic.

Here is the final, vastly improved server code using FastMCP:

# scripts/mcp_context_server.py

import sys
import re
from pathlib import Path
from typing import Any, Dict

try:
    from mcp.server.fastmcp import FastMCP
except ImportError:
    print(
        "FATAL ERROR: 'mcp' library not found.",
        file=sys.stderr
    )
    sys.exit(1)

CONTEXT_FILE = Path(__file__).parent.parent / "context.md"

mcp = FastMCP(
    "context-server",
    # FastMCP takes server-level guidance through its `instructions` argument.
    instructions="A server to access and search project documentation."
)

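class ContextParser:
    """Minimal sketch of the parser; the full version is in the repository."""

    def __init__(self, path: Path) -> None:
        self.sections: dict[str, str] = {}
        try:
            text = path.read_text(encoding="utf-8")
        except OSError as exc:
            # stdout carries the protocol stream, so log to stderr only.
            print(f"Could not read {path}: {exc}", file=sys.stderr)
            return
        # Split on top-level headers ("# Title"); "##" and deeper stay in the body.
        parts = re.split(r"^# +(.+)$", text, flags=re.MULTILINE)
        # parts = [preamble, title1, body1, title2, body2, ...]
        for title, body in zip(parts[1::2], parts[2::2]):
            self.sections[title.strip()] = body.strip()
        print(f"Loaded {len(self.sections)} sections", file=sys.stderr)

    def get_overview(self) -> str:
        if not self.sections:
            return "No sections found."
        return "\n".join(f"- {title}" for title in self.sections)

    def search(self, query: str) -> str:
        needle = query.lower()
        hits = [
            f"# {title}\n{body}"
            for title, body in self.sections.items()
            if needle in title.lower() or needle in body.lower()
        ]
        return "\n\n".join(hits) if hits else f"No matches for '{query}'."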

context_parser = ContextParser(CONTEXT_FILE)

@mcp.tool()
async def search_context(query: str) -> str:
    """Search through the documentation for specific topics or keywords."""
    return context_parser.search(query)

@mcp.tool()
async def get_context_overview() -> str:
    """List all available top-level sections in the documentation."""
    return context_parser.get_overview()

if __name__ == "__main__":
    print("Starting MCP context server...", file=sys.stderr)
    mcp.run()

This version is cleaner, more robust, and easier to maintain.
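
One detail worth spelling out: with the stdio transport, stdout carries the JSON-RPC messages, so anything else printed there can corrupt the stream. The server above sidesteps this with print(..., file=sys.stderr). If you prefer the logging module, a sketch along these lines keeps every record on stderr. The make_logger helper and the logger name are my own choices for illustration, not part of the MCP SDK.

```python
import logging
import sys


def make_logger(name: str = "context-server") -> logging.Logger:
    """Return a logger whose only handler writes to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from any root handler
    return logger


log = make_logger()
log.info("Loaded context file")
```

Clients such as Claude Desktop typically collect a server's stderr in their logs, which is where these lines should show up.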

Roadblock #2: A Simple Dependency Issue

With the new code, I ran mcp install again and hit a new error.

Error: typer is required. Install with 'pip install mcp[cli]'

This was a simple but important lesson. The base mcp package doesn’t include the dependencies for its command-line tools (like mcp install). To get them, you need to install the package with the [cli] extra.

The fix was to update requirements.txt:

Before:

mcp>=1.2.0

After:

mcp[cli]>=1.2.0

Then, re-installing with pip install -r requirements.txt solved it immediately.
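
To catch this kind of gap before the CLI complains, you can ask Python whether the relevant modules are importable in the active environment. This is only a sketch: missing_modules is a hypothetical helper, and the module list simply mirrors the error above (typer being the one the CLI reported as missing).

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "mcp" is the SDK itself; "typer" is what the CLI reported as missing.
missing = missing_modules(["mcp", "typer"])
if missing:
    print(f"Missing: {', '.join(missing)}. Try: pip install 'mcp[cli]'")
else:
    print("mcp and its CLI dependency are importable.")
```

Run it with the same interpreter your wrapper script or client configuration points at, since each interpreter has its own set of installed packages.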

The “Aha!” Moment: One Server, Two Workflows

Everything worked for Claude Desktop. But I also use cline, a coding agent that runs as a VS Code extension and keeps all of its MCP servers in a single, large JSON file. How could I add my new server there?

This revealed the final piece of the puzzle: different MCP clients can have different configuration methods.

Workflow A: Claude Desktop (and mcp dev)

  • How it works: Uses the mcp install command, which is designed to be simple and user-friendly.
  • The key: The .sh wrapper script is the perfect solution here. It abstracts away the command details.
  • For development: The mcp dev command is your best friend. It launches the server and a web-based “Inspector” for easy testing, all from your VS Code terminal.
mcp dev scripts/run_context_server.sh

Workflow B: cline and other JSON-configured clients

  • How it works: Relies on a manual JSON configuration file that explicitly defines how to run every server.
  • The key: This is where you directly solve the ENOENT problem in configuration. You don’t need the .sh wrapper here.

I opened my cline configuration file and added a new entry for my server. This is where my initial mcp.json idea finally found its rightful home.

{
  "mcpServers": {
    "local-context": {
      "timeout": 60,
      "type": "stdio",
      "command": "/opt/homebrew/bin/python3.11",
      "args": [
        "/Users/louiserobertson/Documents/Code/unlockmcp-website/scripts/mcp_context_server.py"
      ],
      "env": {}
    }
  }
}

This configuration explicitly tells cline:

  1. Use this exact Python executable (/opt/homebrew/bin/python3.11).
  2. Pass the absolute path to the script as an argument.

With this addition, my server was now available in cline alongside all my other tools.
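
Hand-typing absolute paths is error-prone, so here is a small sketch that generates an entry of the same shape. It assumes you run it from the project root with the interpreter that has mcp installed; build_server_entry is a hypothetical helper, and the field values are copied from the configuration above rather than from any official schema.

```python
import json
import sys
from pathlib import Path


def build_server_entry(script: Path, python: str = sys.executable) -> dict:
    """Build one "mcpServers" entry with an absolute interpreter and script path."""
    return {
        "timeout": 60,
        "type": "stdio",
        "command": python,  # absolute interpreter path, so no PATH lookup is needed
        "args": [str(script.resolve())],
        "env": {},
    }


config = {
    "mcpServers": {
        "local-context": build_server_entry(Path("scripts/mcp_context_server.py"))
    }
}
print(json.dumps(config, indent=2))
```

Because sys.executable is the interpreter running the script, the generated command points at an environment you have just confirmed exists. Paste the printed entry into the client's configuration file.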

Final Thoughts

Building this simple server was an incredibly insightful exercise. What started as a “simple RAG alternative” became a deep dive into the practical realities of developing and deploying MCP tools.

The key takeaways are:

  1. Anticipate the ENOENT error: Always be explicit about which Python interpreter to use, either with a wrapper script (mcp install) or direct configuration (cline).
  2. Use FastMCP: Don’t reinvent the wheel. The high-level SDK is robust and simplifies development immensely.
  3. Log to stderr: When your server runs headless, print(..., file=sys.stderr) is your lifeline for debugging.
  4. Know your client: The way you configure a tool for Claude Desktop can be different from how you configure it for a command-line utility.

In the end, we have a lightweight retrieval tool whose only dependency is the mcp package, with no vector database, embedding model, or indexing step to maintain. For a single local file, that is simpler than a RAG pipeline, and it’s a testament to the flexibility of the Model Context Protocol.

Get the Complete Code

The full implementation of this MCP docs server is available on GitHub: unlock-mcp/mcp-docs-server

The repository includes:

  • Complete server implementation with FastMCP
  • Shell wrapper scripts for cross-platform compatibility
  • Example documentation file
  • Installation and configuration instructions
  • Troubleshooting guide

Clone it and start using it with your own documentation right away!
