Ditching RAG: Building a Local MCP Server for Your Docs
Learn how to build a lightweight MCP server that provides direct access to local documentation files, offering a simpler alternative to complex RAG pipelines for project-specific context.
What You'll Learn
- How to create a local MCP server using FastMCP
- Debugging common MCP server deployment issues
Time & Difficulty
Time: 45 minutes
Level: Intermediate
What You'll Need
- Python 3.11+ installed
- A local markdown documentation file
Prerequisites
- Basic Python knowledge
- Understanding of command line operations
- Familiarity with JSON and configuration files
In the world of LLMs, Retrieval-Augmented Generation (RAG) is king. It’s the standard way to provide external knowledge to a model, usually involving vector databases, embedding models, and complex data pipelines. But what if you just want to give an LLM access to a single, local markdown file—like your project’s documentation?
Setting up a full RAG pipeline for this feels like using a sledgehammer to crack a nut. It’s complex, can be slow, and might involve API costs for embeddings.
This is where the Model Context Protocol (MCP) shines. We can build a simple, lightning-fast MCP server that directly reads a local file and exposes its contents as tools. It’s a powerful, lightweight alternative for project-specific context.
This post chronicles the journey of building such a server. We’ll walk through the initial idea, the errors we hit, the “aha!” moments, and the final, robust solution that works for both graphical clients like Claude Desktop and command-line tools like cline.
The Goal: A Simple Document Server
Our mission is to create an MCP server that can:
- Read a single markdown file, context.md.
- Parse it into sections based on top-level headers (# Title).
- Expose two tools to an LLM:
  - get_context_overview(): Lists all section titles.
  - search_context(query): Searches the content of all sections for a query.
This server will live inside an existing project (in my case, an AstroJS website), but the principles apply anywhere.
The First Attempt: The Low-Level Approach
My first instinct was to handle the MCP protocol manually. I wrote a Python script that listened on stdin, parsed JSON-RPC messages, and constructed JSON-RPC responses. It looked something like this (a simplified version of my first attempt):
# mcp_context_server_fixed.py - A flawed first attempt
import asyncio
import json
import sys
from typing import Optional

# ... a bunch of manual parsing and logic ...

class MCPContextServer:
    # ... lots of code to handle "initialize", "tools/list", "tools/call" ...

    async def handle_message(self, message: dict) -> Optional[dict]:
        # ... more manual JSON handling ...
        pass

    async def run(self):
        while True:
            line = sys.stdin.readline()
            if not line:  # EOF: the client closed the pipe
                break
            message = json.loads(line)
            response = await self.handle_message(message)
            if response is not None:  # notifications get no reply
                print(json.dumps(response), flush=True)

# ... main execution logic ...
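To give a feel for what “manual JSON handling” involves, here is a minimal sketch of a dispatcher for a single method, tools/list. It is not the code from my first attempt: the TOOLS list and the module-level handle_message function are hypothetical, and a real server would also need to answer initialize and tools/call.

```python
import json
from typing import Optional

# Hypothetical tool list; a real server would describe its own tools.
TOOLS = [
    {
        "name": "get_context_overview",
        "description": "List all section titles.",
        "inputSchema": {"type": "object", "properties": {}},
    },
]


def handle_message(message: dict) -> Optional[dict]:
    """Return a JSON-RPC 2.0 response, or None for notifications."""
    # Notifications carry no "id" and must not be answered.
    if "id" not in message:
        return None

    method = message.get("method")
    if method == "tools/list":
        return {
            "jsonrpc": "2.0",
            "id": message["id"],
            "result": {"tools": TOOLS},
        }

    # Anything else gets the standard JSON-RPC "method not found" error.
    return {
        "jsonrpc": "2.0",
        "id": message["id"],
        "error": {"code": -32601, "message": f"Method not found: {method}"},
    }


request = json.loads('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}')
response = handle_message(request)
print(json.dumps(response))
```

Even this toy version has to track request ids, tell requests apart from notifications, and produce well-formed errors, which is the boilerplate that grows quickly once more methods are added.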
I tried to install this in Claude Desktop and immediately hit my first wall.
Roadblock #1: The Dreaded spawn python ENOENT
After trying to install the server, the tool logs showed a cryptic but critical error:
[error] spawn python ENOENT {
  metadata: {
    context: 'connection',
    stack: 'Error: spawn python ENOENT...'
  }
}
Let’s break this down:
- spawn: The MCP client (Claude Desktop) was trying to start my script as a new process.
- python: The command it was trying to run was literally python.
- ENOENT: A standard system error code (“error, no entry”) whose message is “No such file or directory.”
The problem was that my system (macOS) doesn’t have a python command. The command is python3 or, even more specifically, python3.11. The MCP client was using the wrong command.
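A quick way to see the same failure from Python is to check which interpreter names resolve on the PATH and what happens when you spawn one that doesn’t. This is a small diagnostic sketch; the helper names find_interpreters and try_spawn are my own, and the list of candidate names is just the three mentioned above.

```python
import errno
import shutil
import subprocess


def find_interpreters(names=("python", "python3", "python3.11")):
    """Map each candidate command name to its resolved path, or None."""
    return {name: shutil.which(name) for name in names}


def try_spawn(command):
    """Try to start `command --version`; return the errno if it cannot be spawned."""
    try:
        subprocess.run([command, "--version"], capture_output=True, check=False)
    except FileNotFoundError as exc:
        # Same condition the client reported as "spawn python ENOENT".
        return exc.errno
    return None


for name, path in find_interpreters().items():
    print(f"{name}: {path or 'not found on PATH'}")

if try_spawn("python") == errno.ENOENT:
    print("Spawning 'python' fails with ENOENT on this machine.")
```

Any name that shows up as “not found on PATH” will produce the same ENOENT error if a client tries to launch it.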
My first misconception was thinking I could fix this with a manual mcp.json file. But mcp install doesn’t use that file; it generates its own configuration.
The Solution for Claude Desktop: The Shell Wrapper
The most robust way to solve this for GUI clients like Claude Desktop is to tell them exactly how to run the script using a shell wrapper.
- Create run_context_server.sh: A simple bash script that calls our Python script with the correct interpreter. The path to your Python version might vary. You can find it with which python3.11.
#!/bin/bash
# Executes the MCP server script with the correct Homebrew Python interpreter.
# This finds the directory the script itself is in, and then
# runs the python script from that same directory.
/opt/homebrew/bin/python3.11 "$(dirname "$0")/mcp_context_server.py"
- Make it executable:
chmod +x scripts/run_context_server.sh
- Install the wrapper script: Instead of installing the .py file, we install the .sh file.
mcp install scripts/run_context_server.sh --name "context-server"
This tells Claude Desktop, “Don’t guess. To run this tool, just execute this script.” Problem solved.
Refining the Server with FastMCP
While the wrapper fixed the execution, my manual server code was brittle and complex. The MCP Python SDK offers a much better way: FastMCP. It handles all the protocol boilerplate, letting you focus on your tool’s logic.
Here is the final, vastly improved server code using FastMCP:
# scripts/mcp_context_server.py
import sys
import re
from pathlib import Path
from typing import Any, Dict

try:
    from mcp.server.fastmcp import FastMCP
except ImportError:
    print(
        "FATAL ERROR: 'mcp' library not found.",
        file=sys.stderr
    )
    sys.exit(1)

# context.md lives one level above the scripts/ directory.
CONTEXT_FILE = Path(__file__).parent.parent / "context.md"

mcp = FastMCP(
    "context-server",
    description="A server to access and search project documentation."
)

class ContextParser:
    # ... (Full parsing logic elided here) ...
    # It reads the file, splits by headers, and provides search/overview methods.
    # We added print(..., file=sys.stderr) for logging, which is crucial for debugging.
    pass  # The full class is in the repository linked at the end of this post.

context_parser = ContextParser(CONTEXT_FILE)

@mcp.tool()
async def search_context(query: str) -> str:
    """Search through the documentation for specific topics or keywords."""
    return context_parser.search(query)

@mcp.tool()
async def get_context_overview() -> str:
    """List all available top-level sections in the documentation."""
    return context_parser.get_overview()

if __name__ == "__main__":
    print("Starting MCP context server...", file=sys.stderr)
    mcp.run()
This version is cleaner, more robust, and easier to maintain.
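Since the ContextParser body is elided in the listing, here is one way such a class could look. Treat it as a hypothetical stand-in rather than the actual implementation: the method names search and get_overview come from the tool functions, but the parsing rules, the return formats, and the choice to match titles as well as body text are my assumptions.

```python
import re
import sys
from pathlib import Path


class ContextParser:
    """Hypothetical stand-in for the elided parser; not the original code."""

    def __init__(self, path: Path):
        self.sections: dict[str, str] = {}
        try:
            text = path.read_text(encoding="utf-8")
        except OSError as exc:
            # Log to stderr so stdout stays free for the protocol.
            print(f"Could not read {path}: {exc}", file=sys.stderr)
            return
        self.sections = self.parse(text)

    @staticmethod
    def parse(text: str) -> dict[str, str]:
        """Split markdown into {title: body} on top-level '# ' headers."""
        sections: dict[str, str] = {}
        title = None
        body: list[str] = []
        for line in text.splitlines():
            match = re.match(r"^# (.+)$", line)
            if match:
                if title is not None:
                    sections[title] = "\n".join(body).strip()
                title, body = match.group(1).strip(), []
            elif title is not None:
                body.append(line)
        if title is not None:
            sections[title] = "\n".join(body).strip()
        return sections

    def get_overview(self) -> str:
        """Return the section titles as a bulleted list."""
        if not self.sections:
            return "No sections found."
        return "\n".join(f"- {title}" for title in self.sections)

    def search(self, query: str) -> str:
        """Return every section whose title or body contains the query."""
        needle = query.lower()
        hits = [
            f"# {title}\n{body}"
            for title, body in self.sections.items()
            if needle in title.lower() or needle in body.lower()
        ]
        return "\n\n".join(hits) if hits else f"No matches for '{query}'."
```

Two limitations of this sketch are worth knowing: text before the first header is dropped, and a line starting with “# ” inside a fenced code block (a shell comment, for example) would be mistaken for a header.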
Roadblock #2: A Simple Dependency Issue
With the new code, I ran mcp install again and hit a new error.
Error: typer is required. Install with 'pip install mcp[cli]'
This was a simple but important lesson. The base mcp package doesn’t include the dependencies for its command-line tools (like mcp install). To get them, you need to install the package with the [cli] extra.
The fix was to update requirements.txt:
Before:
mcp>=1.2.0
After:
mcp[cli]>=1.2.0
Then, re-installing with pip install -r requirements.txt solved it immediately.
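If you want to confirm which situation you are in before running the CLI, a small check like the following can help. It only looks for typer because that is the module named in the error message; the helper names are my own, and the [cli] extra may pull in other packages that this sketch does not check.

```python
import importlib.util


def has_module(name: str) -> bool:
    """True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def cli_extra_status() -> str:
    """Describe whether mcp and the typer dependency appear to be installed."""
    if not has_module("mcp"):
        return "mcp is not installed"
    if not has_module("typer"):
        return "mcp is installed, but typer is missing (install mcp[cli])"
    return "mcp and typer are both present"


print(cli_extra_status())
```

Run it with the same interpreter your wrapper script or client configuration uses, since a different Python may have a different set of installed packages.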
The “Aha!” Moment: One Server, Two Workflows
Everything worked for Claude Desktop. But I also use cline, a command-line tool that manages its servers with a single, large JSON file. How could I add my new server there?
This revealed the final piece of the puzzle: different MCP clients can have different configuration methods.
Workflow A: Claude Desktop (and mcp dev)
- How it works: Uses the mcp install command, which is designed to be simple and user-friendly.
- The key: The .sh wrapper script is the perfect solution here. It abstracts away the command details.
- For development: The mcp dev command is your best friend. It launches the server and a web-based “Inspector” for easy testing, all from your VS Code terminal.
mcp dev scripts/run_context_server.sh
Workflow B: cline and other CLI tools
- How it works: Relies on a manual JSON configuration file that explicitly defines how to run every server.
- The key: This is where you directly solve the ENOENT problem in configuration. You don’t need the .sh wrapper here.
I opened my cline configuration file and added a new entry for my server. This is where my initial mcp.json idea finally found its rightful home.
{
  "mcpServers": {
    "local-context": {
      "timeout": 60,
      "type": "stdio",
      "command": "/opt/homebrew/bin/python3.11",
      "args": [
        "/Users/louiserobertson/Documents/Code/unlockmcp-website/scripts/mcp_context_server.py"
      ],
      "env": {}
    }
  }
}
This configuration explicitly tells cline:
- Use this exact Python executable (/opt/homebrew/bin/python3.11).
- Pass the absolute path to the script as an argument.
With this addition, my server was now available in cline alongside all my other tools.
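Hand-typing absolute paths is error-prone, so one option is to generate the entry. The sketch below mirrors the field names from the configuration above and fills in the interpreter and script path automatically; build_server_entry is a hypothetical helper, and it assumes you run it from the project root with the interpreter you want the client to use.

```python
import json
import sys
from pathlib import Path


def build_server_entry(script: Path, python: str = sys.executable) -> dict:
    """Build one server entry with an explicit interpreter and absolute script path."""
    return {
        "timeout": 60,
        "type": "stdio",
        "command": python,                # exact interpreter, never a bare "python"
        "args": [str(script.resolve())],  # absolute path to the server script
        "env": {},
    }


config = {
    "mcpServers": {
        "local-context": build_server_entry(Path("scripts/mcp_context_server.py")),
    }
}
print(json.dumps(config, indent=2))
```

The printed JSON can then be merged by hand into the existing configuration file; the sketch deliberately does not write to that file itself.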
Final Thoughts
Building this simple server was an incredibly insightful exercise. What started as a “simple RAG alternative” became a deep dive into the practical realities of developing and deploying MCP tools.
The key takeaways are:
- Anticipate the ENOENT error: Always be explicit about which Python interpreter to use, either with a wrapper script (mcp install) or direct configuration (cline).
- Use FastMCP: Don’t reinvent the wheel. The high-level SDK is robust and simplifies development immensely.
- Log to stderr: When your server runs headless, print(..., file=sys.stderr) is your lifeline for debugging.
- Know your client: The way you configure a tool for Claude Desktop can be different from how you configure it for a command-line utility.
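On the stderr point: with a stdio server, stdout carries the JSON-RPC messages, so any stray print to stdout can corrupt the stream. If bare print calls start to feel ad hoc, the standard logging module can be pointed at stderr instead. This is a minimal sketch; the make_logger helper and the logger name are my own choices, not part of the server code.

```python
import logging
import sys


def make_logger(name: str = "context-server") -> logging.Logger:
    """Return a logger that writes only to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages away from other handlers
    return logger


log = make_logger()
log.info("Loaded %d sections", 3)
```

The client’s tool logs should then show timestamped lines, which makes it easier to see where a headless server stopped.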
In the end, we have a fantastic retrieval tool that needs no vector database or embedding model, and for this use case it’s faster and simpler than a full RAG pipeline. It’s a testament to the power and flexibility of the Model Context Protocol.
Get the Complete Code
The full implementation of this MCP docs server is available on GitHub: unlock-mcp/mcp-docs-server
The repository includes:
- Complete server implementation with FastMCP
- Shell wrapper scripts for cross-platform compatibility
- Example documentation file
- Installation and configuration instructions
- Troubleshooting guide
Clone it and start using it with your own documentation right away!