Deploy an MCP Server: Docker and the Registry (Part 8)

Eight parts ago this was an empty folder. Now it is a server with well-designed tools, resources, and prompts, a web transport, OAuth in front, and a test suite behind. The last mile is making it exist for other people: configuration that survives environments, a container that builds the same way every time, a deployment that keeps SSE streams alive behind a proxy, a listing on the MCP registry so hosts can discover it, and connection instructions for every kind of client, including calling it straight from the Claude API. This part ships the notes server and closes the series.

What you will learn in Part 8

Twelve-factor configuration for MCP servers with pydantic-settings
A clean multi-stage Dockerfile built around uv
Proxy and scaling rules that keep Streamable HTTP happy
Publishing to the MCP registry with server.json
Connecting Claude Code, Claude Desktop, and the Claude API to your server

1. Configuration before container

Everything that varies between your laptop and production must come from the environment, and our server has accumulated exactly the usual suspects: where state lives, who issues tokens, what to bind. The pattern is the same typed settings object our FastAPI series built in the production part: one Pydantic settings class, validated at startup, failing loudly when something is missing instead of quietly defaulting in production.

"""config.py: every environment difference, typed and in one place."""
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # Storage: the JSON file grew up into a database URL
    database_url: str = "sqlite:///notes.db"

    # Transport
    host: str = "127.0.0.1"
    port: int = 8000

    # Auth (Part 6)
    auth_issuer: str = ""
    auth_resource: str = ""

    model_config = {"env_prefix": "NOTES_"}


settings = Settings()
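The defaults above are convenient on a laptop, but the empty auth defaults mean a production container could start without an issuer configured. A minimal standard-library sketch of a startup check follows, assuming the NOTES_ prefix from the settings class. The names REQUIRED_IN_PRODUCTION, missing_settings, and check_or_exit are hypothetical and not part of pydantic-settings.

```python
import os
from collections.abc import Mapping

# Hypothetical list: variables that must be set before serving remote traffic.
REQUIRED_IN_PRODUCTION = (
    "NOTES_DATABASE_URL",
    "NOTES_AUTH_ISSUER",
    "NOTES_AUTH_RESOURCE",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_IN_PRODUCTION if not env.get(name, "").strip()]


def check_or_exit(env: Mapping[str, str] = os.environ) -> None:
    """Stop startup with a readable message instead of running half-configured."""
    missing = missing_settings(env)
    if missing:
        raise SystemExit(f"Missing required settings: {', '.join(missing)}")
```

Calling check_or_exit() before the app object is created would turn a forgotten variable into an immediate, named failure rather than a server that accepts traffic with auth disabled.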

Note the storage line. notes.json earned its keep across seven parts of examples, but shared deployment is where it retires: concurrent writers and a JSON file are a corruption lottery. Swapping the two helper functions from Part 2 for SQLite or Postgres takes about twenty minutes precisely because every tool, resource, and test goes through those helpers, and the protocol tests from Part 7 will confirm the swap changed nothing visible.
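As a rough illustration of that swap, here is a sketch of two SQLite-backed helpers using only the standard library. The helper names (load_notes, save_note), the title and body fields, and the table layout are assumptions for illustration; the real helpers from Part 2 may be shaped differently. The sketch takes a plain file path rather than the sqlite:/// URL from the settings class.

```python
import sqlite3


def connect(path: str = "notes.db") -> sqlite3.Connection:
    """Open the database and create the notes table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, "
        "body TEXT NOT NULL)"
    )
    return conn


def save_note(conn: sqlite3.Connection, title: str, body: str) -> int:
    """Insert one note and return its new id."""
    with conn:  # commits on success, rolls back if the insert raises
        cursor = conn.execute(
            "INSERT INTO notes (title, body) VALUES (?, ?)", (title, body)
        )
    return cursor.lastrowid


def load_notes(conn: sqlite3.Connection) -> list[dict]:
    """Return every note as a plain dict, oldest first."""
    rows = conn.execute("SELECT id, title, body FROM notes ORDER BY id").fetchall()
    return [{"id": r[0], "title": r[1], "body": r[2]} for r in rows]
```

Passing ":memory:" to connect gives tests a throwaway database, which keeps the Part 7 suite fast while exercising the same code path as production.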

2. The Dockerfile

The container story is standard modern Python with uv doing the heavy lifting: a build stage that installs the locked environment, a slim runtime stage that copies it in, a non-root user, and the frozen flag so an out-of-date lockfile fails the build instead of drifting silently. This mirrors the approach from our Docker guide, tuned for uv projects.

FROM ghcr.io/astral-sh/uv:python3.14-bookworm-slim AS build
WORKDIR /app
ENV UV_COMPILE_BYTECODE=1 UV_LINK_MODE=copy
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-install-project --no-dev
COPY . .
RUN uv sync --frozen --no-dev

FROM python:3.14-slim-bookworm
WORKDIR /app
RUN useradd --create-home appuser
COPY --from=build --chown=appuser:appuser /app /app
USER appuser
ENV PATH="/app/.venv/bin:$PATH" \
    NOTES_HOST=0.0.0.0 NOTES_PORT=8000
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Two intentional choices: binding 0.0.0.0 is finally correct here because the container boundary, not the bind address, is the wall, and the entry point runs the FastAPI app from Part 5 so the health endpoint and any other routes ship along with the MCP mount. Build it, run it, and smoke it with the Inspector CLI from Part 7 before anything fancier: docker build, docker run with your env file, then one tools/list against the mapped port.

3. Deployment: keep the streams alive

Streamable HTTP is ordinary HTTP plus long-lived SSE responses, and that plus is what proxies love to break. Three rules cover nearly every platform. First, disable response buffering on the MCP path, or progress notifications will arrive in one useless lump after the call finishes. Second, raise read timeouts well past your longest tool call, because a proxy that cuts an SSE stream mid-call produces the least debuggable symptom in this series. Third, decide your scaling story honestly: stateful sessions need sticky routing or a single instance, while stateless mode (stateless_http=True from Part 5) scales flat at the cost of server-initiated notifications.

location /mcp/ {
    proxy_pass http://127.0.0.1:8000;
    proxy_http_version 1.1;

    # SSE essentials
    proxy_buffering off;
    proxy_cache off;
    proxy_read_timeout 300s;

    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
}
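One way to catch buffering in a smoke test is to record when each progress notification reaches the client, then check whether the arrivals are spread across the call or bunched at the end. The following is a rough heuristic sketch, not part of any MCP SDK: the function name and the 10 percent tail threshold are assumptions you would tune for your own tools.

```python
def arrived_in_one_lump(
    arrival_times: list[float],
    call_started: float,
    call_finished: float,
    tail_fraction: float = 0.1,
) -> bool:
    """Return True when every event landed in the final slice of the call.

    All times are seconds on the same clock, for example time.monotonic().
    """
    if len(arrival_times) < 2:
        return False  # too few events to tell buffering from a quiet tool
    duration = call_finished - call_started
    if duration <= 0:
        return False
    tail_start = call_finished - duration * tail_fraction
    return all(t >= tail_start for t in arrival_times)
```

A ten-second tool whose three progress events all arrive after the nine-second mark is the buffered pattern; the same events at two, five, and nine seconds indicate the stream is flowing.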

Checkpoint

Progress notifications from a long tool arrive all at once, right before the final result, but only in production. What is the classic cause?

4. The MCP registry: make it discoverable

A deployed server nobody can find is a private API with extra steps. The MCP registry is the ecosystem's shared catalog: a public index of servers with verified namespaces, versioned metadata, and an API that hosts and aggregators read so users can discover and install servers without hunting through readmes. Publishing is deliberately lightweight. You describe the server in a server.json, claim a namespace by proving you control it (GitHub-based namespaces authenticate with a login), and push.

{
  "$schema": "https://static.modelcontextprotocol.io/schemas/server.schema.json",
  "name": "io.github.bishrulhaq/notes",
  "description": "Shared team notes: create, search, and review notes from any MCP host.",
  "version": "1.0.0",
  "remotes": [
    {
      "type": "streamable-http",
      "url": "https://notes.example.com/mcp"
    }
  ]
}

# The publisher CLI ships with the registry project
mcp-publisher login github
mcp-publisher publish

Servers installed locally instead of accessed remotely publish a packages section pointing at PyPI or npm artifacts rather than a remotes URL; ours is remote-first, so the URL is the product. Either way, treat the registry entry like an API contract: the description is marketing copy read by humans and ranking algorithms both, and the version field is a promise the next section makes precise.
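Because published versions cannot be edited afterwards, a quick local sanity pass over server.json before publishing can save a wasted version number. The sketch below is a set of homemade checks, not the registry's own validation: the rules (a namespace prefix, a three-part version, https remotes) are assumptions drawn from the example file above, and the function name is hypothetical.

```python
import json
import re

SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def precheck_server_json(text: str) -> list[str]:
    """Return readable problems; an empty list means the basics look fine."""
    data = json.loads(text)
    problems: list[str] = []

    for field in ("name", "description", "version"):
        if not str(data.get(field, "")).strip():
            problems.append(f"missing {field}")

    version = str(data.get("version", ""))
    if version and not SEMVER.match(version):
        problems.append("version is not MAJOR.MINOR.PATCH")

    name = str(data.get("name", ""))
    if name and "/" not in name:
        problems.append("name has no namespace/ prefix")

    if not data.get("remotes") and not data.get("packages"):
        problems.append("needs a remotes or packages section")

    for remote in data.get("remotes", []):
        url = str(remote.get("url", ""))
        if not url.startswith("https://"):
            problems.append(f"remote url is not https: {url}")

    return problems
```

Running it in CI before mcp-publisher publish keeps obvious slips out of the catalog; the registry's schema remains the authority on what is actually accepted.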

5. Versioning: the contract has consumers now

The moment your server is listed, its tool schemas have users you will never meet, and Part 4's warning becomes operational policy. Semantic versioning maps cleanly onto MCP: adding a tool or an optional argument is a minor bump, fixing behavior behind stable schemas is a patch, and renaming a tool, removing one, or changing required arguments is a major bump that deserves a changelog entry and, ideally, a deprecation period where old and new names coexist. Published versions are immutable; you publish forward, never edit history. The playground below encodes the decision so the rule has no judgment calls left in it.

Python playground

def required_bump(changes: list[str]) -> str:
    """Decide the semver bump for a set of MCP contract changes."""
    BREAKING = {"tool_removed", "tool_renamed", "arg_now_required",
                "arg_type_changed", "result_shape_changed"}
    MINOR = {"tool_added", "optional_arg_added", "resource_added",
             "prompt_added", "annotation_added"}
    if any(c in BREAKING for c in changes):
        return "major"
    if any(c in MINOR for c in changes):
        return "minor"
    return "patch"

releases = [
    ["optional_arg_added"],
    ["tool_added", "prompt_added"],
    ["tool_renamed", "tool_added"],   # the rename decides it
    ["behavior_fix"],
]

version = [1, 0, 0]
for changes in releases:
    bump = required_bump(changes)
    index = {"major": 0, "minor": 1, "patch": 2}[bump]
    version[index] += 1
    version[index + 1:] = [0] * (2 - index)
    print(f"{changes} -> {bump:6} -> {'.'.join(map(str, version))}")
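The deprecation period mentioned above, where old and new names coexist, can be sketched without any framework. This is a minimal illustration under stated assumptions: TOOLS is a hypothetical stand-in for the server's tool table, the sample notes are invented, and how you register two names for one function in your actual server depends on the SDK version you run.

```python
import warnings
from collections.abc import Callable

# Hypothetical stand-in for the server's tool table.
TOOLS: dict[str, Callable[..., list[str]]] = {}


def find_notes(query: str) -> list[str]:
    """Return notes whose text contains the query."""
    notes = ["deploy checklist", "retro notes"]  # sample data for the sketch
    return [n for n in notes if query.lower() in n]


def deprecated_alias(new_name: str, func: Callable[..., list[str]]) -> Callable[..., list[str]]:
    """Wrap func so the old name still works but flags its own retirement."""

    def wrapper(*args, **kwargs):
        warnings.warn(f"use {new_name} instead", DeprecationWarning, stacklevel=2)
        return func(*args, **kwargs)

    # The description is what a host shows the model, so say it there too.
    wrapper.__doc__ = f"Deprecated: use {new_name}. {func.__doc__ or ''}"
    return wrapper


TOOLS["find_notes"] = find_notes
TOOLS["search_notes"] = deprecated_alias("find_notes", find_notes)
```

Shipping both names is an additive change, so it fits a minor release; removing the old name later is the breaking change that earns the major bump.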

6. Connecting every kind of client

Shipping ends with connection instructions, so here is the full set for the deployed server. Claude Code takes the remote URL directly, with a header flag for the Part 6 bearer when you are testing outside a full OAuth flow. Claude Desktop adds remote servers through its connectors settings, where users paste the URL and the OAuth dance happens in the browser. And applications can skip a host entirely: the Claude API's MCP connector lets you hand the model your server as part of a messages request.

# Claude Code, remote server
claude mcp add --transport http notes https://notes.example.com/mcp

# With an explicit bearer while testing
claude mcp add --transport http notes https://notes.example.com/mcp \
    --header "Authorization: Bearer $NOTES_TOKEN"

"""Calling the deployed server from code via the Claude API."""
import os

import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Which notes mention the deploy checklist?",
    }],
    mcp_servers=[{
        "type": "url",
        "url": "https://notes.example.com/mcp",
        "name": "notes",
        "authorization_token": os.environ["NOTES_TOKEN"],
    }],
    betas=["mcp-client-2025-04-04"],
)

print(response.content[-1].text)

That last snippet is worth pausing on, because it closes the loop the whole series has been drawing. A model, reached through an API, discovers tools your Python published, chooses one, calls it across the network through the transport you configured, past the auth you enforce, into functions your tests pin down, and reasons over the result. Every hop in that sentence is something you built and understand down to the JSON.
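Reading only the last content block works when the model finishes with text, but the response also carries the tool traffic, and walking every block makes those hops visible. The sketch below assumes the connector beta's block types are named mcp_tool_use and mcp_tool_result; treat those names and fields as an assumption to verify against the current API reference. It runs offline against stand-in objects, and summarize_blocks is a hypothetical helper.

```python
from types import SimpleNamespace


def summarize_blocks(content) -> dict[str, list[str]]:
    """Split response content into MCP tool calls and text by block type."""
    summary: dict[str, list[str]] = {"tool_calls": [], "text": []}
    for block in content:
        if block.type == "mcp_tool_use":
            summary["tool_calls"].append(block.name)
        elif block.type == "text":
            summary["text"].append(block.text)
        # Other block types, such as tool results, are skipped here.
    return summary


# Stand-in objects shaped like response blocks, so the sketch needs no network.
fake_content = [
    SimpleNamespace(type="text", text="Let me search the notes."),
    SimpleNamespace(type="mcp_tool_use", name="search_notes",
                    server_name="notes", input={"query": "deploy checklist"}),
    SimpleNamespace(type="mcp_tool_result", tool_use_id="example", content=[]),
    SimpleNamespace(type="text", text="Two notes mention it."),
]

print(summarize_blocks(fake_content))
```

Against a real call you would pass response.content instead of fake_content; logging the tool_calls list is a cheap way to confirm the model actually reached your server.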

Checkpoint

You need to rename search_notes to find_notes for consistency. Users have the server configured in three different hosts. What is the responsible release?

Common mistakes to avoid

✕ Shipping the JSON file storage to a multi-user deployment
✓ Concurrent writers plus a flat file equals corruption. The settings object makes the database swap a config change; do it before traffic, not after.

✕ Skipping the frozen flag in the image build
✓ uv sync --frozen makes a stale lockfile a build failure. Without it, your container quietly resolves different versions than your tests ran against.

✕ Default proxy settings in front of SSE
✓ Buffering off, generous read timeouts, sticky sessions if stateful. The Part 5 walkthrough plus this part's nginx block is the whole checklist.

✕ Editing a published registry version in place
✓ Versions are immutable promises. Publish forward with a bump that matches the change, and keep a changelog humans can read.

✕ Documentation that only covers one host
✓ Ship connection instructions for Claude Code, Claude Desktop, the Inspector, and the API connector. Every host you document is a class of support requests you never receive.

The bottom line, for the part and the series

Shipping an MCP server is the same engineering as shipping any service, with three protocol-specific twists: SSE needs proxy care, sessions shape your scaling, and the registry plus semantic versioning turn your schemas into public promises. The notes server is now configured from the environment, containerized reproducibly, deployed behind a well-behaved proxy, discoverable in the registry, and callable from every client that matters.

The whole series in eight lines

Part 1: the protocol is three roles, three message shapes, and a handshake
Part 2: FastMCP turns typed Python functions into a real server
Part 3: resources and prompts put decisions where they belong
Part 4: tool design, schemas, errors, and structure decide agent success
Part 5: transports are pluggable; Streamable HTTP takes you remote
Part 6: OAuth audience binding plus injection discipline make it defensible
Part 7: in-memory protocol tests make correctness cheap to keep
Part 8: config, Docker, proxy care, and the registry make it real

Where to next: the natural extensions are wiring your server into a larger agent system, which our agent API part approaches from the application side, and going deeper on retrieval patterns from the RAG part if your server fronts a knowledge base. The protocol will keep moving; the architecture you now hold, and the habit of reading the wire when in doubt, will keep transferring.

Frequently asked questions

Which platforms suit a first MCP deployment?

Anything that runs a container and respects SSE: a small VPS with nginx, Fly.io, Railway, or Cloud Run with streaming enabled. Start single-instance and stateful; scale out only when usage, not anxiety, demands it.

Do I need the registry if my server is internal?

The public registry is for public servers, but the registry software is open source and runnable privately, and a simple internal page with connection snippets covers small teams fine. The discipline that matters either way is versioned, documented releases.

How do stdio-only servers ship?

As packages rather than deployments: publish to PyPI, list the package in the registry's packages section, and hosts launch it locally with uv or pipx. Parts 1 through 4 and 7 apply unchanged; you simply never needed 5, 6, and the proxy half of this part.

What should I monitor in production?

Tool call latency and error rates per tool, session counts, auth rejections by reason, and stream duration percentiles. The first four tell you about your code; the last tells you when a proxy or timeout change starts silently cutting streams.
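Stream duration percentiles are easy to compute from raw samples with the standard library. The sketch below assumes you already collect one duration in seconds per closed stream; the function name and the choice of p50, p95, and p99 are illustrative rather than prescribed.

```python
import statistics


def duration_percentiles(seconds: list[float]) -> dict[str, float]:
    """Summarize stream durations; needs at least two samples."""
    # n=100 yields 99 cut points, so index 49 is the 50th percentile.
    cuts = statistics.quantiles(seconds, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

If the upper percentiles suddenly flatten at a round number that matches a proxy timeout, that is a plausible sign streams are being cut rather than finishing on their own.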

That completes Build an MCP Server in Python. If you build something with it, remember that the notes server pattern (tools plus resources plus prompts over one well-tested core) stretches a very long way. Thanks for building along.