MCP Architecture and JSON-RPC Explained (Series Part 1)

This series builds one real MCP server in Python, from an empty folder to a tested, secured, deployed service that Claude and other AI agents can call. Part 1 is the foundation: what the Model Context Protocol actually is, who talks to whom, and what the messages on the wire look like. Most MCP tutorials skip this and jump straight to a decorator. Then the first time something breaks, you are staring at a silent timeout with no mental model of what should have happened. We are doing it the other way around. Once you can read the protocol, every later part of this series, and every confusing bug, becomes legible.

What you will learn in Part 1

The problem MCP solves and why it beat bespoke integrations
The three roles: host, client, and server, and which one you are building
JSON-RPC 2.0: requests, responses, and notifications you can read by eye
The initialize handshake and capability negotiation, step by step
The three primitives (tools, resources, and prompts) and who controls each

Info

Who this is for

You know basic Python and the terminal. You do not need prior MCP experience. If you followed our From Python to Production LLM Apps series, this track picks up naturally where the agent part left off, but it also stands alone.

1. The problem MCP actually solves

Before late 2024, connecting an AI assistant to your tools meant writing a custom integration for every pair. Your ticketing system needed one plugin for Claude, another for an IDE assistant, a third for an internal chatbot. Every vendor had its own function-calling format, its own auth story, and its own way of passing results back. With M assistants and N tools you were maintaining M times N integrations, and every new model launch reset the work.

The Model Context Protocol, open sourced by Anthropic in November 2024, collapses that grid. A tool speaks MCP once, as a server. An assistant speaks MCP once, as a client. Any client can then talk to any server, the same way any browser can talk to any web server. Adoption followed quickly: the major model vendors signed on through 2025, an official registry appeared, and by 2026 the ecosystem counts thousands of public servers. The protocol you are about to learn is not a niche experiment; it is the standard integration layer for agents.

One sentence is worth keeping in mind through the whole series: MCP standardizes the conversation between an AI application and the outside world. It does not run models, it does not pick which tool to call, and it does not replace your API. It is plumbing, and like all good plumbing it is boring, predictable, and everywhere.

2. Hosts, clients, and servers

MCP names three roles, and the names matter because the security model in Part 6 and the testing story in Part 7 both hang off them. The host is the AI application the user actually touches: Claude Desktop, Claude Code, an IDE, or your own chat app. The host embeds one or more clients. A client maintains exactly one stateful connection to one server. The server is the program you will write in this series: it exposes capabilities and answers requests.

The three MCP roles
Role	What it is	Examples	You build it?
Host	The AI app the user interacts with; owns the model loop and permissions	Claude Desktop, Claude Code, your chat UI	Sometimes
Client	One stateful connection to one server, embedded in the host	The connector Claude Code spins up per configured server	Rarely
Server	Exposes tools, resources, and prompts over the protocol	GitHub server, Postgres server, the one we build here	Yes, this series

Notice what the model does not do: it never talks to your server directly. The host decides which servers to connect, asks the user for permission, forwards the model's tool call through the right client, and returns the result into the model's context. That indirection is deliberate. It means your server never sees the conversation, only the calls addressed to it, and the host can enforce approval prompts and isolation between servers. When you design tools in Part 4, you are really designing for two audiences at once: the model that reads your schemas, and the host that mediates everything.
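
To make that indirection concrete, here is a minimal sketch of a host routing a tool call through a per-server client. Everything in it is hypothetical: `Host`, `Client`, the `approve` callback, and `notes_server` are illustration names, not classes from any MCP SDK, and a plain function stands in for the real transport.

```python
class Client:
    """One connection to exactly one server (hypothetical stand-in)."""

    def __init__(self, server_name, handler):
        self.server_name = server_name
        self._handler = handler  # a plain function stands in for the transport

    def call_tool(self, name, arguments):
        return self._handler(name, arguments)


class Host:
    """Owns permissions and holds one client per connected server."""

    def __init__(self, approve):
        self._approve = approve  # callback: (server_name, tool_name) -> bool
        self._clients = {}

    def connect(self, server_name, handler):
        self._clients[server_name] = Client(server_name, handler)

    def forward_tool_call(self, server_name, tool, arguments):
        # The host mediates: the server only ever sees calls that pass this gate.
        if not self._approve(server_name, tool):
            # Assumption: this sketch reports a denial as a tool-level error.
            return {"content": [{"type": "text", "text": "Call denied by user."}],
                    "isError": True}
        return self._clients[server_name].call_tool(tool, arguments)


def notes_server(name, arguments):
    """Pretend server: it receives the call, never the conversation."""
    text = f"{name} called with {arguments}"
    return {"content": [{"type": "text", "text": text}], "isError": False}


host = Host(approve=lambda server, tool: tool != "delete_note")
host.connect("notes", notes_server)

allowed = host.forward_tool_call("notes", "search_notes", {"query": "deploy"})
denied = host.forward_tool_call("notes", "delete_note", {"id": 7})
```

The server function only ever receives a tool name and arguments, which is the isolation property described above; the approval decision lives entirely in the host.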

3. The wire format: JSON-RPC 2.0

Everything in MCP travels as JSON-RPC 2.0 messages. If you have never used JSON-RPC, the good news is that there is almost nothing to learn. There are three message shapes. A request carries an id, a method name, and optional params, and expects exactly one response. A response carries the same id and either a result or an error, never both. A notification has a method and optional params but no id, and expects no reply at all.

// Request: the client wants to call a tool
{ "jsonrpc": "2.0", "id": 3, "method": "tools/call",
  "params": { "name": "search_notes", "arguments": { "query": "deploy" } } }

// Response: success carries "result", failure carries "error"
{ "jsonrpc": "2.0", "id": 3,
  "result": { "content": [ { "type": "text", "text": "2 notes found" } ] } }

// Notification: fire and forget, no id, no reply
{ "jsonrpc": "2.0", "method": "notifications/initialized" }

That is the entire grammar. Every MCP feature you will meet (listing tools, calling them, reading resources, streaming progress) is one of these three shapes with a different method string. The id field is how a client matches answers to questions when several requests are in flight at once, which is also why a response must echo the id of its request exactly. Build a few messages yourself in plain Python; the habit of constructing these by hand pays off the first time you debug a real server.
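
A minimal sketch of building and classifying the three shapes by hand, using only the standard library. The helper names (`request`, `notification`, `success`, `classify`) are hypothetical and exist only for this example; the message contents mirror the samples above.

```python
import json
from itertools import count

_ids = count(1)  # each request needs an id that is unique within the session


def request(method, params=None):
    """A request: has an id, so exactly one response is expected."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method, params=None):
    """A notification: no id, so no reply will ever come back."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def success(request_msg, result):
    """A response: echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": request_msg["id"], "result": result}


def classify(msg):
    """Name the shape of a message by looking only at its keys."""
    if "method" in msg:
        return "request" if "id" in msg else "notification"
    if ("result" in msg) != ("error" in msg):  # exactly one of the two
        return "response"
    raise ValueError("not a valid JSON-RPC 2.0 message")


call = request("tools/call",
               {"name": "search_notes", "arguments": {"query": "deploy"}})
reply = success(call, {"content": [{"type": "text", "text": "2 notes found"}]})
ready = notification("notifications/initialized")

for msg in (call, reply, ready):
    print(classify(msg), json.dumps(msg))
```

Note that `classify` never inspects the method string. The shape of a message is decided entirely by which of `id`, `method`, `result`, and `error` are present.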

Checkpoint

A JSON-RPC message arrives with a method and params but no id. What is it, and what must the server send back?

4. The lifecycle: initialize before everything

An MCP session is stateful and it opens with a handshake. The client sends an initialize request declaring the protocol version it speaks, its own capabilities, and its identity. The server answers with the version it will use, its capabilities, and its identity. The client then fires the initialized notification, and only after that may normal traffic flow. Capabilities are the negotiation part: a server that declares tools but not resources is telling the client not to bother sending resources/list. Step through a complete session below, one message at a time.

Protocol walkthrough

A complete session: handshake, discovery, one tool call

Client → Server initialize request

The client opens the session: the protocol version it speaks, what it can do, and who it is. Nothing else is allowed before this.

{
  "jsonrpc": "2.0", "id": 1, "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": { "sampling": {} },
    "clientInfo": { "name": "claude-code", "version": "2.1.0" }
  }
}

Server → Client initialize response

The server agrees on a version and declares its capabilities. This server offers tools and can notify when the tool list changes.

{
  "jsonrpc": "2.0", "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": { "tools": { "listChanged": true } },
    "serverInfo": { "name": "notes-server", "version": "0.1.0" }
  }
}

Client → Server initialized notification

A notification, so no id and no reply. From here on the session is open for normal requests.

{ "jsonrpc": "2.0", "method": "notifications/initialized" }

Client → Server tools/list request

Discovery. The client asks what tools exist so the host can show them to the model.

{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }

Server → Client tools/list response

Each tool ships a name, a description the model reads, and a JSON Schema for its arguments. Part 4 is entirely about getting these right.

{
  "jsonrpc": "2.0", "id": 2,
  "result": { "tools": [ {
    "name": "search_notes",
    "description": "Search saved notes by keyword.",
    "inputSchema": {
      "type": "object",
      "properties": { "query": { "type": "string" } },
      "required": ["query"]
    }
  } ] }
}

Client → Server tools/call request

The model decided to use the tool; the host forwards the call through the client with concrete arguments.

{
  "jsonrpc": "2.0", "id": 3, "method": "tools/call",
  "params": { "name": "search_notes",
               "arguments": { "query": "deploy checklist" } }
}

Server → Client tools/call response

Results come back as content blocks. isError marks tool-level failure inside a successful protocol exchange, a distinction Part 4 leans on heavily.

{
  "jsonrpc": "2.0", "id": 3,
  "result": {
    "content": [ { "type": "text",
                    "text": "1. Deploy checklist (updated Tue)" } ],
    "isError": false
  }
}

Read that exchange twice and you know more than most people shipping MCP servers today. Two details deserve emphasis. First, the protocol version is a date string, and client and server must agree on one during initialize; version drift is a real source of mysterious failures after SDK upgrades. Second, the server you build never initiates this dance. It answers. That shapes how you will test in Part 7: a test is just a scripted client running exactly this sequence against your code.
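
The walkthrough can be replayed as a scripted client against a toy server, which is roughly what a test looks like. This is a sketch under stated assumptions, not the official SDK: `ToyServer`, `rpc`, the note data, and the choice of -32600 for calls that arrive before the handshake are all made up for illustration. The codes -32601 and -32602 are the standard JSON-RPC "Method not found" and "Invalid params".

```python
SUPPORTED_VERSION = "2025-06-18"
NOTES = ["Deploy checklist", "Deploy rollback plan", "Offsite agenda"]  # made-up data

SEARCH_TOOL = {
    "name": "search_notes",
    "description": "Search saved notes by keyword.",
    "inputSchema": {"type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"]},
}


def ok(msg_id, result):
    return {"jsonrpc": "2.0", "id": msg_id, "result": result}


def error(msg_id, code, message):
    return {"jsonrpc": "2.0", "id": msg_id,
            "error": {"code": code, "message": message}}


def text_result(text, is_error=False):
    return {"content": [{"type": "text", "text": text}], "isError": is_error}


class ToyServer:
    """Hypothetical in-memory server; it only answers, never initiates."""

    def __init__(self):
        self.ready = False

    def handle(self, msg):
        method, msg_id = msg["method"], msg.get("id")
        if method == "initialize":
            return ok(msg_id, {
                "protocolVersion": SUPPORTED_VERSION,
                "capabilities": {"tools": {"listChanged": True}},
                "serverInfo": {"name": "notes-server", "version": "0.1.0"},
            })
        if method == "notifications/initialized":
            self.ready = True
            return None  # notifications never get a reply
        if msg_id is None:
            return None  # other notifications are ignored in this sketch
        if not self.ready:
            # Assumption: -32600 (Invalid Request) is this sketch's choice.
            return error(msg_id, -32600, "Session not initialized")
        if method == "tools/list":
            return ok(msg_id, {"tools": [SEARCH_TOOL]})
        if method == "tools/call":
            return self.call_tool(msg_id, msg.get("params", {}))
        return error(msg_id, -32601, f"Method not found: {method}")

    def call_tool(self, msg_id, params):
        if params.get("name") != "search_notes":
            # Protocol-level failure: the request itself is invalid.
            return error(msg_id, -32602, f"Unknown tool: {params.get('name')}")
        query = params.get("arguments", {}).get("query", "").strip()
        if not query:
            # Tool-level failure: a valid exchange whose result reports an error.
            return ok(msg_id, text_result("query must not be empty", is_error=True))
        hits = [note for note in NOTES if query.lower() in note.lower()]
        return ok(msg_id, text_result(f"{len(hits)} notes found"))


def rpc(msg_id, method, params=None):
    """Build a request, or a notification when msg_id is None."""
    msg = {"jsonrpc": "2.0", "method": method}
    if msg_id is not None:
        msg["id"] = msg_id
    if params is not None:
        msg["params"] = params
    return msg


server = ToyServer()
too_early = server.handle(rpc(0, "tools/list"))
init = server.handle(rpc(1, "initialize", {
    "protocolVersion": SUPPORTED_VERSION,
    "capabilities": {},
    "clientInfo": {"name": "scripted-client", "version": "0.0.1"},
}))
ack = server.handle(rpc(None, "notifications/initialized"))
listed = server.handle(rpc(2, "tools/list"))
found = server.handle(rpc(3, "tools/call",
                          {"name": "search_notes", "arguments": {"query": "deploy"}}))
empty = server.handle(rpc(4, "tools/call",
                          {"name": "search_notes", "arguments": {"query": " "}}))
missing = server.handle(rpc(5, "resources/list"))
```

The `empty` and `missing` replies show the two failure layers side by side: the first is a normal result with isError set to true, the second is a JSON-RPC error with no result at all.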

5. The three primitives, and who controls each

Everything a server offers falls into three buckets, and the cleanest way to keep them straight is to ask who decides when each is used. Tools are model-controlled: the model reads their descriptions and chooses to call them during a conversation. Resources are application-controlled: the host decides which documents or data to load into context, often with the user picking from a list. Prompts are user-controlled: reusable templates a person invokes explicitly, like a slash command.

The three MCP primitives
Primitive	Controlled by	Typical use	Series part
Tools	The model	Search tickets, create a note, run a query	Parts 2 and 4
Resources	The application	Expose files, configs, or records as context	Part 3
Prompts	The user	Reusable workflows like "review this PR"	Part 3

Teams routinely cram everything into tools because tools are what the demos show. That works, but it pushes decisions onto the model that the application or the user should own, and it bloats the model's context with tool descriptions it rarely needs. Knowing all three primitives, and choosing deliberately, is one of the quiet skills that separates a server agents enjoy using from one they fumble. We will build all three into the same server in Parts 2 and 3.
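
One way to keep the three buckets straight in code is a small lookup that pairs each primitive with its controller and its discovery method. The sketch below is illustrative only: `PRIMITIVES` and `discovery_requests` are hypothetical names, while `tools/list`, `resources/list`, and `prompts/list` are the protocol's list methods. It also shows capability negotiation at work: the client only asks about primitives the server declared during initialize.

```python
# Who decides when each primitive is used, and how a client discovers it.
PRIMITIVES = {
    "tools": {"controlled_by": "model", "list_method": "tools/list"},
    "resources": {"controlled_by": "application", "list_method": "resources/list"},
    "prompts": {"controlled_by": "user", "list_method": "prompts/list"},
}


def discovery_requests(server_capabilities, first_id=2):
    """Build one list request per primitive the server actually declared."""
    requests = []
    for name, info in PRIMITIVES.items():
        if name in server_capabilities:
            requests.append({
                "jsonrpc": "2.0",
                "id": first_id + len(requests),
                "method": info["list_method"],
            })
    return requests


# Capabilities as returned by the server in the walkthrough above.
tools_only = discovery_requests({"tools": {"listChanged": True}})

# A server that declares all three primitives gets three discovery requests.
everything = discovery_requests({"tools": {}, "resources": {}, "prompts": {}})
```

With the walkthrough's server, `tools_only` contains a single `tools/list` request, so `resources/list` and `prompts/list` are never sent.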

Checkpoint

Your server exposes project documentation that users attach to a conversation when they choose. Which primitive fits best?

6. Transports: stdio now, Streamable HTTP later

JSON-RPC itself says nothing about how bytes move; that is the transport's job, and there are two that matter. With stdio, the host launches your server as a child process and speaks JSON-RPC over stdin and stdout, one message per line, so a message must never contain an embedded newline. Because nothing listens on a network port, the attack surface is small, and stdio is the right default for anything running on the user's machine. With Streamable HTTP, your server is a web service: requests arrive as HTTP POSTs and the server can stream responses back over server-sent events when a call produces progress updates or takes time.

The practical rule is simple. Local and personal: stdio. Shared, remote, or multi-user: Streamable HTTP. We build on stdio in Parts 2 through 4 because the loop is faster, then move the same server to Streamable HTTP in Part 5 without touching the tool code, which is the payoff of the SDK abstracting transports away. One habit to adopt from day one: a stdio server must never print to stdout, because stdout is the protocol channel. Log to stderr. Forgetting this is the single most common first-server bug, and we will meet it again in Part 7.

Warning

stdout is sacred on stdio

On the stdio transport, anything your code prints to stdout is parsed as a protocol message and will corrupt the session. Use logging configured to stderr, never print(), inside server code.
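
A minimal sketch of that discipline, assuming a hand-rolled loop rather than the SDK (which handles this for you): protocol messages go to the writer, one JSON document per line, and diagnostics go to stderr through the standard `logging` module. The names `serve` and `echo_handler` are hypothetical, and in-memory streams stand in for the real stdin and stdout so the example runs anywhere.

```python
import io
import json
import logging
import sys

# All diagnostics go to stderr; stdout is reserved for protocol messages.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("notes-server")


def serve(reader, writer, handle):
    """Read one JSON-RPC message per line and write one reply per line."""
    for line in reader:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        log.info("received %s", message.get("method"))  # stderr, never print()
        reply = handle(message)
        if reply is not None:  # notifications produce no reply
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def echo_handler(message):
    """Hypothetical handler: answer every request with an empty result."""
    if "id" not in message:
        return None
    return {"jsonrpc": "2.0", "id": message["id"], "result": {}}


# In-memory streams stand in for the pipes a host would attach.
fake_stdin = io.StringIO(
    '{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n'
    '{"jsonrpc": "2.0", "method": "notifications/initialized"}\n'
)
fake_stdout = io.StringIO()
serve(fake_stdin, fake_stdout, echo_handler)
```

In a real process the call would be `serve(sys.stdin, sys.stdout, handler)`. A single stray `print()` inside `handle` would then land on the same stream as the replies, which is exactly the corruption the warning describes.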

7. The roadmap for this series

Here is where we are going. In Part 2 you will stand up a working server with the official Python SDK and connect it to Claude Code and Claude Desktop. Part 3 adds resources and prompts. Part 4 is tool design: schemas, errors, and structured output, the part that decides whether agents actually succeed with your server. Part 5 moves to Streamable HTTP and mounts the server inside a FastAPI app. Part 6 locks it down with OAuth 2.1 and defenses against prompt injection and tool poisoning. Part 7 covers testing with the Inspector and pytest, and Part 8 ships it with Docker and lists it on the MCP registry.

If you want the protocol context from a higher altitude first, our earlier explainer MCP Explained Without Hype covers the ecosystem view; this series is the builder's view. Everything from here on is hands-on.

The bottom line

MCP is three roles, three message shapes, three primitives, and a handshake. The host embeds clients; each client holds one session with one server; everything on the wire is JSON-RPC; sessions open with initialize; servers offer tools, resources, and prompts over a transport that is either stdio or Streamable HTTP. That is the whole protocol at the altitude that matters, and you can now read any MCP exchange that crosses your screen. Next, we write the server.

Frequently asked questions

Do I need to memorize JSON-RPC to build servers?

No. The SDK hides it almost completely, as Part 2 shows. You learn it so that when something fails you can read the actual messages in the Inspector and know which side is wrong, instead of guessing.

Is MCP only for Claude?

No. It started at Anthropic but the major AI vendors adopted it through 2025, and the spec is developed in the open. A server you build in this series works with any MCP-capable host.

How is MCP different from function calling?

Function calling is how a model expresses intent to call something within one vendor's API. MCP standardizes how applications discover and execute those capabilities across processes and networks, with lifecycle, permissions, and transports specified. They cooperate rather than compete.

Which protocol version does this series target?

The examples use the 2025-06-18 revision and the current official Python SDK. Where newer revisions add features, the text says so explicitly, and the concepts carry across revisions.

Up next: Part 2, your first MCP server with FastMCP and uv.