Testing FastAPI with pytest (Series Part 7)

Part 7 protects everything you have built. An LLM app changes fast, and without tests every change is a gamble. The good news is that FastAPI is built for testing: routes are plain functions, dependencies can be overridden, and the test client drives the whole app in process. This part shows a practical testing strategy with pytest.

What you will learn

Driving the app with FastAPI TestClient
Overriding dependencies to isolate tests from real services
Asserting on status codes and validation errors
Testing async code and a streaming endpoint

1. The test client

TestClient runs your app without a network, calling routes the way a real client would. A test reads like a request: send something, assert on the response.

from fastapi.testclient import TestClient
from llm_app.main import app

client = TestClient(app)

def test_create_project_validates_name():
    resp = client.post("/projects", json={"name": "x"})  # too short
    assert resp.status_code == 422               # FastAPI validation error
    assert resp.json()["detail"][0]["loc"][-1] == "name"

def test_create_project_ok():
    resp = client.post("/projects", json={"name": "Search API"})
    assert resp.status_code == 201
    assert resp.json()["name"] == "Search API"

Notice you get a 422 for invalid input for free, because the Pydantic model from Part 2 rejects it. You are testing your contract, not reimplementing validation.

2. Overriding dependencies

Tests should not call a real model API or database. Because Part 4 and Part 5 put those behind dependencies, you can swap them for fakes with dependency_overrides. The route does not change.

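from fastapi.testclient import TestClient
from llm_app.main import app
from llm_app.deps import require_token

def _allow_all() -> None:
    # Stand-in for require_token: accepts every request.
    return None

def test_secure_route_with_override():
    app.dependency_overrides[require_token] = _allow_all
    try:
        client = TestClient(app)
        assert client.get("/secure").status_code == 200
    finally:
        # Always undo the override, even when the assertion fails.
        app.dependency_overrides.clear()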
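If you override dependencies in many tests, one option is a small context manager that restores the previous state for you. This is a sketch, not part of FastAPI: override_dependency is a hypothetical helper name, and it only assumes the app object exposes a dependency_overrides dict, which FastAPI apps do.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def override_dependency(
    app: Any,
    dependency: Callable[..., Any],
    replacement: Callable[..., Any],
) -> Iterator[None]:
    """Temporarily swap one dependency, then restore whatever was there before."""
    overrides = app.dependency_overrides
    missing = object()  # sentinel: "no override was set"
    previous = overrides.get(dependency, missing)
    overrides[dependency] = replacement
    try:
        yield
    finally:
        # Runs even if the test body raises, so no override leaks out.
        if previous is missing:
            overrides.pop(dependency, None)
        else:
            overrides[dependency] = previous
```

Inside a test you would write `with override_dependency(app, require_token, _allow_all):` and make the request in the body of the block.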

Pro tip

Put shared setup like the client and common overrides in a conftest.py fixture so every test starts from a clean, isolated app.

3. Testing async code and streams

For async helpers, use the pytest-asyncio plugin and mark the test with @pytest.mark.asyncio. For a streaming endpoint, the test client lets you read the streamed body and assert on its content.

import pytest
from fastapi.testclient import TestClient
from llm_app.main import app
from llm_app.text import reading_minutes  # an async helper

@pytest.mark.asyncio
async def test_reading_minutes():
    assert await reading_minutes("word " * 220) == 1.0

def test_stream_endpoint():
    with TestClient(app) as client:
        with client.stream("GET", "/stream") as resp:
            # Join the chunks: chunk boundaries are not part of the contract.
            body = "".join(resp.iter_text())
    assert "streaming" in body

Checkpoint

What status code does FastAPI return when a request body fails Pydantic validation?

4. Fixtures give every test a clean start

Tests must not share mutable state; otherwise whether one test passes can depend on which test ran before it. A pytest fixture builds a fresh client and tears down any overrides afterward, so each test starts from the same known point. Put shared fixtures in conftest.py and they are available everywhere without imports. This is where the application factory from Part 1 pays off, since you can build a brand new app per test.

# tests/conftest.py
import pytest
from fastapi.testclient import TestClient
from llm_app.main import create_app

@pytest.fixture
def client():
    app = create_app()
    with TestClient(app) as c:
        yield c
    app.dependency_overrides.clear()

5. Test the unhappy paths

It is easy to test only the case where everything works. The bugs live in the other cases: missing fields, wrong types, unauthorized callers, and records that do not exist. Those paths are exactly where status codes matter, and asserting on them locks in the behavior your frontend depends on. A good suite has at least as many failure tests as success tests.

def test_missing_token_is_rejected(client):
    assert client.get("/secure").status_code in (401, 403)

def test_unknown_project_is_404(client):
    assert client.get("/projects/9999").status_code == 404

def test_wrong_type_is_422(client):
    resp = client.post("/projects", json={"name": 123})  # name must be a string
    assert resp.status_code == 422

6. Never call the real model in a test

Tests should be fast, deterministic, and free. A real model call is none of those: it is slow, its output varies, and it costs money. Because Part 8 will put the model behind a small client injected as a dependency, you can override that dependency with a fake that returns a canned answer. Your endpoint logic runs for real; only the network call is replaced.

from llm_app.main import create_app
from llm_app.deps import get_model_client
from fastapi.testclient import TestClient

class FakeModel:
    def ask(self, prompt: str) -> str:
        return "canned answer"

def test_chat_uses_model():
    app = create_app()
    app.dependency_overrides[get_model_client] = lambda: FakeModel()
    client = TestClient(app)
    resp = client.post("/chat", json={"question": "hi"})
    assert resp.json()["answer"] == "canned answer"
    app.dependency_overrides.clear()
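A fake can also record what it was asked, which lets a test check the prompt your code built and not only the answer it returned. This is a sketch under assumptions: RecordingFakeModel, the ModelClient protocol, and the answer_question function with its prompt template are hypothetical and stand in for whatever the /chat endpoint does.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class ModelClient(Protocol):
    def ask(self, prompt: str) -> str: ...


@dataclass
class RecordingFakeModel:
    """Fake model client that returns a canned reply and remembers each prompt."""

    reply: str = "canned answer"
    prompts: List[str] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


def answer_question(model: ModelClient, question: str) -> dict:
    # Hypothetical endpoint logic: build a prompt, call the model, shape the response.
    prompt = f"Answer briefly: {question.strip()}"
    return {"answer": model.ask(prompt)}
```

Passed through dependency_overrides in place of the plain fake, the same object lets you assert that exactly one model call was made and that the prompt contained the user's question.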

Keep a small number of real, clearly marked integration tests that do hit the model, and run them rarely, for example before a release. The fast, fake-based tests are what you run on every change.

7. Test behavior, not implementation

Aim your tests at what the endpoint promises, not at how it happens to be written today. A test that asserts on the status code and response shape keeps passing when you refactor the internals, while a test coupled to private functions breaks on every change and teaches you nothing. Chase confidence, not a coverage number: a high percentage made of brittle tests is worse than a smaller suite that exercises the paths that matter.
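A small illustration of the idea, using a hypothetical normalize_name helper that is not part of the series code: the contract check below states only what callers rely on, so it passes for both the original and the refactored implementation.

```python
import re
from typing import Callable


def normalize_name_v1(name: str) -> str:
    # First version: split on whitespace and re-join with single spaces.
    return " ".join(name.split())


def normalize_name_v2(name: str) -> str:
    # Refactored version: same promise, different internals.
    return re.sub(r"\s+", " ", name).strip()


def check_contract(normalize: Callable[[str], str]) -> None:
    """Behavior test: asserts only on inputs and outputs, never on internals."""
    assert normalize("  Search   API ") == "Search API"
    assert normalize("Search API") == "Search API"
    assert normalize("   ") == ""
```

A test that instead asserted that str.split was called would break on the move to the second version even though no caller could tell the difference.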

Tip

Wire tests into CI

Run uv run pytest on every push, and treat a red suite as a blocked merge. The point of all this leverage is that a failing test catches the regression before your users do.

8. Parametrize to cover many cases cheaply

When the same logic should behave the same way across many inputs, do not copy the test. Parametrize it: pytest runs the body once per case and reports each separately, so a single small test covers the whole table of valid and invalid inputs. This is how you get broad validation coverage without a wall of near-identical functions.

import pytest

@pytest.mark.parametrize("name,expected", [
    ("Search API", 201),   # valid
    ("ab", 422),           # too short
    ("", 422),             # empty
    ("x" * 200, 422),      # too long
])
def test_project_name_rules(client, name, expected):
    assert client.post("/projects", json={"name": name}).status_code == expected

Each row is reported as its own test, so when one breaks you see exactly which input regressed, not just that the function failed somewhere.
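To make those per-row reports easier to read, pytest.param accepts an id for each case. The sketch below applies it to a pure function so it runs without an app; is_valid_name and its 3 to 50 character rule are assumptions for illustration, not the validation from Part 2.

```python
import pytest


def is_valid_name(name: str) -> bool:
    # Hypothetical rule for this sketch: 3 to 50 characters.
    return 3 <= len(name) <= 50


NAME_CASES = [
    pytest.param("Search API", True, id="valid"),
    pytest.param("ab", False, id="too-short"),
    pytest.param("", False, id="empty"),
    pytest.param("x" * 200, False, id="too-long"),
]


@pytest.mark.parametrize("name,ok", NAME_CASES)
def test_is_valid_name(name, ok):
    assert is_valid_name(name) is ok
```

With ids in place, a failure is reported under a name like test_is_valid_name[too-long] instead of one built from the raw 200-character argument.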

9. The shape of a healthy suite

A useful rule of thumb is the test pyramid: many fast unit tests over pure functions, a solid layer of endpoint tests through the test client, and only a few slow integration tests that touch real external services. Most of your tests should run in milliseconds with no network, which is what makes running them on every save painless. The handful of real model and database tests are valuable but slow, so mark them and run them deliberately, not on every keystroke.

import pytest

@pytest.mark.integration   # registered in pyproject under [tool.pytest.ini_options]
def test_real_model_call_smoke():
    # hits the real API; run with: uv run pytest -m integration
    ...

Run the fast suite constantly and the marked integration suite on a schedule or before release. The goal is a green run you trust enough to refactor against, which is the entire point of writing tests for an app that changes as fast as an LLM product does.
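One alternative way to make "deliberately" the default is to skip marked tests unless a flag is passed. This sketch follows the standard pytest hook pattern in a conftest.py; the --run-integration flag name is an assumption made up for this example, and with it the command becomes uv run pytest --run-integration instead of selecting with -m alone.

```python
# tests/conftest.py (sketch; the --run-integration flag is a made-up name)
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--run-integration",
        action="store_true",
        default=False,
        help="also run tests marked @pytest.mark.integration",
    )


def pytest_configure(config):
    # Registers the marker in code, as an alternative to listing it in pyproject.
    config.addinivalue_line(
        "markers", "integration: slow test that calls real external services"
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-integration"):
        return  # flag given: leave every collected test alone
    skip = pytest.mark.skip(reason="needs --run-integration")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
```

A plain pytest run then reports the integration tests as skipped, which keeps them visible without paying for them on every save.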

The bottom line

Testing FastAPI is mostly about leverage you already built. Pydantic gives you validation tests for free, dependency injection lets you isolate from real services, and the test client drives the app in process. The payoff is confidence: a green suite is permission to refactor aggressively, add features quickly, and upgrade dependencies without holding your breath, because the behavior your users depend on is pinned down by tests that run in seconds.

That confidence matters more in an LLM app than almost anywhere else, because these systems change constantly: prompts get tuned, models get swapped, retrieval gets rebuilt, and each change is a chance to break something subtle. Fast, deterministic tests that never call the real model are what let you move at that pace safely.

Cover the unhappy paths as thoroughly as the happy ones, parametrize to keep coverage cheap, and keep the slow integration tests few and clearly marked. Lean on fixtures for a clean app per test, override the model client so tests are fast and free, and aim your assertions at behavior rather than implementation so a refactor does not turn your suite red for no reason. With that suite in place, you can build the rest of the series boldly. Now we are ready for the heart of the series: calling a language model.

Frequently asked questions

How do I mock the model API?

Wrap the model call in a dependency or small client class, then override or monkeypatch it in tests to return canned responses. Part 8 builds exactly such a client.

Unit tests or integration tests?

Both. Unit test pure logic directly, and use the test client for endpoint behavior. Keep slow, real network tests few and clearly marked.

How much coverage is enough?

Chase confidence over a percentage. A smaller suite that exercises the real success and failure paths is worth more than a high number made of brittle tests tied to implementation details.

Why did my test fail only on the second run?

Almost always leftover state. Make sure each test starts from a fresh app and clears any dependency overrides afterward, which is exactly what a client fixture does for you.

Up next: Part 8, calling LLMs from Python.

Testing FastAPI the Right Way: pytest, the Test Client, and Validation

Bishrul Haq
