Streaming LLM Responses to the Browser with FastAPI and SSE
Stream a real model response to the browser: consume the model stream in Python, forward it through a FastAPI SSE endpoint, and render it live.
Writing
Long-form writing on development, tooling, and lessons from the workbench, plus book reviews when a title sticks.
Newest first.
Build a small AI agent API: the tool-calling loop, conversation memory, and the guardrails that keep an action-taking agent safe and bounded.
A balanced, hands-on look at Claude Fable 5: how Anthropic's new flagship compares to Opus 4.8 and Sonnet 4.6 on price, context, and capabilities, and when the 2x premium is and is not worth paying.
A defensive engineering playbook for AI agents: understand direct and indirect prompt injection, then lock agents down with least privilege, human-in-the-loop gates, sandboxing, and validation.
A plain-English technical guide to the Model Context Protocol: the problem it solves, hosts/clients/servers, tools vs resources vs prompts, transports, a minimal server, and when to adopt it.
A practical decision framework for the Ubuntu 26.04 vs 24.04 LTS question: the real differences for developers, who should upgrade now, who should wait, and a safe upgrade path either way.
A developer-focused setup guide for Ubuntu 26.04 LTS servers: a hardened baseline (SSH, firewall, unattended-upgrades), the kernel/runtime changes to watch versus 24.04, and a pre-migration checklist.
A hype-free developer guide to AI agents in 2026: the agent loop, tools, memory, MCP, evaluation, guardrails, and the cost and failure modes that separate a demo from production.