The Hardest Problems in Building Production AI Agents
Every AI agent demo looks the same. The model calls a tool, gets a result, responds. Ship it. Then you try to run it against real infrastructure, and the demo falls apart in ways nobody warned you about.
We've spent over a year building Nova, an AI agent that operates real infrastructure for real teams. Not a chatbot that wraps API calls, but a system that investigates incidents, executes remediations, and composes across dozens of integrations. This post is about what we learned: the problems that made us rebuild entire subsystems, and the patterns that survived.