About

Built for developers and teams running AI agents.

AgentMetrics started with a simple observation: the hardest part of running AI agents in production isn't building them. It's knowing what's actually happening to them.

Costs spike unexpectedly. Failures go undetected until a user reports something broken. Token budgets get exhausted before the next billing cycle. And when something goes wrong, there's no trace of why.

We built AgentMetrics to give developers and teams real visibility into every agent run: cost, latency, failures, token usage, and tool calls. No infrastructure setup required, no sampling tradeoffs.

What we're building

An observability layer purpose-built for AI agents. Not generic APM re-branded. Not a wrapper around OpenTelemetry with an AI sticker on it.

AgentMetrics understands the specific shape of agent work: multi-step runs, token budgets, model switching, retry storms, tool call failures, and cost-per-output tradeoffs. We surface metrics that matter for agents specifically, and use them to surface concrete, actionable recommendations.

Open source first

AgentMetrics is MIT-licensed and fully self-hosted. The server, the dashboard, and the SDKs all run on your infrastructure. Your agent data never leaves your environment.

Inspect every line of what we collect, extend it to fit your stack, and contribute to the codebase.

View on GitHub →

Our principles

No sampling. Every run is captured. Sampling hides the tail events that matter most.
Three-line install. Import, configure, decorate. If it takes more than that to get your first run showing up, we've failed.
Data stays yours. We process your metrics to provide the Service. We don't sell it, share it, or train models with it without your consent.
Actionable over informational. Metrics without recommendations are just charts. Every observation should tell you what to do next.

Get in touch

We respond to every email. Reach us at support@agentmetrics.dev for questions, feedback, or to talk through whether AgentMetrics fits your stack.