Open source observability for AI agents. See every run, every failure, every dollar. The moment it happens.
Works with






The first sign is usually a surprise invoice.
Most developers find out too late.
AgentMetrics tells you first.
What you see from run one
Total runs
2,847
Avg cost
$0.031
Avg latency
2.3s
Success rate
94.2%
Runs / cost, last 12h
Switch to claude-3-haiku for 90% of runs
Save $612/mo
Performance
How long every step takes and exactly where your agent slows down.
Spot the bottleneck before your users do.
Cost
What each run costs, which model is spending the most, and why.
Stop paying for runs that do not work.
Quality
Which runs failed, what the top error signatures are, and how the failure rate trended this week.
Fix the right thing first.
Reliability
When retry storms hit, how bad they got, and what triggered them.
Know the moment a threshold is crossed.
Business
Team-wide fleet view, per-agent SLA monitoring, and AI-generated cost optimization recommendations.
The number your CFO will ask for.
pip install agentmetrics. Add the decorator. Run your agent. Works with LangChain, CrewAI, LlamaIndex, Anthropic, OpenAI Agents, OpenClaw, Hermes, and any custom Python code.
pip install agentmetrics
@agentmetrics.track()
def my_agent(task):
return llm.complete(task)Every token, every tool call, every retry — captured automatically from the first run. No sampling. No extra config.
See which agents burn the most, which fail silently, and what each run costs. The full picture, from run one to a million.
Before
$1,240/mo
After
$498/mo
Everyone who runs AgentMetrics finds something worth fixing in the first run.
Open source. MIT licensed.