AgentMetrics Docs
Open-source observability for AI agents. Track cost, latency, token usage, failures, and tool calls, all running on your own infrastructure.
How it works
Two components work together.
Server — receives and stores run data from your agents. Dashboard at http://localhost:3099, API at http://localhost:8099. You run this yourself.
SDK or integration — a few lines of code in your agent that send run data to the server. Works with Python, JavaScript, and the agent frameworks listed under Integrations.
No account. No external service. Your data stays on your infrastructure.
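As a quick sanity check of the two-component layout, the sketch below tests whether anything is listening on the dashboard and API ports mentioned above. It uses only the Python standard library and a plain TCP connect, assumes the default localhost ports, and does not call any AgentMetrics endpoint. The name `port_is_open` is illustrative and not part of the SDK.

```python
import socket

# Default addresses taken from the description above; adjust them if your
# server is bound to a different host or port.
DASHBOARD = ("localhost", 3099)
API = ("localhost", 8099)


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    for name, (host, port) in {"dashboard": DASHBOARD, "api": API}.items():
        state = "reachable" if port_is_open(host, port) else "not reachable"
        print(f"{name} at {host}:{port} is {state}")
```

A successful connect only shows that some process is listening on the port; it does not confirm that the process is the AgentMetrics server.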
Where to start
Starting fresh? Quickstart gets the server running and your first agent tracked in under five minutes.
Deploying for a team or to the cloud? Deploy covers Docker Compose, a pip install run as a system service, Render, Fly.io, and Railway.
Server already running? Jump straight to your SDK or integration.
Integrations
| Integration | Install | Requires |
|---|---|---|
| Python SDK | `pip install agentmetrics` | Python 3.9+ |
| JavaScript SDK | `npm install agentmetrics` | Node.js 18+ |
| LangChain (Python) | `pip install agentmetrics-langchain` | Python 3.10+ |
| LangChain (JS) | `npm install agentmetrics-langchain` | Node.js 18+ |
| CrewAI | `pip install agentmetrics-crewai` | Python 3.10+ |
| LlamaIndex | `pip install agentmetrics-llamaindex` | Python 3.10+ |
| Anthropic Managed Agents (Python) | `pip install agentmetrics-anthropic` | Python 3.10+ |
| Anthropic Managed Agents (JS) | `npm install agentmetrics-anthropic` | Node.js 18+ |
| OpenAI Agents SDK | `pip install agentmetrics-openai-agents` | Python 3.10+ |
| AutoGen | `pip install agentmetrics-autogen` | Python 3.10+ |
| Hermes plugin | `hermes plugins install agentmetrics-hermes` | Hermes 2026.1.0+ |
| OpenClaw plugin | `openclaw plugins install agentmetrics-openclaw` | OpenClaw 2026.3.2+ |
What gets tracked
| Metric | Description |
|---|---|
| Duration | Wall-clock time from first call to final response |
| Cost | Estimated from token counts and model pricing tables |
| Token usage | Input, output, and cached tokens per run |
| Status | `success` or `failure` |
| Error messages | Full error text on failures |
| Tool calls | Count, names, and errors per run |
| Model | Which model was used (captured automatically via `instrument()`) |
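The Cost row can be made concrete with a small worked example. This is a minimal sketch, not the server's actual calculation: it assumes a hypothetical price table in USD per million tokens, a placeholder model name with made-up prices, and that input, output, and cached token counts are disjoint and each billed at its own rate. The names `PRICING`, `TokenUsage`, and `estimate_cost` are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per one million tokens. The model name and the
# numbers are placeholders for illustration, not real pricing.
PRICING = {
    "example-model": {"input": 2.00, "output": 8.00, "cached": 0.50},
}


@dataclass
class TokenUsage:
    input: int = 0   # uncached input tokens (assumed disjoint from cached)
    output: int = 0  # tokens generated by the model
    cached: int = 0  # input tokens served from cache


def estimate_cost(model: str, usage: TokenUsage) -> float:
    """Estimate the cost of one run in USD from token counts and a price table."""
    prices = PRICING[model]  # raises KeyError for models without a price entry
    total = (
        usage.input * prices["input"]
        + usage.output * prices["output"]
        + usage.cached * prices["cached"]
    )
    return total / 1_000_000


usage = TokenUsage(input=12_000, output=3_000, cached=40_000)
print(f"{estimate_cost('example-model', usage):.4f}")  # prints 0.0680
```

Because the result depends entirely on the price table, an estimate of this kind is only as current as the prices it was built from.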