Python SDK
Installation
pip install agentmetrics
Requires Python 3.9 or later.
Configuration
Call configure() once at startup before any tracking calls.
import os
import agentmetrics
agentmetrics.configure(api_key=os.environ["AGENTMETRICS_API_KEY"])
Note
configure() must be called before any tracked function runs. Call it at module load time, not inside a function.
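Reading the key with os.environ["AGENTMETRICS_API_KEY"] raises a bare KeyError when the variable is missing. A small helper can fail with a clearer message at startup. This is only a sketch: require_env is a hypothetical helper, not part of the SDK.

```python
import os


def require_env(name: str, environ=os.environ) -> str:
    """Return a non-empty environment variable or raise a descriptive error."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return value


# At module load time, before any tracked function runs:
# agentmetrics.configure(api_key=require_env("AGENTMETRICS_API_KEY"))
```

The configure() call is left as a comment so the snippet runs without the SDK installed.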
Tracking an agent
@agentmetrics.track()
Wrap your agent function with the @track decorator. Every call is tracked automatically.
@agentmetrics.track(agent_id="my-agent")
def my_agent(task: str) -> str:
    result = f"handled: {task}"  # replace with your agent logic
    return result
The decorator records start time, end time, success/failure status, and the full error message on exceptions. The agent function's behavior is unchanged: errors propagate normally.
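To make the recorded fields concrete, here is a minimal sketch of how a decorator of this kind could behave. It is not the AgentMetrics implementation: track_sketch and the in-memory RUNS list are hypothetical stand-ins, and the real SDK presumably ships run records to its backend instead.

```python
import functools
import time

RUNS = []  # hypothetical in-memory sink for finished run records


def track_sketch(agent_id):
    """Illustrative stand-in for a tracking decorator (not the real SDK)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            run = {"agent_id": agent_id, "start": time.time(),
                   "status": "success", "error": None}
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                run["status"] = "failure"
                run["error"] = f"{type(exc).__name__}: {exc}"
                raise  # the caller still sees the original exception
            finally:
                run["end"] = time.time()
                RUNS.append(run)
        return wrapper
    return decorator


@track_sketch(agent_id="my-agent")
def my_agent(task: str) -> str:
    if not task:
        raise ValueError("empty task")
    return task.upper()
```

The bare raise inside the except block re-raises the same exception object, so the caller's own try/except logic keeps working unchanged.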
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| agent_id | str | Yes | Identifier shown in the dashboard. Use lowercase with hyphens. |
| metadata | dict | No | Arbitrary key-value pairs attached to every run. |
Async functions
The decorator works on both sync and async functions:
@agentmetrics.track(agent_id="async-agent")
async def my_async_agent(task: str) -> str:
    result = await some_llm_call(task)
    return result
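One common way for a single decorator to support both kinds of function is to check inspect.iscoroutinefunction and return a matching wrapper. The sketch below illustrates that general technique. Whether AgentMetrics works this way internally is an assumption, and track_sketch and CALLS are hypothetical names.

```python
import asyncio
import functools
import inspect

CALLS = []  # hypothetical in-memory record of finished runs


def track_sketch(agent_id):
    """Illustrative decorator that wraps sync and async functions alike."""
    def decorator(fn):
        if inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                try:
                    # Await here so the run covers the whole coroutine.
                    return await fn(*args, **kwargs)
                finally:
                    CALLS.append((agent_id, "async"))
            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            finally:
                CALLS.append((agent_id, "sync"))
        return sync_wrapper
    return decorator


@track_sketch(agent_id="async-agent")
async def my_async_agent(task: str) -> str:
    await asyncio.sleep(0)  # stands in for an awaited LLM call
    return f"done: {task}"


@track_sketch(agent_id="sync-agent")
def my_sync_agent(task: str) -> str:
    return f"done: {task}"
```

Without the async branch, a plain wrapper would finish as soon as the coroutine object is created, before any of the agent's work has run.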
Attaching metadata
Pass a metadata dict to tag every run from this agent:
@agentmetrics.track(agent_id="support-agent", metadata={"env": "production"})
def support_agent(ticket_id: str, text: str) -> str:
    return handle_ticket(text)
Tracking steps
agentmetrics.step()
Use inside a tracked function to time named phases. Each step appears with its own latency in the dashboard.
@agentmetrics.track(agent_id="pipeline")
def run(query: str) -> str:
    with agentmetrics.step("retrieve"):
        docs = vector_search(query)
    with agentmetrics.step("generate"):
        return call_llm(query, docs)
Works with async functions too:
@agentmetrics.track(agent_id="async-pipeline")
async def run(query: str) -> str:
    async with agentmetrics.step("retrieve"):
        docs = await vector_search_async(query)
    return await call_llm_async(query, docs)
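For readers curious how one object can serve both with and async with, the sketch below shows a timer class that implements both context-manager protocols. It is an illustration only: StepSketch and STEP_TIMINGS are hypothetical names, and the actual step() implementation may differ.

```python
import asyncio
import time

STEP_TIMINGS = []  # hypothetical sink: (step name, elapsed seconds)


class StepSketch:
    """Illustrative timer usable with both `with` and `async with`."""

    def __init__(self, name: str):
        self.name = name

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        STEP_TIMINGS.append((self.name, time.perf_counter() - self._start))
        return False  # never swallow exceptions raised inside the step

    async def __aenter__(self):
        return self.__enter__()

    async def __aexit__(self, exc_type, exc, tb):
        return self.__exit__(exc_type, exc, tb)


def run(query: str) -> str:
    with StepSketch("retrieve"):
        docs = [query.lower()]
    with StepSketch("generate"):
        return f"answer from {len(docs)} doc(s)"


async def run_async(query: str) -> str:
    async with StepSketch("retrieve"):
        await asyncio.sleep(0)  # stands in for an awaited search call
    return query
```

Returning from inside a with block still runs __exit__, so a step that ends with a return statement is timed like any other.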
Tracking tools
agentmetrics.tool()
Use inside a tracked function to time individual tool calls. Tool names, durations, and errors are recorded separately.
@agentmetrics.track(agent_id="research-agent")
def run(query: str) -> str:
    with agentmetrics.tool("web_search"):
        results = web_search(query)
    with agentmetrics.tool("code_interpreter"):
        output = run_code(results)
    return summarize(output)
Eval scores
agentmetrics.score()
Attach named numeric scores to the current run. Call inside any tracked function. Scores appear in the dashboard alongside latency and token data.
@agentmetrics.track(agent_id="rag-agent")
def run(query: str) -> str:
    answer = generate_answer(query)
    agentmetrics.score("relevance", evaluate_relevance(query, answer))
    agentmetrics.score("groundedness", evaluate_groundedness(answer))
    return answer
Note
score() must be called inside a @track-decorated function. Calls outside a tracked run are logged as a warning and ignored.
Subagent tracking
When a @track-decorated function calls another @track-decorated function, the inner run is automatically linked to the outer run via parent_trace_id. No extra configuration is needed.
@agentmetrics.track(agent_id="subagent")
def run_subagent(task: str) -> str:
    return call_llm(task)


@agentmetrics.track(agent_id="orchestrator")
def orchestrate(tasks: list[str]) -> list[str]:
    return [run_subagent(t) for t in tasks]
Each subagent run appears as a child of the orchestrator run in the dashboard, with its own tokens, latency, and status.
Getting the trace ID
agentmetrics.trace_id()
Returns the active trace ID from inside a tracked function. Use it to correlate AgentMetrics runs with your own logs.
import logging

@agentmetrics.track(agent_id="my-agent")
def run(task: str) -> str:
    logging.info("trace_id=%s task=%s", agentmetrics.trace_id(), task)
    return call_llm(task)
Returns None when called outside a tracked function.
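Rather than passing the trace ID into every log call, a standard logging.Filter can stamp it onto each record. The sketch below uses only the standard library. TraceIdFilter and fake_trace_id are hypothetical names, and fake_trace_id stands in for agentmetrics.trace_id so the snippet runs without the SDK.

```python
import logging


class TraceIdFilter(logging.Filter):
    """Adds a `trace_id` attribute to every log record."""

    def __init__(self, get_trace_id):
        super().__init__()
        self._get_trace_id = get_trace_id

    def filter(self, record: logging.LogRecord) -> bool:
        # A missing trace ID (None outside a tracked run) is rendered as "-".
        record.trace_id = self._get_trace_id() or "-"
        return True  # never drop the record


def fake_trace_id():
    """Hypothetical stand-in; in an application pass agentmetrics.trace_id."""
    return None


handler = logging.StreamHandler()
handler.addFilter(TraceIdFilter(fake_trace_id))
handler.setFormatter(
    logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
)

logger = logging.getLogger("agent-sketch")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("starting task")
```

Attaching the filter to the handler rather than the logger means records that propagate up from child loggers are stamped too.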
Auto-capturing token usage
agentmetrics.instrument()
Call instrument() once after configure(). It patches the supported LLM client libraries, including the OpenAI and Anthropic Python clients, to capture token counts and model names automatically on every LLM call made within a tracked run.
import os
import agentmetrics
agentmetrics.configure(api_key=os.environ["AGENTMETRICS_API_KEY"])
agentmetrics.instrument()
No changes to your existing openai.chat.completions.create() or anthropic.messages.create() calls are needed. Token counts are forwarded to the run record automatically.
Supported providers: OpenAI, Anthropic, LiteLLM, Google Gemini, Cohere, Mistral, LangChain, LlamaIndex.
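"Patching" a client generally means replacing a method with a wrapper that calls the original and inspects its response. The sketch below shows that general technique on a fake client. FakeClient, instrument_sketch, and TOKEN_LOG are hypothetical, the response shape is invented for the example, and none of this reflects the SDK's actual patching code.

```python
import functools
from types import SimpleNamespace

TOKEN_LOG = []  # hypothetical sink for captured usage


class FakeClient:
    """Stand-in for an LLM client; not a real provider SDK."""

    def create(self, model: str, prompt: str):
        usage = SimpleNamespace(input_tokens=len(prompt.split()), output_tokens=3)
        return SimpleNamespace(model=model, usage=usage, text="ok")


def instrument_sketch(client_cls, method_name="create"):
    """Replace client_cls.<method_name> with a wrapper that records usage."""
    original = getattr(client_cls, method_name)
    if getattr(original, "_instrumented", False):
        return  # already patched; avoid double-wrapping on repeat calls

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        response = original(self, *args, **kwargs)
        TOKEN_LOG.append(
            {
                "model": response.model,
                "input_tokens": response.usage.input_tokens,
                "output_tokens": response.usage.output_tokens,
            }
        )
        return response  # callers get the untouched response

    wrapper._instrumented = True
    setattr(client_cls, method_name, wrapper)


instrument_sketch(FakeClient)
instrument_sketch(FakeClient)  # second call is a no-op
reply = FakeClient().create(model="demo-model", prompt="hello there world")
```

Because the wrapper returns the original response object, existing call sites need no changes, which matches the behavior described above.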
Error handling
Exceptions raised inside a tracked function are re-raised normally. AgentMetrics records the exception class name and message before re-raising, so no extra error handling is needed.
@agentmetrics.track(agent_id="my-agent")
def my_agent(task):
    raise ValueError("something went wrong")  # tracked as failure, then re-raised