Python SDK

Installation

Shell
pip install agentmetrics

Requires Python 3.9 or later.

Configuration

Call configure() once at startup before any tracking calls.

Python
import os
import agentmetrics

agentmetrics.configure(api_key=os.environ["AGENTMETRICS_API_KEY"])

Note

configure() must be called before any tracked function runs. Call it at module load time, not inside a function.
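
Because the key is read at import time, a missing environment variable surfaces as a bare KeyError. The sketch below shows one way to fail with a clearer message. read_api_key is a hypothetical helper, not part of the SDK; only the variable name AGENTMETRICS_API_KEY comes from the example above.

```python
import os
from typing import Mapping


def read_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key or fail with an actionable message.

    Hypothetical helper: not part of the agentmetrics package.
    """
    key = env.get("AGENTMETRICS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "AGENTMETRICS_API_KEY is not set; export it before starting the app"
        )
    return key
```

At module load time you would then call agentmetrics.configure(api_key=read_api_key()).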

Tracking an agent

@agentmetrics.track()

Wrap your agent function with the @track decorator. Every call is tracked automatically.

Python
@agentmetrics.track(agent_id="my-agent")
def my_agent(task: str) -> str:
    result = call_llm(task)  # replace with your agent logic
    return result

The decorator records start time, end time, success/failure status, and, if the function raises, the exception class name and full error message. The agent function's behavior is otherwise unchanged: return values pass through and exceptions propagate normally.
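
To make the record-then-re-raise behavior concrete, here is a minimal sketch of a decorator of this kind. It is an illustration only, not the SDK's implementation: track_sketch, RUNS, and flaky_agent are hypothetical names, and the in-memory list stands in for whatever the SDK sends to the dashboard.

```python
import functools
import time
from typing import Any, Callable

RUNS: list[dict[str, Any]] = []  # stand-in for the SDK's reporting backend


def track_sketch(agent_id: str) -> Callable:
    """Illustrative only: shows the record-then-re-raise pattern."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            run = {"agent_id": agent_id, "start": time.time(),
                   "status": "success", "error": None}
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                run["status"] = "failure"
                run["error"] = f"{type(exc).__name__}: {exc}"
                raise  # the caller still sees the original exception
            finally:
                run["end"] = time.time()
                RUNS.append(run)
        return wrapper
    return decorator


@track_sketch(agent_id="my-agent")
def flaky_agent(task: str) -> str:
    if not task:
        raise ValueError("empty task")
    return task.upper()
```

Calling flaky_agent("") raises ValueError exactly as the undecorated function would, while the last entry in RUNS carries the failure status and message.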

Parameters

Parameter   Type   Required   Description
agent_id    str    Yes        Identifier shown in the dashboard. Use lowercase with hyphens.
metadata    dict   No         Arbitrary key-value pairs attached to every run.
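
If you generate agent IDs programmatically, a small check can keep them consistent with the lowercase-with-hyphens recommendation. The sketch below is an assumption about what a reasonable convention looks like (lowercase letters and digits separated by single hyphens). The SDK's own validation rules, if any, are not described here, and is_valid_agent_id is a hypothetical helper.

```python
import re

# Assumed convention: lowercase letters and digits, separated by single hyphens.
_AGENT_ID = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_agent_id(agent_id: str) -> bool:
    """Return True if agent_id follows the assumed naming convention."""
    return _AGENT_ID.fullmatch(agent_id) is not None
```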

Async functions

The decorator works on both sync and async functions:

Python
@agentmetrics.track(agent_id="async-agent")
async def my_async_agent(task: str) -> str:
    result = await some_llm_call(task)
    return result
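
A decorator that supports both kinds of function typically inspects the wrapped callable and returns a matching wrapper. The sketch below shows that general pattern with the standard library. dual_mode and CALLS are hypothetical names, and nothing here is taken from the SDK's source.

```python
import asyncio
import functools
import inspect
from typing import Any, Callable

CALLS: list[str] = []  # records which wrapper handled each call


def dual_mode(fn: Callable) -> Callable:
    """Hypothetical decorator that wraps sync and async callables alike."""
    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
            CALLS.append(f"async:{fn.__name__}")
            # Await inside the wrapper so any timing covers the whole call.
            return await fn(*args, **kwargs)
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
        CALLS.append(f"sync:{fn.__name__}")
        return fn(*args, **kwargs)
    return sync_wrapper


@dual_mode
def sync_agent(task: str) -> str:
    return task[::-1]


@dual_mode
async def async_agent(task: str) -> str:
    await asyncio.sleep(0)
    return task[::-1]
```

The decorated async function is still a coroutine function, so callers keep awaiting it as before.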

Attaching metadata

Pass a metadata dict to tag every run from this agent:

Python
@agentmetrics.track(agent_id="support-agent", metadata={"env": "production"})
def support_agent(ticket_id: str, text: str) -> str:
    return handle_ticket(text)

Tracking steps

agentmetrics.step()

Use inside a tracked function to time named phases. Each step appears with its own latency in the dashboard.

Python
@agentmetrics.track(agent_id="pipeline")
def run(query: str) -> str:
    with agentmetrics.step("retrieve"):
        docs = vector_search(query)
    with agentmetrics.step("generate"):
        return call_llm(query, docs)

Works with async functions too:

Python
@agentmetrics.track(agent_id="async-pipeline")
async def run(query: str) -> str:
    async with agentmetrics.step("retrieve"):
        docs = await vector_search_async(query)
    return await call_llm_async(query, docs)
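
An object can serve both with and async with by implementing both context-manager protocols. The sketch below illustrates the idea with a plain timer. StepTimer is a hypothetical stand-in and does not report anything to AgentMetrics.

```python
import asyncio
import time
from typing import Optional


class StepTimer:
    """Hypothetical stand-in for a step: usable with `with` and `async with`."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.duration: Optional[float] = None
        self._start = 0.0

    def __enter__(self) -> "StepTimer":
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.duration = time.perf_counter() - self._start
        return False  # never swallow exceptions raised inside the step

    async def __aenter__(self) -> "StepTimer":
        return self.__enter__()

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        return self.__exit__(exc_type, exc, tb)


def sync_demo() -> StepTimer:
    with StepTimer("retrieve") as step:
        time.sleep(0.01)
    return step


async def async_demo() -> StepTimer:
    async with StepTimer("retrieve") as step:
        await asyncio.sleep(0.01)
    return step
```

Python runs the exit method even when the block ends with return, so a step that returns from inside its with block, as the "generate" step does in the sync example, is still closed and timed.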

Tracking tools

agentmetrics.tool()

Use inside a tracked function to time individual tool calls. Tool names, durations, and errors are recorded separately.

Python
@agentmetrics.track(agent_id="research-agent")
def run(query: str) -> str:
    with agentmetrics.tool("web_search"):
        results = web_search(query)
    with agentmetrics.tool("code_interpreter"):
        output = run_code(results)
    return summarize(output)
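
A tool call can fail and be handled inside the agent without failing the whole function. The sketch below shows how a context manager can record a tool's name, duration, and error while still letting the exception reach your own handler. tool_sketch and TOOL_CALLS are hypothetical names used for illustration, and how the SDK itself classifies a run with a handled tool error is not covered here.

```python
import time
from contextlib import contextmanager
from typing import Any, Iterator

TOOL_CALLS: list[dict[str, Any]] = []  # stand-in for per-tool records


@contextmanager
def tool_sketch(name: str) -> Iterator[None]:
    entry: dict[str, Any] = {"tool": name, "error": None}
    start = time.perf_counter()
    try:
        yield
    except Exception as exc:
        entry["error"] = f"{type(exc).__name__}: {exc}"
        raise  # let the agent decide how to recover
    finally:
        entry["duration_s"] = time.perf_counter() - start
        TOOL_CALLS.append(entry)


def demo() -> None:
    with tool_sketch("web_search"):
        pass
    try:
        with tool_sketch("code_interpreter"):
            raise TimeoutError("sandbox timed out")
    except TimeoutError:
        pass  # handled by the agent; the failure is still recorded


demo()
```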

Eval scores

agentmetrics.score()

Attach named numeric scores to the current run. Call inside any tracked function. Scores appear in the dashboard alongside latency and token data.

Python
@agentmetrics.track(agent_id="rag-agent")
def run(query: str) -> str:
    answer = generate_answer(query)
    agentmetrics.score("relevance", evaluate_relevance(query, answer))
    agentmetrics.score("groundedness", evaluate_groundedness(answer))
    return answer

Note

score() must be called inside a @track-decorated function. Calls made outside a tracked run log a warning and are otherwise ignored.
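
One common way to implement "attach to the current run, warn otherwise" is a context variable that holds the active run. The sketch below assumes that approach and is not the SDK's implementation. score_sketch and run_with_scores are hypothetical, and the boolean return value exists only to make the behavior visible.

```python
import contextvars
import logging

logger = logging.getLogger("score_sketch")

# Holds the scores dict of the active run, or None outside a run.
_current_scores = contextvars.ContextVar("current_scores", default=None)


def score_sketch(name: str, value: float) -> bool:
    """Attach a score to the active run; warn and ignore when there is none."""
    scores = _current_scores.get()
    if scores is None:
        logger.warning("score(%r) called outside a tracked run; ignored", name)
        return False
    scores[name] = float(value)
    return True


def run_with_scores() -> dict:
    """Simulate a tracked run by setting the context variable around the body."""
    scores: dict = {}
    token = _current_scores.set(scores)
    try:
        score_sketch("relevance", 0.9)
        score_sketch("groundedness", 1)
    finally:
        _current_scores.reset(token)  # leave no active run behind
    return scores
```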

Subagent tracking

When a @track-decorated function calls another @track-decorated function, the inner run is automatically linked to the outer run via parent_trace_id. No extra configuration is needed.

Python
@agentmetrics.track(agent_id="subagent")
def run_subagent(task: str) -> str:
    return call_llm(task)

@agentmetrics.track(agent_id="orchestrator")
def orchestrate(tasks: list[str]) -> list[str]:
    return [run_subagent(t) for t in tasks]

Each subagent run appears as a child of the orchestrator run in the dashboard, with its own tokens, latency, and status.

Getting the trace ID

agentmetrics.trace_id()

Returns the active trace ID from inside a tracked function. Use it to correlate AgentMetrics runs with your own logs.

Python
import logging

@agentmetrics.track(agent_id="my-agent")
def run(task: str) -> str:
    logging.info("trace_id=%s task=%s", agentmetrics.trace_id(), task)
    return call_llm(task)

Returns None when called outside a tracked function.
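
Rather than passing the trace ID to every logging call by hand, you can attach it once with a logging filter. The sketch below uses only the standard library. TraceIdFilter and build_logger are hypothetical helpers, and a lambda stands in for agentmetrics.trace_id so the example runs without the SDK.

```python
import io
import logging
from typing import Callable, Optional


class TraceIdFilter(logging.Filter):
    """Adds a `trace_id` attribute to every record passing through a handler."""

    def __init__(self, get_trace_id: Callable[[], Optional[str]]) -> None:
        super().__init__()
        self._get_trace_id = get_trace_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self._get_trace_id() or "-"  # "-" outside a tracked run
        return True


def build_logger(get_trace_id: Callable[[], Optional[str]], stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
    handler.addFilter(TraceIdFilter(get_trace_id))
    log = logging.getLogger("trace_sketch")
    log.handlers = [handler]
    log.setLevel(logging.INFO)
    log.propagate = False
    return log


buffer = io.StringIO()
demo_log = build_logger(lambda: "abc123", buffer)  # stand-in for agentmetrics.trace_id
demo_log.info("task=%s", "demo")
```

In an application you would pass agentmetrics.trace_id itself as get_trace_id. Because it returns None outside a tracked function, those log lines show "-" in place of an ID.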

Auto-capturing token usage

agentmetrics.instrument()

Call instrument() once after configure(). It patches the client libraries of the supported providers, such as the OpenAI and Anthropic Python clients, to capture token counts and model names automatically on every LLM call made within a tracked run.

Python
import os
import agentmetrics

agentmetrics.configure(api_key=os.environ["AGENTMETRICS_API_KEY"])
agentmetrics.instrument()

No changes to your existing client.chat.completions.create() (OpenAI) or client.messages.create() (Anthropic) calls are needed. Token counts are forwarded to the run record automatically.

Supported providers: OpenAI, Anthropic, LiteLLM, Google Gemini, Cohere, Mistral, LangChain, LlamaIndex.
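
"Patching" a client usually means replacing a method with a wrapper that reads usage data from the response and then returns the response unchanged. The sketch below shows that general technique against a fake client. FakeClient, patch_create, USAGE_LOG, and the response field names are all hypothetical, and none of it reflects how instrument() is actually implemented.

```python
import functools
from types import SimpleNamespace
from typing import Any, Callable

USAGE_LOG: list[dict[str, Any]] = []  # stand-in for the run record


class FakeClient:
    """Stand-in for a provider client; not a real OpenAI or Anthropic class."""

    def create(self, model: str, prompt: str) -> SimpleNamespace:
        usage = SimpleNamespace(input_tokens=len(prompt.split()), output_tokens=2)
        return SimpleNamespace(model=model, text="ok", usage=usage)


def patch_create(cls: type) -> None:
    """Replace cls.create with a wrapper that records token usage."""
    original: Callable = cls.create
    if getattr(original, "_patched", False):
        return  # idempotent: calling twice must not double-wrap

    @functools.wraps(original)
    def wrapper(self, *args: Any, **kwargs: Any) -> Any:
        response = original(self, *args, **kwargs)
        USAGE_LOG.append({
            "model": response.model,
            "input_tokens": response.usage.input_tokens,
            "output_tokens": response.usage.output_tokens,
        })
        return response  # the caller gets the untouched response

    wrapper._patched = True
    cls.create = wrapper


patch_create(FakeClient)
patch_create(FakeClient)  # second call is a no-op
reply = FakeClient().create(model="demo-model", prompt="three word prompt")
```

The idempotence guard matters for this style of patching: without it, a second call would wrap the wrapper and count every request twice.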

Error handling

Exceptions raised inside a tracked function are re-raised normally. AgentMetrics records the exception class name and message before re-raising, so no extra error handling is needed.

Python
@agentmetrics.track(agent_id="my-agent")
def my_agent(task: str) -> str:
    raise ValueError("something went wrong")  # tracked as failure, then re-raised