Metrics Reference¶

FastAgentic exports Prometheus-compatible metrics for monitoring agent performance, costs, and reliability.

Metrics Endpoint¶

# Prometheus format
curl http://localhost:8000/metrics

# JSON format
curl 'http://localhost:8000/metrics?format=json'
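
To consume the endpoint from code rather than curl, the Prometheus text format can be parsed line by line. The following is a minimal sketch using only the Python standard library. `parse_metrics` is a hypothetical helper (not part of FastAgentic), and the sample values are invented for illustration. It skips `# HELP` and `# TYPE` comment lines, ignores optional timestamps, and leaves escaped characters in label values as-is.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
# Anything after the value (an optional timestamp) is ignored.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value" (escapes are kept raw).
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        match = _SAMPLE.match(line)
        if match is None:
            continue  # not a sample line; skip rather than fail
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative input only; real values come from the /metrics endpoint.
SAMPLE_TEXT = """\
# TYPE fastagentic_requests_total counter
fastagentic_requests_total{method="GET",path="/chat",status="200"} 42
fastagentic_runs_active 3
"""

for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```

In practice the text would come from something like `urllib.request.urlopen("http://localhost:8000/metrics").read().decode()`. For anything beyond quick checks, a maintained parser such as the one in `prometheus_client.parser` is likely the safer choice.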

Available Metrics¶

Request Metrics¶

Metric	Type	Labels	Description
`fastagentic_requests_total`	Counter	`method`, `path`, `status`	Total HTTP requests
`fastagentic_request_duration_seconds`	Histogram	`method`, `path`	Request latency
`fastagentic_requests_in_progress`	Gauge	`method`, `path`	Active requests

Run Metrics¶

Metric	Type	Labels	Description
`fastagentic_runs_total`	Counter	`endpoint`, `status`, `adapter`	Total agent runs
`fastagentic_runs_active`	Gauge	`endpoint`, `adapter`	Currently active runs
`fastagentic_run_duration_seconds`	Histogram	`endpoint`, `adapter`	Run execution time
`fastagentic_runs_resumed_total`	Counter	`endpoint`	Runs resumed from checkpoint

Token Metrics¶

Metric	Type	Labels	Description
`fastagentic_tokens_total`	Counter	`endpoint`, `type`, `model`	Total tokens used
`fastagentic_tokens_prompt`	Counter	`endpoint`, `model`	Input tokens
`fastagentic_tokens_completion`	Counter	`endpoint`, `model`	Output tokens

Cost Metrics¶

Metric	Type	Labels	Description
`fastagentic_cost_usd_total`	Counter	`endpoint`, `model`, `tenant`	Total cost in USD
`fastagentic_cost_per_run_usd`	Histogram	`endpoint`, `model`	Cost per run

Checkpoint Metrics¶

Metric	Type	Labels	Description
`fastagentic_checkpoints_total`	Counter	`endpoint`, `backend`	Checkpoints created
`fastagentic_checkpoint_size_bytes`	Histogram	`endpoint`	Checkpoint size
`fastagentic_checkpoint_duration_seconds`	Histogram	`backend`	Checkpoint write time

Streaming Metrics¶

Metric	Type	Labels	Description
`fastagentic_stream_connections`	Gauge	`transport`	Active stream connections
`fastagentic_stream_events_total`	Counter	`endpoint`, `event_type`	Events streamed
`fastagentic_stream_duration_seconds`	Histogram	`endpoint`	Stream duration

Policy Metrics¶

Metric	Type	Labels	Description
`fastagentic_rate_limit_hits_total`	Counter	`endpoint`, `user`	Rate limit triggers
`fastagentic_quota_usage_ratio`	Gauge	`endpoint`, `user`, `quota_type`	Quota utilization
`fastagentic_policy_decisions_total`	Counter	`endpoint`, `decision`	Policy decisions by outcome (allow, deny, or downgrade)

Health Metrics¶

Metric	Type	Labels	Description
`fastagentic_durable_store_up`	Gauge	`backend`	Store connectivity
`fastagentic_durable_store_latency_seconds`	Histogram	`backend`, `operation`	Store latency

Prometheus Configuration¶

prometheus.yml¶

scrape_configs:
  - job_name: 'fastagentic'
    scrape_interval: 15s
    static_configs:
      - targets: ['my-agent:8000']
    metrics_path: /metrics

ServiceMonitor (Kubernetes)¶

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fastagentic
spec:
  selector:
    matchLabels:
      app: my-agent
  endpoints:
    - port: http
      path: /metrics
      interval: 15s

Grafana Dashboards¶

Overview Dashboard¶

{
  "panels": [
    {
      "title": "Requests per Second",
      "targets": [{
        "expr": "rate(fastagentic_requests_total[5m])"
      }]
    },
    {
      "title": "Error Rate",
      "targets": [{
        "expr": "sum(rate(fastagentic_requests_total{status=~\"5..\"}[5m])) / sum(rate(fastagentic_requests_total[5m]))"
      }]
    },
    {
      "title": "P99 Latency",
      "targets": [{
        "expr": "histogram_quantile(0.99, sum by (le) (rate(fastagentic_request_duration_seconds_bucket[5m])))"
      }]
    },
    {
      "title": "Active Runs",
      "targets": [{
        "expr": "sum(fastagentic_runs_active)"
      }]
    }
  ]
}

Cost Dashboard¶

{
  "panels": [
    {
      "title": "Daily Cost by Model",
      "targets": [{
        "expr": "sum(increase(fastagentic_cost_usd_total[24h])) by (model)"
      }]
    },
    {
      "title": "Cost per Tenant",
      "targets": [{
        "expr": "sum(increase(fastagentic_cost_usd_total[24h])) by (tenant)"
      }]
    },
    {
      "title": "Token Usage",
      "targets": [{
        "expr": "sum(rate(fastagentic_tokens_total[1h])) by (type)"
      }]
    }
  ]
}

Key Queries¶

RED Metrics¶

# Rate
rate(fastagentic_requests_total[5m])

# Errors
rate(fastagentic_requests_total{status=~"5.."}[5m])

# Duration (P50, P95, P99)
histogram_quantile(0.50, rate(fastagentic_request_duration_seconds_bucket[5m]))
histogram_quantile(0.95, rate(fastagentic_request_duration_seconds_bucket[5m]))
histogram_quantile(0.99, rate(fastagentic_request_duration_seconds_bucket[5m]))
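
The duration queries return estimates, not exact percentiles: for classic histograms, `histogram_quantile` finds the bucket containing the target rank and interpolates linearly inside it. The sketch below reproduces that calculation in Python to show why bucket boundaries limit precision. It is an illustration of the technique, not Prometheus's actual code, and the bucket counts are invented.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` mirrors the `le` series of a classic Prometheus histogram:
    counts are cumulative and the last bucket has an upper bound of +Inf.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, the last one being +Inf")

    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, so no quantile

    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)

    if index == len(buckets) - 1:
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]

    upper, count = buckets[index]
    if index == 0:
        if upper <= 0:
            return upper
        lower, below = 0.0, 0.0  # first bucket is assumed to start at 0
    else:
        lower, below = buckets[index - 1]

    if count == below:
        return lower  # empty bucket (only reachable when rank is 0)
    # Linear interpolation within the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Invented cumulative counts for illustration.
latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
for q in (0.50, 0.95, 0.99):
    print(q, histogram_quantile(q, latency_buckets))
```

Because the result can never exceed the highest finite bucket bound, a P99 that sits at exactly that bound usually means the buckets are too coarse for the observed latencies.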

Cost Analysis¶

# Total daily cost
sum(increase(fastagentic_cost_usd_total[24h]))

# Cost by endpoint
sum(increase(fastagentic_cost_usd_total[24h])) by (endpoint)

# Average cost per run
sum(rate(fastagentic_cost_usd_total[1h])) / sum(rate(fastagentic_runs_total[1h]))
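
These queries can also be run outside Grafana through the Prometheus HTTP API (`GET /api/v1/query`), for example to feed a cost report. The following is a minimal sketch, assuming a Prometheus server at `http://localhost:9090` (an assumption; adjust to your deployment). `build_query_url` and `vector_to_dict` are hypothetical helper names, and the payload is an illustrative example of the response shape, not real data.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def vector_to_dict(payload, label):
    """Map one label's value to the sample value of an instant-vector response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each result carries its labels in "metric" and [timestamp, "value"] in "value".
    return {
        item["metric"].get(label, ""): float(item["value"][1])
        for item in data["result"]
    }


PROMETHEUS_URL = "http://localhost:9090"  # assumed address
QUERY = "sum(increase(fastagentic_cost_usd_total[24h])) by (endpoint)"
print(build_query_url(PROMETHEUS_URL, QUERY))

# Illustrative response body, as returned by json.load() on the HTTP response.
example_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"endpoint": "/chat"}, "value": [1700000000, "12.5"]},
        ],
    },
}
print(vector_to_dict(example_payload, "endpoint"))
```

Sample values arrive as strings in the API response, which is why the helper converts them with `float`.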

Capacity Planning¶

# Runs per minute
sum(rate(fastagentic_runs_total[5m])) * 60

# Average run duration
rate(fastagentic_run_duration_seconds_sum[5m]) / rate(fastagentic_run_duration_seconds_count[5m])

# Average active runs per series (endpoint/adapter pair)
sum(fastagentic_runs_active) / count(fastagentic_runs_active)
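
The three capacity queries are related: by Little's law, average concurrency is roughly throughput multiplied by average duration. The sketch below uses that relation as a sanity check on the numbers the queries return. `expected_concurrency` is a hypothetical helper, and the inputs are invented for illustration.

```python
def expected_concurrency(runs_per_minute, avg_duration_seconds):
    """Estimate average concurrent runs via Little's law (L = lambda * W)."""
    if runs_per_minute < 0 or avg_duration_seconds < 0:
        raise ValueError("inputs must be non-negative")
    runs_per_second = runs_per_minute / 60
    return runs_per_second * avg_duration_seconds


# Illustrative inputs: 120 runs per minute, each taking 2.5 seconds on average.
print(expected_concurrency(120, 2.5))
```

If `sum(fastagentic_runs_active)` is consistently far above this estimate, runs are probably queuing or stalling rather than executing; treat the comparison as a rough guide, since it assumes steady-state traffic.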

Custom Metrics¶

Add custom metrics in your application:

from fastagentic.telemetry import metrics

# Counter
my_counter = metrics.counter(
    "my_custom_events_total",
    description="Custom event counter",
    labels=["event_type"]
)
my_counter.add(1, {"event_type": "success"})

# Histogram
my_histogram = metrics.histogram(
    "my_custom_duration_seconds",
    description="Custom duration",
    buckets=[0.1, 0.5, 1.0, 5.0, 10.0]
)
my_histogram.record(1.5)
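
A common use of a custom histogram is timing a function. The following sketch wraps that pattern in a decorator. It assumes only that the histogram object exposes `record(value)`, as in the example above; `timed` and `ListRecorder` are hypothetical names, and `ListRecorder` is a stand-in so the snippet runs without FastAgentic installed.

```python
import time
from functools import wraps


class ListRecorder:
    """Stand-in for a histogram: anything with a record(value) method works."""

    def __init__(self):
        self.values = []

    def record(self, value):
        self.values.append(value)


def timed(histogram):
    """Decorator that records a function's wall-clock duration in seconds."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record even when the call raises, so failures are timed too.
                histogram.record(time.perf_counter() - start)
        return wrapper
    return decorator


recorder = ListRecorder()  # replace with my_histogram in a real application


@timed(recorder)
def summarize(text):
    return text[:10]


summarize("a long document to summarize")
print(len(recorder.values))
```

Recording in a `finally` block keeps slow failures visible in the duration histogram; pair it with an error counter if failures should also be counted separately.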

Next Steps¶

Tracing - Distributed tracing setup
Alerting - Alert rules and runbooks
Environment Variables - Configuration