Skip to content

Monitoring & Observability

Every log line is a JSON object. Key fields:

FieldTypeDescription
msgstringEvent type — "request", "payment_settled", "settlement_failed", etc.
timestampstringISO 8601 timestamp
levelstring"debug", "info", "warn", "error"
methodstringHTTP method
pathstringRequest path
routestringMatched route pattern, e.g. "POST /v1/messages"
statusnumberHTTP status returned to client
duration_msnumberTotal request duration
pricestringPrice charged, e.g. "$0.075"
payerstringPayer wallet address
tx_hashstringOn-chain transaction hash
errorstringError message, if any

Log levels: info (successful requests), warn (402s, rate limits), error (settlement/upstream failures), debug (verbose, disable in production).

Enable with:

gateway:
metrics: true
MetricLabelsDescription
tollbooth_requests_totalroute, method, statusTotal requests
tollbooth_payments_totalroute, outcomePayment attempts (success, rejected, missing)
tollbooth_settlements_totalstrategy, outcomeSettlement attempts (success, failure)
tollbooth_cache_hits_totalrouteVerification cache hits
tollbooth_cache_misses_totalrouteVerification cache misses
tollbooth_rate_limit_blocks_totalrouteRequests blocked by rate limiting
tollbooth_upstream_errors_totalupstream, statusNon-2xx responses from upstreams
tollbooth_revenue_usd_totalrouteCumulative revenue in USD
MetricLabelsDescription
tollbooth_request_duration_secondsroute, methodEnd-to-end request latency
tollbooth_settlement_duration_secondsstrategySettlement latency
tollbooth_upstream_duration_secondsupstreamUpstream response latency
MetricDescription
tollbooth_active_requestsCurrently in-flight requests
# p95 latency per route
histogram_quantile(0.95, sum by (le, route, method) (rate(tollbooth_request_duration_seconds_bucket[5m])))
# 402 rate
sum(rate(tollbooth_requests_total{status="402"}[5m])) / sum(rate(tollbooth_requests_total[5m]))
# Settlement success rate
rate(tollbooth_settlements_total{outcome="success"}[5m]) / rate(tollbooth_settlements_total[5m])
# Revenue rate by route
rate(tollbooth_revenue_usd_total[5m])
scrape_configs:
- job_name: tollbooth
metrics_path: /metrics
static_configs:
- targets: ["localhost:3000"]

High 402 rate — Check if clients are sending the payment-signature header. Verify route prices haven’t changed. Check the error field on status: 402 log lines.

Settlement failures — Check facilitator endpoint health. Look at duration_ms on settlement_failed log lines for timeouts.

High upstream latency — Compare duration_ms with upstream timing. If they’re close, tollbooth isn’t the bottleneck.


Next: Scaling & Shared Stores →