Observability
mcp serve (proxy mode) emits native OTLP traces and metrics. Default-off: without an env var, behavior is byte-identical to 0.5.2 — structured logs on stderr + local audit trail in chrondb.
Want to try it now? Jump to the hands-on quickstart — it spins up Jaeger +
mcp servein 3 minutes and proves the full path end-to-end (includingtraceparentpropagation and metrics).This page is the reference — what each attribute means, how to configure each vendor, and the escape hatches for when something goes sideways.
Escape hatches (read this first)
Running mcp serve in production and nervous about turning OTel on? Two switches worth knowing:
Unset
OTEL_EXPORTER_OTLP_ENDPOINT→ telemetry is fully disabled. Behavior is identical to 0.5.2. No rebuild, no rollback, just remove the env var from the deploy.MCP_OTEL_INJECT_TRACEPARENT=0→ keeps traces and metrics on, only disablestraceparentinjection on outbound calls. Use this if some odd backend rejects unknown headers (W3Ctraceparentis standard, but bad servers exist).
Configuration — standard OTel env vars
Everything is driven by OTel env vars. Nothing lives in servers.json.
OTEL_EXPORTER_OTLP_ENDPOINT
Sole activator. Empty or unset = OTel off.
OTEL_EXPORTER_OTLP_PROTOCOL
grpc (default) or http/protobuf.
OTEL_EXPORTER_OTLP_HEADERS
CSV k1=v1,k2=v2. HTTP only per spec.
OTEL_SERVICE_NAME
Resource service.name. Default mcp.
OTEL_RESOURCE_ATTRIBUTES
Extra resource attributes, CSV format.
MCP_OTEL_INJECT_TRACEPARENT
0/false/no disables outbound injection.
Quick diagnostic: when OTel boots, the very first line on stderr is
[telemetry] OpenTelemetry initialized — endpoint=... protocol=.... It prints directly to stderr and does not respectMCP_LOG_LEVEL. If you don't see it, OTel didn't start — check the env var.
Vendor recipes
Honeycomb
The public Honeycomb ingest only accepts HTTP/protobuf — gRPC won't work. Don't forget the x-honeycomb-team header.
Grafana Tempo (self-hosted)
grpc is the OTel SDK default — you can omit OTEL_EXPORTER_OTLP_PROTOCOL if you want.
Datadog (via Agent)
The Datadog Agent exposes an OTLP receiver on 4317/4318 once enabled.
Local (development)
To bring up Jaeger + see the whole flow on your machine, follow the hands-on quickstart.
Span attributes
Every request produces a single root span mcp.request with:
otel.kind
server
mcp.method
JSON-RPC method (tools/call, tools/list, resources/read, …)
mcp.transport
serve:http or serve:stdio
mcp.identity
Authenticated subject (JWT sub, bearer-token name, or anonymous)
mcp.server
Backend alias, only after routing resolves (tools/call, resources/read, prompts/get)
mcp.tool
Backend tool name
mcp.status
ok or error
Span duration = total time inside dispatch_request. Includes ACL evaluation, the brief proxy lock, the backend connection, and the backend call itself — useful to see where latency is going.
Inbound traceparent (from the client) is honored: the mcp span becomes a child of the client's span, no new trace is created. Outbound, mcp injects traceparent/tracestate automatically — an instrumented backend continues the trace.
Metrics
mcp.proxy.requests
counter
—
mcp.method, mcp.transport, mcp.status, mcp.identity, mcp.server, mcp.tool
mcp.proxy.request.duration
histogram
ms
same as the counter
mcp.proxy.classifier.cache.hits
counter
—
mcp.server
mcp.proxy.classifier.cache.misses
counter
—
mcp.server
mcp.proxy.backends.connected
gauge
—
—
mcp.proxy.sessions.active
gauge
—
—
* mcp.server and mcp.tool only appear when the request resolves to a backend (absent on auth/failure, unknown methods, malformed payload). The mcp.tool label carries the backend tool name (un-namespaced) — the namespace lives in mcp.server.
The PeriodicReader exports every 60s (OTel SDK default). When testing locally, wait ~65s after the first request before checking the exporter.
Cardinality — heads-up
mcp.identity discriminates per authenticated subject. If your subjects are per-user UUID JWTs, you'll generate many series. Honeycomb / Tempo / Datadog handle it, but it's a deliberate trade-off — if you only care about role-level breakdowns, drop the label upstream (in your collector config).
What it does NOT do
No Sentry / panic tracking. Errors flow as
mcp.status=erroron the span. Sentry is tracked in a separate issue.No continuous profiling.
No OTLP logs signal. The audit trail stays in chrondb (
mcp logs); the structured log keeps going to stderr, controlled byMCP_LOG_LEVEL/MCP_LOG_FORMAT.
Troubleshooting
If something went wrong, the quickstart has a longer checklist. In short:
No
[telemetry] OpenTelemetry initializedon stderr = the env var wasn't read. CheckOTEL_EXPORTER_OTLP_ENDPOINT.It initialized but nothing arrives = wrong endpoint, firewall, or a vendor that only accepts HTTP (
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf).traceparentnot reaching the backend = either telemetry didn't initialize, orMCP_OTEL_INJECT_TRACEPARENT=0is set.
Last updated
Was this helpful?