
Kubernetes

Reference manifests for running the MCP proxy in a Kubernetes cluster.

Quick start

# 1. Edit the ConfigMap with your server configuration
vim deploy/kubernetes/configmap.yaml

# 2. Create the Secret with your API tokens (not included in kustomize)
kubectl create namespace mcp
kubectl -n mcp create secret generic mcp-secrets \
  --from-literal=sentry-token=sntrys_...

# 3. Apply the manifests
kubectl apply -k deploy/kubernetes/

This creates:

  • mcp namespace

  • mcp-proxy Deployment (1 replica)

  • mcp-proxy ClusterIP Service on port 8080

  • mcp-config ConfigMap with your server configuration

The Secret must be created separately (step 2) to avoid committing real tokens to the repo.

Configuration

Server config via ConfigMap

Edit deploy/kubernetes/configmap.yaml with your MCP servers:

The proxy resolves ${VAR} placeholders from environment variables at startup. This keeps tokens out of the ConfigMap.

Secrets for tokens

Create the secret with your real tokens:

Then reference each token in the Deployment env:
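As a minimal sketch, the wiring could look like the fragment below. The Secret name mcp-secrets and key sentry-token come from the quick start; the container name mcp-proxy and the env var name SENTRY_TOKEN are assumptions, so use whatever name your ConfigMap placeholder expects.

```yaml
# Deployment pod template fragment (container and env var names are assumed)
spec:
  template:
    spec:
      containers:
        - name: mcp-proxy               # assumed container name
          env:
            - name: SENTRY_TOKEN        # hypothetical; must match ${SENTRY_TOKEN} in the ConfigMap
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets     # Secret created in the quick start
                  key: sentry-token     # key given to --from-literal
```

Add one env entry per token, each pointing at its own key in the Secret.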

OAuth tokens via Secret

For backends that use OAuth (Sentry, Honeycomb, GitHub Copilot, etc.), mcp keeps issued access/refresh tokens and dynamic-client registrations in auth.json. In a pod with a read-only root filesystem, mounting a writable auth.json is awkward — instead, inject the contents directly via MCP_AUTH_CONFIG.

1. Run the OAuth flow once on a workstation:

2. Push the resulting file into a Secret:

Use a Secret (not a ConfigMap) — the file contains live access tokens.

3. Inject it as an env var in the Deployment:
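A possible shape for that env entry is sketched below. The Secret name mcp-auth and the key auth.json are hypothetical; substitute the names you used when creating the Secret in step 2.

```yaml
# Container env fragment (Secret name and key are hypothetical)
env:
  - name: MCP_AUTH_CONFIG
    valueFrom:
      secretKeyRef:
        name: mcp-auth       # hypothetical Secret holding the auth store
        key: auth.json       # hypothetical key whose value is the full file contents
        optional: true       # lets the pod start when the Secret is absent
```

The optional flag mirrors the "optional" note in the environment variables reference; drop it if a missing auth store should block the rollout.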

The proxy reads the inline JSON at startup and keeps it in an in-memory store. OAuth refresh and dynamic-client registration update the in-memory copy so refreshed tokens stay coherent across requests within the pod's lifetime. Nothing is written to disk; the first save attempt emits a single warning log instead. On pod restart, the Secret is re-read and in-memory mutations are discarded.

Refresh strategy. Before the refresh tokens expire, rotate the Secret externally (sealed-secrets, external-secrets-operator, a CronJob that re-runs mcp add, etc.) and trigger a rolling update so that new pods pick it up; environment variables sourced from a Secret are only read when a container starts. The proxy itself is not designed to write back to the Secret.

Limitation. Since MCP_AUTH_CONFIG is read-only, you cannot run mcp add <server> against a running pod and have the registration persist. Always pre-provision the auth store on a workstation or in a one-off Job, then ship it via the Secret.

Pinning the image version

Edit deploy/kubernetes/kustomization.yaml:
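One way to pin the version is kustomize's images transformer, sketched below. The image name and tag are placeholders, not the project's real coordinates; the name must match the image referenced in the Deployment.

```yaml
# kustomization.yaml fragment (image name and tag are hypothetical)
images:
  - name: registry.example.com/mcp-proxy   # must equal the image name used in the Deployment
    newTag: "1.2.3"                        # pinned version instead of a floating tag
```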

Why --insecure?

The proxy refuses to bind non-loopback addresses without --insecure. In Kubernetes, the pod needs 0.0.0.0:8080 so the Service can route traffic to it. TLS termination happens at the Ingress or load balancer level, not at the proxy.

Health probes

The proxy exposes GET /health returning:

Why the probes are configured this way

Startup probe — gives 30s (failureThreshold: 6 * periodSeconds: 5) for the process to start and begin backend discovery. Discovery is async, so the proxy serves immediately but backends connect in the background.

Liveness probe — checks every 30s that the process responds to HTTP. Backend failures are degraded state, not a reason to restart the pod. If Sentry is down, the proxy still serves Grafana tools.

Readiness probe — checks every 10s. The proxy is ready to serve as soon as it starts because it lazy-connects backends on first request. A probe failure here means the process itself is unhealthy.

Do not use backends_connected > 0 as a readiness condition. The proxy is designed to start with zero connections and connect on demand.
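Put together, the three probes described above could be declared roughly as follows. The intervals, thresholds, path, and port are taken from this page; any field not mentioned here is left at the Kubernetes default, and the shipped manifest may set more.

```yaml
# Container-level probe fragment
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 5
  failureThreshold: 6      # 6 x 5s = 30s startup budget
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 30        # process-level check only
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
```

All three probes hit the same endpoint and only test that the process answers HTTP, which matches the guidance not to gate readiness on connected backends.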

Application logs

Application logs (tracing events from the proxy itself — startup, backend discovery, request errors, OAuth flows) go to stderr by default and are captured by the kubelet, so kubectl logs just works. Two env vars tune this for production:
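A sketch of the two variables in the Deployment env, using the filter string from the environment variables reference on this page:

```yaml
# Container env fragment
env:
  - name: MCP_LOG_LEVEL
    value: "mcp=debug,hyper=warn,reqwest=warn,h2=warn"   # proxy at debug, HTTP stack at warn
  - name: MCP_LOG_FORMAT
    value: "json"                                        # newline-delimited JSON on stderr
```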

MCP_LOG_LEVEL follows tracing's EnvFilter syntax. Set the global level with info/debug/trace, or scope per module with target=level separated by commas. The example above keeps the proxy at debug while silencing hyper/reqwest/h2 chatter — which dominates request volume otherwise.

MCP_LOG_FORMAT=json swaps the human-readable formatter for newline-delimited JSON. Each line is a complete event with timestamp, level, message, and structured fields. Pair with the audit stream below and you get a single tail-able log surface — every line is JSON, no mixed formats.

Why stderr, not stdout, for app logs? In mcp serve, audit logs go to stdout by default (auto-promotion of file to file+stdout) and they're the structured product surface. Application/tracing logs go to stderr as the diagnostic surface. Kubernetes captures both in the same kubectl logs stream by default — split them downstream with jq (audit lines have method/identity; tracing lines have level/target).

Audit logging

By default, audit logging is disabled (MCP_AUDIT_ENABLED=false) because the scratch-based image has no writable filesystem.

Option A: Stream to stdout (no PVC needed)

Set MCP_AUDIT_OUTPUT=stdout in the Deployment env. Audit entries are emitted as JSON lines to stdout and captured by your cluster's log pipeline (Fluentd, Loki, CloudWatch, etc.). No persistent storage required.
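As a sketch, the env entries for Option A might look like this. Setting MCP_AUDIT_ENABLED alongside it is an assumption based on the manifests shipping with audit logging disabled; check the environment variables reference if your version behaves differently.

```yaml
# Container env fragment
env:
  - name: MCP_AUDIT_ENABLED
    value: "true"        # assumption: needed because the manifest default is false
  - name: MCP_AUDIT_OUTPUT
    value: "stdout"      # JSON lines to stdout only, no file writes
```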

If you want both PVC persistence (queryable via mcp logs in kubectl exec) and the cluster log pipeline, leave MCP_AUDIT_OUTPUT unset — mcp serve --http auto-promotes the default file to file+stdout for exactly this case. Setting MCP_AUDIT_OUTPUT=file+stdout explicitly also works (and is honored verbatim).

Option B: Persist to a PVC

  1. Set MCP_AUDIT_ENABLED=true in the Deployment env

  2. Mount persistent storage at /data:

  3. Uncomment pvc.yaml in kustomization.yaml:

  4. Apply:

Audit logs are written to /data/audit/data and indexed at /data/audit/index (controlled by MCP_AUDIT_PATH and MCP_AUDIT_INDEX_PATH).
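The Option B steps can be sketched as the two fragments below. The container name, the volume name data, and the claim name mcp-data are assumptions; use the names defined in your deployment and in pvc.yaml. After editing, apply with the same kubectl apply -k deploy/kubernetes/ command as in the quick start.

```yaml
# kustomization.yaml fragment: include the PVC among the resources
resources:
  - configmap.yaml
  - pvc.yaml                 # previously commented out
  # other shipped resources stay as they are
---
# Deployment pod template fragment
spec:
  template:
    spec:
      containers:
        - name: mcp-proxy              # assumed container name
          env:
            - name: MCP_AUDIT_ENABLED
              value: "true"
          volumeMounts:
            - name: data
              mountPath: /data         # audit data and index live under /data/audit
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: mcp-data        # hypothetical; must match metadata.name in pvc.yaml
```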

Security context

The manifests include a hardened security context:

The image is based on scratch — a static binary with no shell, no package manager, no libc. The process runs as UID 0 by default (the Dockerfile doesn't set USER), but scratch itself does not require root. The attack surface is minimal regardless of UID: no shell to exec into, no tools to exploit, read-only filesystem.

If your cluster policy requires runAsNonRoot: true, set a numeric runAsUser and ensure mounted volumes (/tmp, /data) are writable for that UID — either via fsGroup or an initContainer:
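A minimal sketch of the fsGroup variant follows. The UID/GID 65532 is an arbitrary example value, not something the image requires; any non-zero numeric ID should work because the image has no /etc/passwd to resolve user names.

```yaml
# Deployment pod template fragment (IDs are hypothetical)
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65532     # numeric UID is required; the image sets no USER
        runAsGroup: 65532
        fsGroup: 65532       # supported volume types are made group-writable for this GID
```

Some volume types ignore fsGroup; in that case an initContainer that adjusts ownership of /data is the fallback mentioned above.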

Scaling

Each replica is fully independent — own backend pool, own tool cache, own connections. There's no shared state, no leader election, no coordination needed.

Scaling to N replicas means:

  • N independent connections to each backend

  • N copies of the tool/resource/prompt cache in memory

  • Clients are load-balanced across replicas by the Service

This is fine for most deployments. Be aware that stdio-based backends (which spawn child processes) will have N copies of each process running across the cluster.

Graceful shutdown

When Kubernetes sends SIGTERM (during rolling updates or scale-down):

  1. The proxy stops accepting new connections

  2. In-flight requests finish normally

  3. Backend clients are shut down in parallel (5s timeout each)

  4. Total internal cleanup is bounded to ~10s

terminationGracePeriodSeconds: 30 in the Deployment gives enough headroom. After 30s, Kubernetes sends SIGKILL.
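For orientation, the setting lives at the pod spec level rather than on the container. A sketch of its placement:

```yaml
# Deployment pod template fragment
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30   # ~10s internal cleanup plus headroom for in-flight requests
```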

Environment variables reference

| Variable | Manifest value | Description |
| --- | --- | --- |
| MCP_SERVERS_CONFIG | (from ConfigMap) | Inline JSON config (highest priority) |
| MCP_AUTH_CONFIG | (from Secret, optional) | Inline OAuth tokens (auth.json). Read-only — writes are no-ops. |
| MCP_PROXY_REQUEST_TIMEOUT | 120 (app default) | Max seconds per JSON-RPC request |
| MCP_LOG_LEVEL | info | tracing EnvFilter (e.g. mcp=debug,hyper=warn,reqwest=warn,h2=warn) |
| MCP_LOG_FORMAT | text | json for newline-delimited JSON to stderr (log drivers) |
| MCP_AUDIT_ENABLED | false | Enable audit logging |
| MCP_AUDIT_OUTPUT | unset (→ file+stdout in mcp serve --http) | stdout for cluster log pipeline only, file for PVC only (setting this env var is treated as explicit and skips auto-promotion), file+stdout for both PVC and pipeline (the auto-promoted default in serve), none to disable |
| MCP_AUDIT_PATH | /data/audit/data | Audit data directory (app default: ~/.config/mcp/db/data) |
| MCP_AUDIT_INDEX_PATH | /data/audit/index | Audit index directory (app default: ~/.config/mcp/db/index) |
| MCP_CLASSIFIER_CACHE | /tmp/tool-classification.json | Tool classification cache (app default: ~/.config/mcp/tool-classification.json) |

Full reference: Environment variables

Exposing outside the cluster

The Service is ClusterIP by default. To expose externally, add an Ingress:
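A minimal Ingress sketch is shown below. The Service name, namespace, and port come from the quick start; the host name, ingress class, and TLS Secret name are placeholders for whatever your cluster uses.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mcp-proxy
  namespace: mcp
spec:
  ingressClassName: nginx            # assumption; use your cluster's ingress class
  tls:
    - hosts:
        - mcp.example.com            # hypothetical host
      secretName: mcp-proxy-tls      # hypothetical TLS certificate Secret
  rules:
    - host: mcp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mcp-proxy      # ClusterIP Service created by the manifests
                port:
                  number: 8080
```

Terminating TLS here is consistent with the proxy itself running with --insecure behind the Service.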

Troubleshooting

Pod starts but backends never connect

Check the ConfigMap config is valid JSON:

Check the proxy logs:

Look for [serve] discovering tools from ... lines. If you see failed to discover, the backend URL or token is wrong.

Health probe fails on startup

Increase the startup probe threshold:
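For example, doubling failureThreshold while keeping the 5s period described under "Health probes" extends the startup budget from 30s to 60s. The value 12 is only an illustration; pick whatever your backends need.

```yaml
# Container-level probe fragment
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 5
  failureThreshold: 12     # 12 x 5s = 60s before the kubelet restarts the container
```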

Token not resolving

Ensure the Secret key matches what the Deployment env references, and that the ${VAR_NAME} in the ConfigMap matches the env var name exactly. If a referenced env var is missing, the placeholder is replaced with an empty string silently — verify the resolved config by checking the proxy logs for authentication failures on backend connections.

Read-only filesystem errors

If you see permission errors, make sure the tmp and data volumes are mounted. The scratch image has no writable paths without explicit volume mounts.
