
Docker & Helm Deployment

Production-ready deployment using Docker Compose or Helm on Kubernetes. Covers the official image, configuration, persistence, monitoring, ingress, and security hardening.

The official image is available at ghcr.io/anthropics/pilot. It bundles the Pilot binary, Claude Code CLI, Git, and the GitHub CLI (gh) in a single Ubuntu-based image running as a non-root user.


Quick Start

The fastest way to run Pilot in a container:

Copy the example config

```bash
cp configs/pilot.example.yaml config.yaml
# Edit config.yaml — set your repo, project_path, and adapter settings
```

Set environment variables

```bash
export GITHUB_TOKEN="your-github-pat"
export ANTHROPIC_API_KEY="your-anthropic-key"
```

Start with Docker Compose

```bash
docker compose up -d
docker compose logs -f pilot
```

Pilot starts polling for issues labeled pilot on the configured repository within 30 seconds.


Docker Image

Pull from GHCR

```bash
# Latest stable release
docker pull ghcr.io/anthropics/pilot:latest

# Pin to a specific version (recommended for production)
docker pull ghcr.io/anthropics/pilot:v2.56.0
```

Build from Source

```bash
# Build with version metadata
docker build \
  --build-arg VERSION=$(git describe --tags --always) \
  --build-arg BUILD_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) \
  -t pilot:local .
```

Multi-architecture build (amd64 + arm64):

```bash
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --build-arg VERSION=$(git describe --tags --always) \
  -t ghcr.io/anthropics/pilot:latest \
  --push .
```

Image Contents

The runtime image is based on Ubuntu 22.04 (not Alpine — Claude Code requires Node.js and system libraries that Alpine cannot provide):

| Component | Version | Purpose |
|---|---|---|
| Pilot binary | release tag | Main process |
| Claude Code CLI | latest | AI execution backend |
| Git + gh CLI | system | Repository operations |
| Node.js + npm | system | Claude Code runtime |

The binary runs as user pilot (UID 1000). The container exposes port 9090 for the gateway HTTP server.

Do not override USER in your Compose or Helm values. Running Pilot as root is unsupported and disables non-root security policies.


Docker Compose

Minimal Setup

The docker-compose.yml in the project root is ready to use:

```yaml
services:
  pilot:
    build:
      context: .
      args:
        VERSION: ${VERSION:-dev}
        BUILD_TIME: ${BUILD_TIME:-}
    image: pilot:${VERSION:-dev}
    ports:
      - "9090:9090"
    volumes:
      # Persistent SQLite data — required across restarts
      - pilot-data:/home/pilot/.pilot/data
      # Mount your config file
      - ./config.yaml:/home/pilot/.pilot/config.yaml:ro
    environment:
      - ANTHROPIC_API_KEY
      - GITHUB_TOKEN
    command: ["start", "--github", "--autopilot=dev"]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:9090/health"]
      interval: 30s
      timeout: 5s
      start_period: 15s
      retries: 3

volumes:
  pilot-data:
```

Full Setup with All Adapters

For production use with Telegram, Slack, and multiple adapters:

```yaml
services:
  pilot:
    image: ghcr.io/anthropics/pilot:v2.56.0
    ports:
      - "9090:9090"
    environment:
      # Required
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - GITHUB_TOKEN=${GITHUB_TOKEN}
      # Optional adapters
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
      - SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN}
      - LINEAR_API_KEY=${LINEAR_API_KEY}
      - JIRA_API_TOKEN=${JIRA_API_TOKEN}
      # Optional LLM features
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - pilot-data:/home/pilot/.pilot/data
      - ./config.yaml:/home/pilot/.pilot/config.yaml:ro
      - ./gitconfig:/home/pilot/.gitconfig:ro  # optional: git identity
    command: ["start", "--github", "--telegram", "--autopilot=stage"]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:9090/health"]
      interval: 30s
      timeout: 5s
      start_period: 15s
      retries: 3

volumes:
  pilot-data:
    driver: local
```

Store secrets in a .env file (never commit it):

```bash
# .env
ANTHROPIC_API_KEY=sk-ant-...
GITHUB_TOKEN=ghp_...
TELEGRAM_BOT_TOKEN=...
SLACK_BOT_TOKEN=xoxb-...
```

Common Commands

```bash
# Start in background
docker compose up -d

# Follow logs
docker compose logs -f

# Restart after config change
docker compose restart pilot

# Stop and remove containers (data volume preserved)
docker compose down

# Full teardown including data volume
docker compose down -v
```

Helm Chart Installation

The Helm chart is included in the repository at helm/pilot/. It deploys a single-replica Deployment, Service, ConfigMap, Secret, and PVC.

Prerequisites

```bash
# Add helm repository (if published) or clone the repo
git clone https://github.com/anthropics/pilot
cd pilot
```

Install

```bash
helm install pilot ./helm/pilot \
  --set secrets.githubToken="ghp_..." \
  --set secrets.anthropicApiKey="sk-ant-..." \
  --set config.adapters.github.repo="your-org/your-repo"
```

```bash
# Create secrets separately (recommended)
kubectl create secret generic pilot-secrets \
  --from-literal=github-token="ghp_..." \
  --from-literal=anthropic-api-key="sk-ant-..."

# Install referencing existing secret
helm install pilot ./helm/pilot \
  --set existingSecret=pilot-secrets \
  --set config.adapters.github.repo="your-org/your-repo"
```

```bash
helm install pilot ./helm/pilot \
  --namespace pilot --create-namespace \
  --values values.production.yaml \
  --set secrets.githubToken="ghp_..." \
  --set secrets.anthropicApiKey="sk-ant-..."
```

Upgrade

```bash
helm upgrade pilot ./helm/pilot --reuse-values \
  --set image.tag=v2.56.0
```

values.yaml Reference

```yaml
# Image
image:
  repository: ghcr.io/anthropics/pilot
  tag: v2.56.0  # pin to a specific version in production
  pullPolicy: IfNotPresent

# Replica count — always 1 (SQLite constraint)
replicaCount: 1

# Deployment strategy — Recreate ensures clean shutdown before pod restart
strategy:
  type: Recreate

# Service
service:
  type: ClusterIP
  port: 9090

# Ingress — enable for webhook reception
ingress:
  enabled: false
  className: nginx
  host: pilot.example.com
  tls: true

# Resource requests and limits
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "1Gi"
    cpu: "1000m"

# Persistence — required for SQLite state
persistence:
  enabled: true
  size: 1Gi
  storageClass: ""  # use cluster default
  accessMode: ReadWriteOnce

# Pilot config — rendered into a ConfigMap
config:
  gateway:
    host: "0.0.0.0"  # required in container: listen on all interfaces
    port: 9090
  adapters:
    github:
      enabled: true
      repo: "your-org/your-repo"
  autopilot:
    enabled: true
    auto_merge: false

# Secrets — injected as env vars
secrets:
  githubToken: ""
  anthropicApiKey: ""
  telegramBotToken: ""
  slackBotToken: ""

# Reference an existing Kubernetes Secret instead of creating one
existingSecret: ""

# Prometheus ServiceMonitor
serviceMonitor:
  enabled: false
  interval: 30s

# Pod security context
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000
```

Override Examples

```bash
# Change image tag
helm upgrade pilot ./helm/pilot --set image.tag=v2.56.0

# Enable ingress
helm upgrade pilot ./helm/pilot \
  --set ingress.enabled=true \
  --set ingress.host=pilot.mycompany.com

# Scale up persistence
helm upgrade pilot ./helm/pilot --set persistence.size=5Gi

# Enable Prometheus ServiceMonitor
helm upgrade pilot ./helm/pilot --set serviceMonitor.enabled=true
```

Configuration

config.yaml in Container

Mount your config.yaml as a read-only volume. In Docker Compose:

```yaml
volumes:
  - ./config.yaml:/home/pilot/.pilot/config.yaml:ro
```

In Kubernetes (via ConfigMap):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pilot-config
data:
  config.yaml: |
    version: "1.0"
    gateway:
      host: "0.0.0.0"  # must be 0.0.0.0, not 127.0.0.1
      port: 9090
    adapters:
      github:
        enabled: true
        token: "${GITHUB_TOKEN}"
        repo: "your-org/your-repo"
        project_path: "/workspace"
    autopilot:
      enabled: true
      auto_merge: true
```

Gateway host must be 0.0.0.0 in containers. The default 127.0.0.1 binds to loopback only — health checks and ingress traffic will not reach the process.

Environment Variables

All sensitive values should be injected as environment variables rather than embedded in config.yaml:

| Variable | Description |
|---|---|
| `ANTHROPIC_API_KEY` | Claude API key for execution |
| `GITHUB_TOKEN` | GitHub PAT with repo + workflow scopes |
| `TELEGRAM_BOT_TOKEN` | Telegram bot token |
| `SLACK_BOT_TOKEN` | Slack bot token |
| `LINEAR_API_KEY` | Linear API key |
| `JIRA_API_TOKEN` | Jira API token |
| `OPENAI_API_KEY` | OpenAI key for voice transcription |

Reference them in config.yaml using ${VAR_NAME} syntax:

```yaml
adapters:
  github:
    token: "${GITHUB_TOKEN}"
```

Git Identity

Pilot creates commits when implementing tasks. Configure git identity either in config.yaml or by mounting a .gitconfig:

```yaml
# docker-compose.yml
volumes:
  - ./gitconfig:/home/pilot/.gitconfig:ro
```

```ini
# gitconfig
[user]
    name = Pilot Bot
    email = pilot@yourcompany.com
```

Persistence

SQLite Volume

Pilot uses SQLite for all state: task queue, execution history, memory, and autopilot state. Without a persistent volume, all state is lost on restart.

Docker Compose — named volume:

```yaml
volumes:
  - pilot-data:/home/pilot/.pilot/data
```

Kubernetes — PersistentVolumeClaim:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pilot-data
spec:
  accessModes:
    - ReadWriteOnce  # SQLite requires single-writer access
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard
```

ReadWriteOnce means the PVC can only be mounted by a single node at a time. This is correct for Pilot — do not use ReadWriteMany.

Single-Replica Constraint

Pilot is designed for single-instance operation. Running multiple replicas causes:

  • SQLite write lock contention (WAL mode helps but does not eliminate conflicts)
  • Duplicate task processing (both replicas pick the same issue)
  • Split-brain autopilot state

Always use replicas: 1 and strategy: Recreate:

```yaml
spec:
  replicas: 1
  strategy:
    type: Recreate  # ensures old pod terminates before new one starts
```

Do not configure HPA or KEDA for scale-out.

Backup Strategy

Back up the SQLite database file at /home/pilot/.pilot/data/pilot.db:

```bash
# Manual backup
kubectl exec deploy/pilot -- \
  sqlite3 /home/pilot/.pilot/data/pilot.db ".backup '/tmp/pilot-backup.db'"
kubectl cp pilot-pod:/tmp/pilot-backup.db ./pilot-backup-$(date +%Y%m%d).db
```

CronJob backup to object storage (example with AWS S3):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pilot-db-backup
spec:
  schedule: "0 2 * * *"  # 2 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              # note: the backup image must provide both sqlite3 and the aws CLI;
              # stock amazon/aws-cli does not ship sqlite3
              image: amazon/aws-cli
              command:
                - /bin/sh
                - -c
                - |
                  sqlite3 /data/pilot.db ".backup '/tmp/backup.db'" && \
                  aws s3 cp /tmp/backup.db s3://your-bucket/pilot/pilot-$(date +%Y%m%d).db
              volumeMounts:
                - name: data
                  mountPath: /data
                  readOnly: true
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: pilot-data
          restartPolicy: OnFailure
```
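Before relying on a backup copy, it helps to verify it is readable; a sketch using the sqlite3 CLI, where `pilot-backup.db` is a placeholder for your dated backup file:

```shell
# Verify a backup copy before trusting it (assumes sqlite3 is installed locally)
sqlite3 pilot-backup.db "PRAGMA integrity_check;"             # prints "ok" if the file is intact
sqlite3 pilot-backup.db "SELECT count(*) FROM sqlite_master;" # non-zero means the schema survived
```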

Monitoring

Prometheus Metrics

Pilot exposes Prometheus metrics at GET /metrics. Enable scraping:

Prometheus scrape_configs:

```yaml
scrape_configs:
  - job_name: 'pilot'
    static_configs:
      - targets: ['pilot:9090']
    metrics_path: /metrics
    scrape_interval: 30s
```

Kubernetes ServiceMonitor (requires Prometheus Operator):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: pilot
  labels:
    release: prometheus  # match your Prometheus Operator release label
spec:
  selector:
    matchLabels:
      app: pilot
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
```

Enable via Helm:

```bash
helm upgrade pilot ./helm/pilot --set serviceMonitor.enabled=true
```

Key Metrics

| Metric | Type | Description |
|---|---|---|
| `pilot_issues_processed_total` | Counter | Issues processed by result |
| `pilot_prs_merged_total` | Counter | PRs successfully merged |
| `pilot_queue_depth` | Gauge | Issues waiting in queue |
| `pilot_success_rate` | Gauge | Rolling success rate (0–1) |
| `pilot_execution_duration_seconds` | Histogram | Task execution duration |

Grafana Dashboard

Suggested panels for a Pilot dashboard:

```promql
# Issue throughput
rate(pilot_issues_processed_total[5m])

# Success rate (alert if < 0.9)
pilot_success_rate

# Queue depth
pilot_queue_depth

# P95 execution time
histogram_quantile(0.95, rate(pilot_execution_duration_seconds_bucket[5m]))
```
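The "alert if < 0.9" note can be expressed as an alerting rule; a sketch assuming the Prometheus Operator's PrometheusRule CRD is installed (rule name, threshold, and duration are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pilot-alerts
spec:
  groups:
    - name: pilot
      rules:
        - alert: PilotSuccessRateLow
          expr: pilot_success_rate < 0.9
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Pilot success rate below 90% for 15 minutes"
```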

See Monitoring for the full metrics reference and alerting rules.


Ingress

Configure ingress to receive webhooks from GitHub, Linear, and Jira:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pilot
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "1m"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - pilot.example.com
      secretName: pilot-tls
  rules:
    - host: pilot.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pilot
                port:
                  number: 9090
```

Enable via Helm values:

```yaml
ingress:
  enabled: true
  className: nginx
  host: pilot.example.com
  tls: true
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
```

Webhook URLs

After ingress is configured, set these webhook URLs in each service:

| Service | Webhook URL | Events |
|---|---|---|
| GitHub | https://pilot.example.com/webhooks/github | Issues, Pull requests |
| Linear | https://pilot.example.com/webhooks/linear | Issues |
| Jira | https://pilot.example.com/webhooks/jira | Issues |
| GitLab | https://pilot.example.com/webhooks/gitlab | Issues, Merge requests |

Set a webhook secret in your config for HMAC verification:

```yaml
adapters:
  github:
    webhook_secret: "${GITHUB_WEBHOOK_SECRET}"
```
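When debugging signature failures, you can recompute the HMAC yourself: GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header. A sketch (the secret and payload here are placeholders):

```shell
SECRET="your-webhook-secret"                    # must match the configured webhook_secret
printf '%s' '{"zen":"example"}' > payload.json  # stand-in for a saved raw delivery body
echo "sha256=$(openssl dgst -sha256 -hmac "$SECRET" < payload.json | awk '{print $2}')"
# Compare the printed value with the X-Hub-Signature-256 header of the delivery
```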

Without ingress, Pilot falls back to polling (every 30s by default). Polling works but adds latency compared to instant webhook delivery.


Security

Non-Root Execution

The official image runs as pilot (UID 1000). The Helm chart enforces this via podSecurityContext:

```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: false  # Claude Code writes temp files
  capabilities:
    drop: ["ALL"]
```

readOnlyRootFilesystem: true is not supported — Claude Code and git write temporary files during task execution.

Network Policies

Restrict Pilot’s network access to only required egress:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: pilot-egress
spec:
  podSelector:
    matchLabels:
      app: pilot
  policyTypes:
    - Egress
  egress:
    # GitHub API
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0  # GitHub uses many IPs; restrict further if you have a static proxy
      ports:
        - protocol: TCP
          port: 443
    # Anthropic API
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
    # DNS
    - ports:
        - protocol: UDP
          port: 53
```

Secret Management

Option 1: External Secrets Operator (recommended for production)

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: pilot-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: pilot-secrets
  data:
    - secretKey: github-token
      remoteRef:
        key: pilot/github
        property: token
    - secretKey: anthropic-api-key
      remoteRef:
        key: pilot/anthropic
        property: api_key
```

Option 2: Sealed Secrets

```bash
# Encrypt with kubeseal
kubectl create secret generic pilot-secrets \
  --from-literal=github-token="ghp_..." \
  --from-literal=anthropic-api-key="sk-ant-..." \
  --dry-run=client -o yaml \
  | kubeseal --format yaml > sealed-pilot-secrets.yaml

# Apply sealed secret (safe to commit)
kubectl apply -f sealed-pilot-secrets.yaml
```

Reference from Helm:

```bash
helm install pilot ./helm/pilot --set existingSecret=pilot-secrets
```

Troubleshooting

Gateway unreachable: bound to loopback

Pilot’s gateway binds to host:port from config. In containers, the default 127.0.0.1 only accepts loopback traffic — health checks from the kubelet will fail.

Fix: Set gateway.host: "0.0.0.0" in config.yaml.

```yaml
gateway:
  host: "0.0.0.0"
  port: 9090
```

SQLite database is locked

Cause: Multiple processes attempting to write simultaneously, or a previous process did not release the lock cleanly.

Fix:

  1. Ensure replicas: 1 and strategy: Recreate — the old pod must terminate before the new one starts.
  2. If the lock persists, restart the pod: kubectl rollout restart deploy/pilot.
  3. For data integrity, restore from a backup rather than deleting the lock file.
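Before restoring, you can check whether the database file itself is still intact; a sketch using the sqlite3 CLI, run while Pilot is stopped (the `PILOT_DB` default is a placeholder — in the official image the file lives at /home/pilot/.pilot/data/pilot.db):

```shell
DB="${PILOT_DB:-pilot.db}"               # set to the real data path when run in the container
sqlite3 "$DB" "PRAGMA integrity_check;"  # "ok" means the file is sound
sqlite3 "$DB" "PRAGMA journal_mode;"     # Pilot uses WAL mode, so expect "wal"
```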

Claude Code CLI not found

The official image includes Claude Code. This error typically means you are using a custom image or mounting a binary that does not include it.

Verify:

```bash
docker exec pilot claude --version

# or in Kubernetes:
kubectl exec deploy/pilot -- claude --version
```

Fix: Use the official image ghcr.io/anthropics/pilot or add to your Dockerfile:

```dockerfile
RUN npm install -g @anthropic-ai/claude-code
```

Health check fails at startup

Pilot takes 10–15 seconds to start up (Claude Code + git initialization). The Dockerfile and Helm chart both configure start_period: 15s / initialDelaySeconds: 15 to avoid false failures.
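The equivalent Kubernetes probes would look roughly like this (a sketch mirroring the Compose healthcheck; the Helm chart renders its own probes, so treat these values as illustrative):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 9090
  initialDelaySeconds: 15
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /ready
    port: 9090
  initialDelaySeconds: 15
  periodSeconds: 10
```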

If health checks fail beyond startup:

```bash
# Check logs
kubectl logs deploy/pilot --tail=50

# Check the health endpoint directly
kubectl port-forward svc/pilot 9090:9090
curl http://localhost:9090/health
curl http://localhost:9090/ready
```

Webhook deliveries not received

  1. Verify ingress is configured and DNS resolves: curl https://pilot.example.com/health
  2. Check the webhook secret matches on both sides
  3. Confirm GitHub/Linear/Jira webhook logs show 200 responses
  4. Check Pilot logs: kubectl logs deploy/pilot | grep webhook

Without ingress, switch to polling:

```yaml
adapters:
  github:
    polling:
      enabled: true
      interval: 30s
```