
Architecture & Execution Flow

Pilot is a Go-based autonomous development pipeline. This page covers the system architecture, the full execution flow from issue to merged PR, component interactions, and the autopilot state machine.

High-Level Architecture

                     ┌─────────────────────────────────────────┐
                     │             CLI (cmd/pilot)             │
                     │ start | task | github | telegram | ...  │
                     └─────────────────┬───────────────────────┘
                                       │
                     ┌─────────────────▼───────────────────────┐
                     │             internal/pilot              │
                     │ Orchestration + Component Coordination  │
                     └──────────┬─────────────────┬────────────┘
                                │                 │
         ┌──────────────────────┼─────────────────┼───────────────────────┐
         │                      │                 │                       │
┌────────▼────────┐  ┌──────────▼──────────┐   ┌──▼─────────────┐  ┌──────▼──────┐
│    Adapters     │  │      Executor       │   │     Memory     │  │   Gateway   │
│ telegram/github │  │ Claude Code Runner  │   │ SQLite + Graph │  │  HTTP + WS  │
│ linear/jira/    │  │ Progress Display    │   │ Patterns Store │  │  Webhooks   │
│ slack/discord/  │  │ Git Operations      │   └────────────────┘  └─────────────┘
│ plane/gitlab    │  │ Quality Gates       │
└─────────────────┘  │ Alerts Integration  │
                     └─────────────────────┘

37 packages across four layers:

| Layer | Packages | Role |
| --- | --- | --- |
| Core | pilot, executor, config, memory, logging, alerts, quality, dashboard, briefs, replay, upgrade | Execution engine, state, observability |
| Adapters | telegram, github, slack, linear, jira, discord, plane, gitlab, azure-devops, asana | Input sources and notification channels |
| Supporting | gateway, orchestrator, approval, budget, teams, tunnel, webhooks, health | Infrastructure, cost control, permissions |
| Test | testutil | Fake tokens, test helpers |

Execution Flow: Issue → PR

This is the primary data flow: the path a GitHub issue takes from creation to merged PR.

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Telegram   │     │ GitHub Issue │     │ Linear/Jira  │
│   Message    │     │ (label:pilot)│     │   Webhook    │
└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
       │                    │                    │
       └────────────────────┼────────────────────┘
                            │
                            ▼
                   ┌─────────────────┐
                   │   Dispatcher    │  Per-project queue
                   │                 │  Sequential by default
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  Model Routing  │  Classify complexity:
                   │                 │    trivial → Haiku
                   │                 │    simple/medium → Sonnet 4.6
                   │                 │    complex → Opus 4.6
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │ Epic Detection  │  Is this too big for one PR?
                   │                 │    Yes → decompose into 3-5 subtasks
                   │                 │    No → continue
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │    Executor     │  Claude Code subprocess
                   │ runner.Execute  │  stream-json output parsing
                   │                 │  Progress tracking + alerts
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  Quality Gates  │  Build → Test → Lint → Coverage
                   │                 │  Failed? → retry with error feedback
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │   Self-Review   │  Claude reviews its own diff
                   │                 │  May push additional commits
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  Git Push + PR  │  Branch: pilot/GH-{number}
                   │    Creation     │  PR linked to issue
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │    Autopilot    │  CI monitoring → merge → release
                   │  State Machine  │  (see Autopilot section below)
                   └─────────────────┘

Step-by-Step Walkthrough

1. Input Ingestion

Issues enter Pilot through five channels:

  • GitHub Poller: Polls every 30 seconds for open issues labeled pilot. The primary production intake.
  • Telegram Bot: Messages classified by intent (task, question, research, plan, chat). Tasks become issues.
  • Discord Bot: Listens for commands in configured channels. Tasks are created as GitHub issues.
  • Plane Poller: Polls Plane project boards for issues assigned to the Pilot actor.
  • Webhooks: Linear, Jira, and GitHub webhook events via the Gateway HTTP server.

2. Dispatcher Serialization

The Dispatcher (internal/executor/dispatcher.go) maintains a per-project task queue. In sequential mode (default), only one task runs per project at a time. This prevents merge conflicts and ensures each PR is based on the latest main.

    // Per-project serialization prevents conflicts
    dispatcher.Enqueue(projectPath, task)
    // Next task starts only after current PR merges
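
The per-project serialization described above can be sketched as one FIFO queue and one worker goroutine per project path. This is a minimal illustration, not the contents of internal/executor/dispatcher.go: the `Task` and `Dispatcher` types, the `run` callback, the `Close` method, and the buffer size of 64 are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical stand-in for Pilot's task type.
type Task struct {
	ID string
}

// Dispatcher keeps one FIFO queue per project path (illustrative only).
type Dispatcher struct {
	mu     sync.Mutex
	queues map[string]chan Task
	wg     sync.WaitGroup
	run    func(projectPath string, t Task)
}

// NewDispatcher builds a dispatcher that calls run for every dequeued task.
func NewDispatcher(run func(string, Task)) *Dispatcher {
	return &Dispatcher{queues: make(map[string]chan Task), run: run}
}

// Enqueue appends a task to its project's queue and starts that project's
// single worker on first use. Must not be called after Close.
func (d *Dispatcher) Enqueue(projectPath string, t Task) {
	d.mu.Lock()
	q, ok := d.queues[projectPath]
	if !ok {
		q = make(chan Task, 64) // assumed buffer size
		d.queues[projectPath] = q
		d.wg.Add(1)
		go func() {
			defer d.wg.Done()
			for task := range q { // one task at a time per project
				d.run(projectPath, task)
			}
		}()
	}
	d.mu.Unlock()
	q <- t
}

// Close stops accepting tasks and waits for every queue to drain.
func (d *Dispatcher) Close() {
	d.mu.Lock()
	for _, q := range d.queues {
		close(q)
	}
	d.mu.Unlock()
	d.wg.Wait()
}

func main() {
	d := NewDispatcher(func(p string, t Task) { fmt.Println(p, t.ID) })
	d.Enqueue("/repo/a", Task{ID: "GH-1"})
	d.Enqueue("/repo/a", Task{ID: "GH-2"})
	d.Close()
}
```

Tasks for different projects run concurrently in this sketch, while tasks for the same project never overlap. Waiting for the previous PR to merge before starting the next task would live inside `run`.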

3. Complexity Detection & Model Routing

Before execution, Pilot classifies the task:

| Complexity | Model | Timeout | Cost Range |
| --- | --- | --- | --- |
| Trivial | Haiku | 5m | ~$0.02 |
| Simple | Sonnet 4.6 | 15m | ~$0.20 |
| Medium | Sonnet 4.6 | 30m | ~$0.75 |
| Complex | Opus 4.6 | 60m | ~$3.00 |

Trivial tasks skip context intelligence entirely. Simple through complex tasks get full codebase context.
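
As a rough sketch, the routing table can be expressed as a lookup from complexity label to model, timeout, and a context flag. The `Complexity`, `Route`, and `RouteFor` names are hypothetical; the model IDs are the ones shown in the sample configuration later on this page, and the timeouts come from the table above.

```go
package main

import (
	"fmt"
	"time"
)

// Complexity is the classifier's output label (hypothetical type).
type Complexity string

const (
	Trivial Complexity = "trivial"
	Simple  Complexity = "simple"
	Medium  Complexity = "medium"
	Complex Complexity = "complex"
)

// Route bundles the per-complexity settings from the routing table.
type Route struct {
	Model       string        // model ID from the sample config
	Timeout     time.Duration // execution timeout
	FullContext bool          // false: skip context intelligence
}

var routes = map[Complexity]Route{
	Trivial: {Model: "claude-haiku-4-5-20251001", Timeout: 5 * time.Minute, FullContext: false},
	Simple:  {Model: "claude-sonnet-4-6", Timeout: 15 * time.Minute, FullContext: true},
	Medium:  {Model: "claude-sonnet-4-6", Timeout: 30 * time.Minute, FullContext: true},
	Complex: {Model: "claude-opus-4-6", Timeout: 60 * time.Minute, FullContext: true},
}

// RouteFor returns the route for a complexity label, or an error if unknown.
func RouteFor(c Complexity) (Route, error) {
	r, ok := routes[c]
	if !ok {
		return Route{}, fmt.Errorf("unknown complexity %q", c)
	}
	return r, nil
}

func main() {
	for _, c := range []Complexity{Trivial, Simple, Medium, Complex} {
		r, _ := RouteFor(c)
		fmt.Printf("%-8s %-26s %v\n", c, r.Model, r.Timeout)
	}
}
```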

4. Epic Decomposition

If the task is too large for a single PR, the epic detector triggers. Detection uses structural signals (phase headers, checkboxes, word count above 100), not just keyword matching.

When triggered, Pilot:

  1. Sends the issue to Claude for planning
  2. Parses the response into 3–5 sequential subtasks
  3. Creates child issues, each labeled pilot
  4. Marks the parent issue as pilot-in-progress
  5. Leaves the subtasks for the poller to pick up sequentially
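
A structural detector along these lines might look like the sketch below. Only the three signal kinds (phase headers, checkboxes, word count above 100) come from the text; the regular expressions, the `Signals` type, and the rule combining the signals (at least two phase headers or four checkboxes, plus the word-count floor) are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Markdown headings such as "## Phase 1" (assumed header shape).
	phaseHeader = regexp.MustCompile(`(?im)^#{1,6}\s*(phase|step|stage)\s+\d+`)
	// Task-list items such as "- [ ] item" or "* [x] item".
	checkbox = regexp.MustCompile(`(?m)^\s*[-*]\s+\[[ xX]\]`)
)

// Signals holds the structural counts extracted from an issue body.
type Signals struct {
	PhaseHeaders int
	Checkboxes   int
	Words        int
}

// Extract counts the structural signals in an issue body.
func Extract(body string) Signals {
	return Signals{
		PhaseHeaders: len(phaseHeader.FindAllString(body, -1)),
		Checkboxes:   len(checkbox.FindAllString(body, -1)),
		Words:        len(strings.Fields(body)),
	}
}

// IsEpic applies a hypothetical rule: the issue must be structured
// (several phases or many checkboxes) and longer than 100 words.
func IsEpic(s Signals) bool {
	structured := s.PhaseHeaders >= 2 || s.Checkboxes >= 4
	return structured && s.Words > 100
}

func main() {
	body := "## Phase 1\n- [ ] schema\n- [ ] API\n## Phase 2\n- [ ] UI\n- [ ] docs\n"
	s := Extract(body)
	fmt.Printf("%+v epic=%v\n", s, IsEpic(s))
}
```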

5. Execution (Claude Code Subprocess)

The executor spawns Claude Code as a subprocess:

    cmd := exec.Command("claude",
        "-p", prompt,
        "--verbose",
        "--output-format", "stream-json",
        "--dangerously-skip-permissions",
    )

Output is parsed as stream-json events:

  • system: initialization, session info
  • assistant: Claude's text responses
  • tool_use: tool invocations (Read, Write, Bash, Grep, etc.)
  • tool_result: tool outputs
  • result: final outcome with token counts
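
Since stream-json emits one JSON object per line, a reader can scan line by line and dispatch on the type field. The sketch below decodes only that discriminator; the `Event` struct, `ReadEvents`, and `CountTypes` are hypothetical names, the 8 MiB line limit is an assumption, and the sample input deliberately carries no fields beyond `type` because the full payload schema is not described here.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Event captures only the discriminator; other fields vary by event type.
type Event struct {
	Type string `json:"type"`
}

// ReadEvents decodes one JSON object per line and passes each to handle.
func ReadEvents(r io.Reader, handle func(Event)) error {
	sc := bufio.NewScanner(r)
	// Raise the default 64 KiB line limit; tool output lines can be long.
	sc.Buffer(make([]byte, 0, 64*1024), 8*1024*1024)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var ev Event
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			return fmt.Errorf("decode event: %w", err)
		}
		handle(ev)
	}
	return sc.Err()
}

// CountTypes tallies events by type, e.g. for a progress summary.
func CountTypes(stream string) (map[string]int, error) {
	counts := make(map[string]int)
	err := ReadEvents(strings.NewReader(stream), func(ev Event) { counts[ev.Type]++ })
	return counts, err
}

func main() {
	stream := `{"type":"system"}
{"type":"assistant"}
{"type":"tool_use"}
{"type":"tool_result"}
{"type":"result"}`
	counts, err := CountTypes(stream)
	fmt.Println(counts, err)
}
```

In a real runner the reader would be the subprocess's stdout pipe rather than a string.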

Progress is tracked through phase detection keywords in the output stream:

    Navigator Session Started  → navigator
    TASK MODE ACTIVATED        → task-mode
    PHASE: → RESEARCH          → research
    PHASE: → IMPL              → implementing
    PHASE: → VERIFY            → verifying
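
Keyword-based phase detection of this kind reduces to a substring lookup over each line of output. The following is a sketch under that assumption; `phaseSignals` and `DetectPhase` are illustrative names, and the real detector may match differently (for example on parsed events rather than raw lines).

```go
package main

import (
	"fmt"
	"strings"
)

// phaseSignals maps output markers to phase names, in the order listed above.
var phaseSignals = []struct{ Signal, Phase string }{
	{"Navigator Session Started", "navigator"},
	{"TASK MODE ACTIVATED", "task-mode"},
	{"PHASE: → RESEARCH", "research"},
	{"PHASE: → IMPL", "implementing"},
	{"PHASE: → VERIFY", "verifying"},
}

// DetectPhase returns the phase signalled by a line of output, if any.
func DetectPhase(line string) (string, bool) {
	for _, s := range phaseSignals {
		if strings.Contains(line, s.Signal) {
			return s.Phase, true
		}
	}
	return "", false
}

func main() {
	for _, line := range []string{
		"Navigator Session Started",
		"reading files...",
		"PHASE: → IMPL",
	} {
		if phase, ok := DetectPhase(line); ok {
			fmt.Println("phase:", phase)
		}
	}
}
```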

6. Quality Gates

After Claude Code completes, quality gates run in sequence:

| Gate | Default Timeout | Purpose |
| --- | --- | --- |
| build | 5m | Compilation check |
| test | 10m | Unit/integration tests |
| lint | 5m | Code style enforcement |
| coverage | 5m | Coverage threshold |
| security | 5m | Security scanning |
| typecheck | 5m | Type checking |

Each gate supports retries with configurable delay. On failure, the error output is fed back to Claude for automatic correction. Gate detection is automatic: Pilot infers build commands from the project type (Go, Node, Rust, Python).
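
The sequence-with-retry behavior can be sketched as a loop over gates, each run under its own timeout, with a callback that receives failures so they can be fed back to the agent. Everything named here (`Gate`, `RunGates`, `maxRetries`, `onFail`, and the `FlakyGate` demo helper) is hypothetical, and the retry delay mentioned above is omitted for brevity.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Gate is a hypothetical quality gate: a named check with a timeout.
type Gate struct {
	Name    string
	Timeout time.Duration
	Check   func(ctx context.Context) error
}

// RunGates runs gates in order. A failing gate is retried up to maxRetries
// times; onFail receives each error so the caller can feed it back.
func RunGates(gates []Gate, maxRetries int, onFail func(gate string, err error)) error {
	for _, g := range gates {
		var err error
		for attempt := 0; attempt <= maxRetries; attempt++ {
			ctx, cancel := context.WithTimeout(context.Background(), g.Timeout)
			err = g.Check(ctx)
			cancel()
			if err == nil {
				break
			}
			onFail(g.Name, err)
		}
		if err != nil {
			return fmt.Errorf("gate %s failed: %w", g.Name, err)
		}
	}
	return nil
}

// FlakyGate fails its first n checks, then passes (demo helper).
func FlakyGate(name string, n int) Gate {
	return Gate{Name: name, Timeout: time.Minute, Check: func(ctx context.Context) error {
		if n > 0 {
			n--
			return errors.New(name + ": simulated failure")
		}
		return ctx.Err() // nil unless the timeout already expired
	}}
}

func main() {
	gates := []Gate{FlakyGate("build", 0), FlakyGate("test", 1), FlakyGate("lint", 0)}
	err := RunGates(gates, 2, func(gate string, err error) {
		fmt.Println("feedback:", err)
	})
	fmt.Println("result:", err)
}
```

Stopping at the first gate that exhausts its retries keeps later gates (lint, coverage) from running against code that does not build.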

7. Self-Review

Claude reviews its own diff before pushing. This catches issues that quality gates miss: naming inconsistencies, dead code, missing error handling. Self-review may push additional commits, which is why the commit SHA is refreshed from the GitHub API after PR creation.

8. PR Creation

Pilot pushes the branch (pilot/GH-{issue_number}) and creates a PR via the GitHub API. The PR body references the original issue and includes execution metadata.

Autopilot State Machine

Once a PR exists, the autopilot controller manages its lifecycle through 10 stages:

                 ┌─────────────┐
                 │ PR Created  │
                 └──────┬──────┘
                        │
                 ┌──────▼──────┐
                 │ Waiting CI  │ ← Poll every 30s
                 └──────┬──────┘   Refresh HeadSHA from API
                        │
              ┌─────────┴─────────┐
              │                   │
       ┌──────▼──────┐     ┌──────▼──────┐
       │ CI Passed   │     │ CI Failed   │
       └──────┬──────┘     └──────┬──────┘
              │                   │
         ┌────┴────┐         ┌────▼────────────┐
         │         │         │ Feedback Loop   │
         │  env?   │         │ Create fix issue│
         │         │         │ on same branch  │
   ┌─────▼──┐  ┌───▼─────┐   └────────┬────────┘
   │ dev/   │  │ prod    │            │
   │ stage  │  │         │     ┌──────▼────────┐
   └───┬────┘  └───┬─────┘     │ New PR        │
       │           │           │ (same branch) │
       │    ┌──────▼──────┐    └───────────────┘
       │    │ Awaiting    │
       │    │ Approval    │
       │    └──────┬──────┘
       │           │ Human approves
       │           │ (Telegram/Slack/GitHub Review)
       └────┬──────┘
            │
     ┌──────▼──────┐
     │ Merging     │ ← Squash merge (default)
     └──────┬──────┘
            │
     ┌──────▼──────┐
     │ Merged      │
     └──────┬──────┘
            │
     ┌──────▼──────────┐
     │ Post-Merge CI   │ ← Optional: wait for main CI
     └──────┬──────────┘
            │
     ┌──────▼──────┐
     │ Releasing   │ ← Optional: auto-release
     └─────────────┘

Stage Definitions

| Stage | Description | Next |
| --- | --- | --- |
| pr_created | PR exists, entering pipeline | waiting_ci |
| waiting_ci | Polling CI status every 30s | ci_passed or ci_failed |
| ci_passed | All required checks green | merging (dev/stage) or awaiting_approval (prod) |
| ci_failed | CI check failed | Feedback loop → new PR on same branch |
| awaiting_approval | Prod mode: waiting for human | merging on approval |
| merging | Squash merge in progress | merged |
| merged | PR merged to main | post_merge_ci or done |
| post_merge_ci | Waiting for main branch CI | releasing or done |
| releasing | Creating GitHub release | done |
| failed | Terminal state, requires intervention | (none) |
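
One way to encode the stage table is a map of allowed transitions plus a helper for the environment-dependent branch out of ci_passed. This is a sketch, not the controller's actual code: the Go identifiers are hypothetical, `Done` is a pseudo-stage standing in for the table's "done", and routing ci_failed back to pr_created is an assumption about how the fix PR re-enters the pipeline.

```go
package main

import "fmt"

// Stage names mirror the stage table; the Go identifiers are hypothetical.
type Stage string

const (
	PRCreated        Stage = "pr_created"
	WaitingCI        Stage = "waiting_ci"
	CIPassed         Stage = "ci_passed"
	CIFailed         Stage = "ci_failed"
	AwaitingApproval Stage = "awaiting_approval"
	Merging          Stage = "merging"
	Merged           Stage = "merged"
	PostMergeCI      Stage = "post_merge_ci"
	Releasing        Stage = "releasing"
	Failed           Stage = "failed"
	Done             Stage = "done" // pseudo-stage: pipeline finished
)

// next lists the allowed transitions out of each stage.
var next = map[Stage][]Stage{
	PRCreated:        {WaitingCI},
	WaitingCI:        {CIPassed, CIFailed},
	CIPassed:         {Merging, AwaitingApproval},
	CIFailed:         {PRCreated}, // assumption: the fix PR re-enters as pr_created
	AwaitingApproval: {Merging},
	Merging:          {Merged},
	Merged:           {PostMergeCI, Done},
	PostMergeCI:      {Releasing, Done},
	Releasing:        {Done},
	Failed:           {}, // terminal
}

// CanTransition reports whether moving from one stage to another is allowed.
func CanTransition(from, to Stage) bool {
	for _, s := range next[from] {
		if s == to {
			return true
		}
	}
	return false
}

// AfterCIPassed picks the branch taken from ci_passed for an environment.
func AfterCIPassed(env string) Stage {
	if env == "prod" {
		return AwaitingApproval
	}
	return Merging // dev and stage auto-merge
}

func main() {
	fmt.Println(CanTransition(CIPassed, AfterCIPassed("prod")))
	fmt.Println(CanTransition(Merged, WaitingCI))
}
```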

Environment Modes

| Environment | Auto-Merge | Approval | CI Wait | Use Case |
| --- | --- | --- | --- | --- |
| dev | Yes | None | 5m timeout | Personal projects, fast iteration |
| stage | Yes | None | 30m timeout | Team staging, balanced safety |
| prod | No | Required | 30m timeout | Production, maximum control |

Safety Controls

  • Circuit breaker: Pauses after max_failures consecutive failures (default: 3)
  • Merge rate limit: max_merges_per_hour prevents runaway automation (default: 10)
  • Approval timeout: Prod mode times out after 1 hour without approval
  • Budget limits: Per-task and daily spending caps with warn/pause/stop actions
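
The circuit breaker in the list above amounts to a counter of consecutive failures. A minimal sketch, assuming that any success resets the streak and that "pausing" is signalled by a boolean the caller checks before dispatching; the `Breaker` type and its methods are hypothetical names.

```go
package main

import "fmt"

// Breaker trips after maxFailures consecutive failures.
type Breaker struct {
	maxFailures int
	consecutive int
}

// NewBreaker creates a breaker with the given threshold (the default above is 3).
func NewBreaker(maxFailures int) *Breaker {
	return &Breaker{maxFailures: maxFailures}
}

// Record notes a task outcome; any success resets the failure streak.
func (b *Breaker) Record(success bool) {
	if success {
		b.consecutive = 0
		return
	}
	b.consecutive++
}

// Open reports whether the pipeline should pause.
func (b *Breaker) Open() bool {
	return b.consecutive >= b.maxFailures
}

func main() {
	b := NewBreaker(3)
	for _, ok := range []bool{false, false, true, false, false, false} {
		b.Record(ok)
		fmt.Printf("success=%v open=%v\n", ok, b.Open())
	}
}
```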

Component Interaction Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                          cmd/pilot/main.go                          │
│  Parses CLI flags, loads config, wires components, starts adapters  │
└─────┬─────────────┬──────────────┬───────────────┬──────────────────┘
      │             │              │               │
      ▼             ▼              ▼               ▼
┌──────────┐  ┌──────────┐  ┌────────────┐  ┌──────────────┐
│ GitHub   │  │ Telegram │  │  Executor  │  │  Autopilot   │
│ Poller   │  │ Bot      │  │            │  │  Controller  │
│          │  │          │  │ ┌────────┐ │  │              │
│ Issues → │  │ Msgs →   │  │ │Runner  │ │  │ PR lifecycle │
│ Executor │  │ Executor │  │ │Backend │ │  │ CI monitor   │
│          │  │          │  │ │Quality │ │  │ Auto-merger  │
└──────────┘  └──────────┘  │ │Alerts  │ │  │ Feedback loop│
┌──────────┐  ┌──────────┐  │ └────────┘ │  └──────┬───────┘
│ Discord  │  │ Plane    │  └──────┬─────┘         │
│ Bot      │  │ Poller   │         │               │
│          │  │          │         │               │
│ Cmds →   │  │ Issues → │         │               │
│ Executor │  │ Executor │         │               │
└──────────┘  └──────────┘         │               │
                                   │               │
                      ┌────────────┼───────────────┘
                      │            │
                      ▼            ▼
                ┌──────────┐  ┌──────────┐
                │ Memory   │  │ Alerts   │
                │ (SQLite) │  │ Engine   │
                │          │  │          │
                │ Patterns │  │ Slack    │
                │ History  │  │ Telegram │
                └──────────┘  └──────────┘

Data Flow Through Key Structs

Task (input)                    ExecutionResult (output)
├─ Title, Body                  ├─ Success bool
├─ ProjectPath                  ├─ CommitSHA
├─ BranchName                   ├─ BranchName
├─ IssueNumber                  ├─ PRNumber, PRURL
├─ Images []string              ├─ TokensIn, TokensOut
└─ Labels []string              ├─ Cost
                                └─ Duration
        │                               │
        │       runner.Execute()        │
        └──────────────────────────────▶│
                                        │
                                        ▼
                                PRState (autopilot)
                                ├─ PRNumber
                                ├─ HeadSHA (refreshed from API)
                                ├─ BranchName
                                ├─ Stage (PRStage enum)
                                ├─ CIStatus
                                └─ MergeAttempts

Context Intelligence

Pilot's context intelligence layer provides structured codebase context. When Pilot detects a .agent/ directory in the project, it activates the context engine by injecting a session initialization command into the prompt.

Pilot Prompt Construction:
┌────────────────────────┐
│ System instructions    │ ← Model routing, task constraints
├────────────────────────┤
│ Issue title + body     │ ← The actual work to do
├────────────────────────┤
│ "Start my Navigator    │ ← Activates context engine
│  session."             │
├────────────────────────┤
│ Branch + PR metadata   │ ← Git context
├────────────────────────┤
│ Budget constraints     │ ← Token/cost limits
└────────────────────────┘
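
Prompt assembly in this order could be sketched as joining the sections and adding the Navigator line only when the project has a .agent/ directory. The `PromptInput` struct, `BuildPrompt`, and `HasNavigatorInit` are hypothetical; the real builder and its section separators are not shown on this page. Only the injected sentence and the section order come from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// navigatorInit is the session initialization command quoted above.
const navigatorInit = "Start my Navigator session."

// PromptInput is a hypothetical bundle of the sections in the diagram.
type PromptInput struct {
	System      string
	Title, Body string
	GitContext  string
	Budget      string
	HasAgentDir bool // true when the project contains .agent/
}

// BuildPrompt joins the sections in diagram order, injecting the
// Navigator command only for projects with a .agent/ directory.
func BuildPrompt(in PromptInput) string {
	parts := []string{in.System, in.Title + "\n\n" + in.Body}
	if in.HasAgentDir {
		parts = append(parts, navigatorInit)
	}
	parts = append(parts, in.GitContext, in.Budget)
	return strings.Join(parts, "\n\n")
}

// HasNavigatorInit is a guard a test could use to catch accidental removal.
func HasNavigatorInit(prompt string) bool {
	return strings.Contains(prompt, navigatorInit)
}

func main() {
	p := BuildPrompt(PromptInput{
		System:      "You are working on a single task.",
		Title:       "Add health endpoint",
		Body:        "Expose GET /healthz.",
		GitContext:  "Branch: pilot/GH-42",
		Budget:      "Stay within the task budget.",
		HasAgentDir: true,
	})
	fmt.Println(p)
}
```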

Context engine phases detected in the execution stream:

| Phase | Signal | Description |
| --- | --- | --- |
| navigator | Navigator Session Started | Context engine loaded, docs indexed |
| task-mode | TASK MODE ACTIVATED | Task planning begins |
| research | PHASE: → RESEARCH | Codebase exploration |
| implementing | PHASE: → IMPL | Writing code |
| verifying | PHASE: → VERIFY | Running tests, checking output |
| complete | Completion signal | Task finished |

Context intelligence is Pilot's core value proposition. The prompt injection ("Start my Navigator session.") must never be removed: it is what gives Pilot deep codebase understanding rather than generic code generation.

CI Feedback Loop

When CI fails on a Pilot PR, the feedback loop activates:

CI Failure Detected
           │
           ▼
┌─────────────────────┐
│ Close failed PR     │ ← Unblocks sequential poller
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Create fix issue    │ ← Title: "Fix CI: {original title}"
│ with metadata       │ ← Body: <!-- autopilot-meta branch:pilot/GH-123 -->
│                     │ ← Labels: pilot, autopilot-fix
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Poller picks up     │ ← parseAutopilotBranch() reads metadata
│ fix issue           │ ← Checks out SAME branch, not new one
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Execute fix         │ ← Claude sees CI error logs
│ on original branch  │ ← Pushes fix commits
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ New PR from same    │ ← Same branch, fresh PR
│ branch              │ ← Re-enters autopilot pipeline
└─────────────────────┘

The branch metadata comment (<!-- autopilot-meta branch:X -->) ensures the fix happens on the original branch rather than creating a new one.
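
Reading the branch back out of the issue body is a small regular-expression match. The sketch below is a stand-in for illustration, not Pilot's parseAutopilotBranch(): `BranchFromMeta`, `FixIssueBody`, and the exact whitespace tolerance of the pattern are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// metaRe matches comments like <!-- autopilot-meta branch:pilot/GH-123 -->.
var metaRe = regexp.MustCompile(`<!--\s*autopilot-meta\s+branch:(\S+?)\s*-->`)

// BranchFromMeta extracts the branch name from an issue body, if present.
func BranchFromMeta(body string) (string, bool) {
	m := metaRe.FindStringSubmatch(body)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// FixIssueBody appends the metadata comment to a fix issue's body.
func FixIssueBody(ciLog, branch string) string {
	return fmt.Sprintf("%s\n\n<!-- autopilot-meta branch:%s -->", ciLog, branch)
}

func main() {
	body := FixIssueBody("lint: unused variable in handler.go", "pilot/GH-123")
	branch, ok := BranchFromMeta(body)
	fmt.Println(branch, ok)
}
```

Because the metadata lives in an HTML comment, it stays invisible in the rendered issue while remaining easy to parse.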

Budget Enforcement

Budget limits are enforced at two levels:

| Level | Limits | Action on Exceed |
| --- | --- | --- |
| Per-task | Token count, duration | Stop current execution |
| Daily/Monthly | USD spending cap | Warn at 80%, pause new tasks, or stop all |

Actions escalate through three levels:

  • warn: Notify but continue execution
  • pause: Finish current task, block new ones
  • stop: Terminate immediately

Budget status is checked before each task dispatch and during execution via the alert bridge.
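
The pre-dispatch check can be sketched as a function from current spend to an action. The 80% warning threshold comes from the table above; the assumption that reaching the cap triggers whichever action is configured (pause or stop), and all identifiers (`Action`, `Decide`, `AllowDispatch`), are hypothetical.

```go
package main

import "fmt"

// Action is the budget response level, in escalating order.
type Action int

const (
	None Action = iota
	Warn
	Pause
	Stop
)

func (a Action) String() string {
	return [...]string{"none", "warn", "pause", "stop"}[a]
}

const warnRatio = 0.8 // warn at 80% of the cap

// Decide returns the action for the current spend. onExceed is the
// configured action (Pause or Stop) once the cap is reached.
func Decide(spentUSD, capUSD float64, onExceed Action) Action {
	switch {
	case capUSD <= 0:
		return None // no cap configured
	case spentUSD >= capUSD:
		return onExceed
	case spentUSD >= warnRatio*capUSD:
		return Warn
	default:
		return None
	}
}

// AllowDispatch reports whether a new task may start under the given action.
func AllowDispatch(a Action) bool {
	return a == None || a == Warn
}

func main() {
	for _, spent := range []float64{10, 85, 100} {
		a := Decide(spent, 100, Pause)
		fmt.Printf("spent=$%.0f action=%v dispatch=%v\n", spent, a, AllowDispatch(a))
	}
}
```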

Configuration Hierarchy

    # ~/.pilot/config.yaml
    gateway:          # HTTP + WebSocket server binding
    adapters:         # Input sources (telegram, github, slack, linear, jira, discord, plane)
      github:
        polling:
          interval: 30s
          label: "pilot"
      discord:
        bot_token: ""
        guild_id: ""
        channel_ids: []
      plane:
        api_url: ""
        api_token: ""
        project_board: ""
    orchestrator:     # Execution mode, autopilot config
      execution_mode: sequential   # sequential | parallel | auto
      autopilot:
        environment: stage         # dev | stage | prod
        auto_merge: true
        ci_poll_interval: 30s
        max_failures: 3            # Circuit breaker threshold
    executor:         # Backend, model routing, decomposition
      backend: claude-code
      model_routing:
        trivial: claude-haiku-4-5-20251001
        simple: claude-sonnet-4-6
        medium: claude-sonnet-4-6
        complex: claude-opus-4-6
    memory:           # SQLite persistence
    quality:          # Gate configuration (build, test, lint, coverage)
    alerts:           # Event routing to notification channels
    budget:           # Cost controls and limits
    projects:         # Multi-project configuration

Database Schema

Pilot persists execution history and learned patterns in SQLite (WAL mode for concurrency):

    -- Task execution history
    executions (id, task_id, project_path, status, started_at, completed_at,
                duration_ms, output, error, commit_sha, pr_url)

    -- Cross-project patterns (learned behaviors)
    cross_patterns (id, title, description, type, scope, confidence,
                    occurrences, is_anti_pattern)

    -- Task queue for dispatcher serialization
    task_queue (id, project_path, task_json, status, created_at,
                started_at, completed_at)

What's Next

This architecture supports the following planned documentation pages:

  • Model Routing: Deep dive into complexity classification and model selection
  • Security Model: Sandbox mode, webhook validation, token handling, trust boundaries