Architecture & Execution Flow
Pilot is a Go-based autonomous development pipeline. This page covers the system architecture, the full execution flow from issue to merged PR, component interactions, and the autopilot state machine.
High-Level Architecture
                    ┌─────────────────────────────────────────┐
                    │             CLI (cmd/pilot)             │
                    │ start | task | github | telegram | ...  │
                    └─────────────────┬───────────────────────┘
                                      │
                    ┌─────────────────▼───────────────────────┐
                    │             internal/pilot              │
                    │ Orchestration + Component Coordination  │
                    └──────────┬───────────────┬──────────────┘
                               │               │
         ┌─────────────────────┼───────────────┼──────────────────────┐
         │                     │               │                      │
┌────────▼────────┐ ┌──────────▼──────────┐ ┌──▼─────────────┐ ┌──────▼──────┐
│    Adapters     │ │      Executor       │ │     Memory     │ │   Gateway   │
│ telegram/github │ │ Claude Code Runner  │ │ SQLite + Graph │ │ HTTP + WS   │
│ linear/jira/    │ │ Progress Display    │ │ Patterns Store │ │ Webhooks    │
│ slack/discord/  │ │ Git Operations      │ └────────────────┘ └─────────────┘
│ plane/gitlab    │ │ Quality Gates       │
└─────────────────┘ │ Alerts Integration  │
                    └─────────────────────┘
37 packages across four layers:
| Layer | Packages | Role |
|---|---|---|
| Core | pilot, executor, config, memory, logging, alerts, quality, dashboard, briefs, replay, upgrade | Execution engine, state, observability |
| Adapters | telegram, github, slack, linear, jira, discord, plane, gitlab, azure-devops, asana | Input sources and notification channels |
| Supporting | gateway, orchestrator, approval, budget, teams, tunnel, webhooks, health | Infrastructure, cost control, permissions |
| Test | testutil | Fake tokens, test helpers |
Execution Flow: Issue → PR
This is the primary data flow: the path a GitHub issue takes from creation to merged PR.
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Telegram   │     │ GitHub Issue │     │ Linear/Jira  │
│   Message    │     │ (label:pilot)│     │   Webhook    │
└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
       │                    │                    │
       └────────────────────┼────────────────────┘
                            │
                            ▼
                   ┌─────────────────┐
                   │   Dispatcher    │  Per-project queue
                   │                 │  Sequential by default
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  Model Routing  │  Classify complexity:
                   │                 │  trivial → Haiku
                   │                 │  simple/medium → Sonnet 4.6
                   │                 │  complex → Opus 4.6
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │ Epic Detection  │  Is this too big for one PR?
                   │                 │  Yes → decompose into 3-5 subtasks
                   │                 │  No → continue
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │    Executor     │  Claude Code subprocess
                   │ runner.Execute  │  stream-json output parsing
                   │                 │  Progress tracking + alerts
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  Quality Gates  │  Build → Test → Lint → Coverage
                   │                 │  Failed? → retry with error feedback
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │   Self-Review   │  Claude reviews its own diff
                   │                 │  May push additional commits
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │  Git Push + PR  │  Branch: pilot/GH-{number}
                   │    Creation     │  PR linked to issue
                   └────────┬────────┘
                            │
                   ┌────────▼────────┐
                   │    Autopilot    │  CI monitoring → merge → release
                   │  State Machine  │  (see Autopilot section below)
                   └─────────────────┘
Step-by-Step Walkthrough
1. Input Ingestion
Issues enter Pilot through five channels:
- GitHub Poller: Polls every 30 seconds for open issues labeled pilot. This is the primary production intake.
- Telegram Bot: Messages are classified by intent (task, question, research, plan, chat). Tasks become issues.
- Discord Bot: Listens for commands in configured channels. Tasks are created as GitHub issues.
- Plane Poller: Polls Plane project boards for issues assigned to the Pilot actor.
- Webhooks: Linear, Jira, and GitHub webhook events arrive via the Gateway HTTP server.
2. Dispatcher Serialization
The Dispatcher (internal/executor/dispatcher.go) maintains a per-project task queue. In sequential mode (default), only one task runs per project at a time. This prevents merge conflicts and ensures each PR is based on the latest main.
// Per-project serialization prevents conflicts
dispatcher.Enqueue(projectPath, task)
// Next task starts only after current PR merges
3. Complexity Detection & Model Routing
Before execution, Pilot classifies the task:
| Complexity | Model | Timeout | Cost Range |
|---|---|---|---|
| Trivial | Haiku | 5m | ~$0.02 |
| Simple | Sonnet 4.6 | 15m | ~$0.20 |
| Medium | Sonnet 4.6 | 30m | ~$0.75 |
| Complex | Opus 4.6 | 60m | ~$3.00 |
Trivial tasks skip context intelligence entirely. Simple through complex tasks get full codebase context.
4. Epic Decomposition
If the task is too large for a single PR, the epic detector triggers. Detection uses structural signals (phase headers, checkboxes, word count above 100), not just keyword matching.
When triggered, Pilot:
- Sends the issue to Claude for planning
- Parses the response into 3-5 sequential subtasks
- Creates child issues, each labeled pilot
- Marks the parent issue as pilot-in-progress
- Subtasks are picked up by the poller sequentially
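To make the structural signals concrete, here is a rough sketch of how such a detector could be written. It is an illustration under assumptions, not Pilot's detector: the regular expressions, the minimum counts (two phase headers, three checkboxes), and the "at least two of three signals" rule are all hypothetical. Only the three signal kinds and the 100-word threshold come from the description above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// phaseHeader matches lines such as "## Phase 1" or "Phase 2" (assumed format).
	phaseHeader = regexp.MustCompile(`(?mi)^\s*(#+\s*)?phase\s+\d+`)
	// checkbox matches Markdown task-list items such as "- [ ] item".
	checkbox = regexp.MustCompile(`(?m)^\s*[-*]\s+\[[ xX]\]`)
)

// LooksLikeEpic reports whether an issue body shows at least two of the
// three structural signals. The thresholds are assumptions of this sketch.
func LooksLikeEpic(body string) bool {
	signals := 0
	if len(phaseHeader.FindAllString(body, -1)) >= 2 {
		signals++
	}
	if len(checkbox.FindAllString(body, -1)) >= 3 {
		signals++
	}
	if len(strings.Fields(body)) > 100 {
		signals++
	}
	return signals >= 2
}

func main() {
	body := "## Phase 1\n- [ ] schema\n- [ ] migration\n## Phase 2\n- [ ] API\n"
	fmt.Println(LooksLikeEpic(body))
	fmt.Println(LooksLikeEpic("Fix typo in README"))
}
```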
5. Execution (Claude Code Subprocess)
The executor spawns Claude Code as a subprocess:
cmd := exec.Command("claude",
"-p", prompt,
"--verbose",
"--output-format", "stream-json",
"--dangerously-skip-permissions",
)
Output is parsed as stream-json events:
- system: initialization, session info
- assistant: Claude's text responses
- tool_use: tool invocations (Read, Write, Bash, Grep, etc.)
- tool_result: tool outputs
- result: final outcome with token counts
Progress is tracked through phase detection keywords in the output stream:
Navigator Session Started → navigator
TASK MODE ACTIVATED → task-mode
PHASE: → RESEARCH → research
PHASE: → IMPL → implementing
PHASE: → VERIFY → verifying
6. Quality Gates
After Claude Code completes, quality gates run in sequence:
| Gate | Default Timeout | Purpose |
|---|---|---|
| build | 5m | Compilation check |
| test | 10m | Unit/integration tests |
| lint | 5m | Code style enforcement |
| coverage | 5m | Coverage threshold |
| security | 5m | Security scanning |
| typecheck | 5m | Type checking |
Each gate supports retries with configurable delay. On failure, the error output is fed back to Claude for automatic correction. Gate detection is automatic: Pilot infers build commands from the project type (Go, Node, Rust, Python).
7. Self-Review
Claude reviews its own diff before pushing. This catches issues that quality gates miss: naming inconsistencies, dead code, missing error handling. Self-review may push additional commits, which is why the commit SHA is refreshed from the GitHub API after PR creation.
8. PR Creation
Pilot pushes the branch (pilot/GH-{issue_number}) and creates a PR via the GitHub API. The PR body references the original issue and includes execution metadata.
Autopilot State Machine
Once a PR exists, the autopilot controller manages its lifecycle through 10 stages:
              ┌─────────────┐
              │ PR Created  │
              └──────┬──────┘
                     │
              ┌──────▼──────┐
              │ Waiting CI  │ ← Poll every 30s
              └──────┬──────┘   Refresh HeadSHA from API
                     │
           ┌─────────┴─────────┐
           │                   │
    ┌──────▼──────┐     ┌──────▼──────┐
    │  CI Passed  │     │  CI Failed  │
    └──────┬──────┘     └──────┬──────┘
           │                   │
      ┌────┴────┐         ┌────▼────────────┐
      │         │         │ Feedback Loop   │
      │  env?   │         │ Create fix issue│
      │         │         │ on same branch  │
┌─────▼──┐  ┌───▼─────┐   └────────┬────────┘
│ dev/   │  │  prod   │            │
│ stage  │  │         │    ┌───────▼───────┐
└───┬────┘  └───┬─────┘    │    New PR     │
    │           │          │ (same branch) │
    │    ┌──────▼──────┐   └───────────────┘
    │    │  Awaiting   │
    │    │  Approval   │
    │    └──────┬──────┘
    │           │ Human approves
    │           │ (Telegram/Slack/GitHub Review)
    └─────┬─────┘
          │
   ┌──────▼──────┐
   │   Merging   │ ← Squash merge (default)
   └──────┬──────┘
          │
   ┌──────▼──────┐
   │   Merged    │
   └──────┬──────┘
          │
   ┌──────▼────────┐
   │ Post-Merge CI │ ← Optional: wait for main CI
   └──────┬────────┘
          │
   ┌──────▼──────┐
   │  Releasing  │ ← Optional: auto-release
   └─────────────┘
Stage Definitions
| Stage | Description | Next |
|---|---|---|
| pr_created | PR exists, entering pipeline | waiting_ci |
| waiting_ci | Polling CI status every 30s | ci_passed or ci_failed |
| ci_passed | All required checks green | merging (dev/stage) or awaiting_approval (prod) |
| ci_failed | CI check failed | Feedback loop → new PR on same branch |
| awaiting_approval | Prod mode: waiting for human | merging on approval |
| merging | Squash merge in progress | merged |
| merged | PR merged to main | post_merge_ci or done |
| post_merge_ci | Waiting for main branch CI | releasing or done |
| releasing | Creating GitHub release | done |
| failed | Terminal state, requires intervention | (none) |
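The stage table reads naturally as a transition map. The sketch below encodes only the stage-to-stage transitions listed in the table; the `PRStage` type name follows the PRState struct described on this page, while the constant names, the `next` map, and `CanTransition` are hypothetical. The "done" outcome and the feedback loop out of ci_failed are not stages, so they appear here as stages with no successor.

```go
package main

import "fmt"

// PRStage names one stage of the autopilot lifecycle.
type PRStage string

const (
	PRCreated        PRStage = "pr_created"
	WaitingCI        PRStage = "waiting_ci"
	CIPassed         PRStage = "ci_passed"
	CIFailed         PRStage = "ci_failed"
	AwaitingApproval PRStage = "awaiting_approval"
	Merging          PRStage = "merging"
	Merged           PRStage = "merged"
	PostMergeCI      PRStage = "post_merge_ci"
	Releasing        PRStage = "releasing"
	Failed           PRStage = "failed"
)

// next lists the allowed successor stages taken from the table.
var next = map[PRStage][]PRStage{
	PRCreated:        {WaitingCI},
	WaitingCI:        {CIPassed, CIFailed},
	CIPassed:         {Merging, AwaitingApproval},
	AwaitingApproval: {Merging},
	Merging:          {Merged},
	Merged:           {PostMergeCI},
	PostMergeCI:      {Releasing},
}

// CanTransition reports whether moving from one stage to another is listed.
func CanTransition(from, to PRStage) bool {
	for _, s := range next[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(WaitingCI, CIPassed))
	fmt.Println(CanTransition(CIPassed, Merged))
}
```

Rejecting unlisted transitions in one place makes it harder for a bug elsewhere to, for example, merge a PR whose CI never passed.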
Environment Modes
| Environment | Auto-Merge | Approval | CI Wait | Use Case |
|---|---|---|---|---|
| dev | Yes | None | 5m timeout | Personal projects, fast iteration |
| stage | Yes | None | 30m timeout | Team staging, balanced safety |
| prod | No | Required | 30m timeout | Production, maximum control |
Safety Controls
- Circuit breaker: Pauses after max_failures consecutive failures (default: 3)
- Merge rate limit: max_merges_per_hour prevents runaway automation (default: 10)
- Approval timeout: Prod mode times out after 1 hour without approval
- Budget limits: Per-task and daily spending caps with warn/pause/stop actions
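As an illustration of the first control, a consecutive-failure circuit breaker needs very little state. This is a minimal sketch with hypothetical names (`Breaker`, `Record`, `Open`); how Pilot resets or resumes after a pause is not covered here.

```go
package main

import "fmt"

// Breaker trips after maxFailures consecutive failures.
type Breaker struct {
	maxFailures int
	consecutive int
}

// NewBreaker creates a breaker with the given threshold (3 by default in the text above).
func NewBreaker(maxFailures int) *Breaker {
	return &Breaker{maxFailures: maxFailures}
}

// Record updates the failure streak; any success resets it to zero.
func (b *Breaker) Record(success bool) {
	if success {
		b.consecutive = 0
		return
	}
	b.consecutive++
}

// Open reports whether automation should pause.
func (b *Breaker) Open() bool {
	return b.consecutive >= b.maxFailures
}

func main() {
	b := NewBreaker(3)
	for _, ok := range []bool{false, false, true, false, false, false} {
		b.Record(ok)
		fmt.Println(ok, b.Open())
	}
}
```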
Component Interaction Diagram
┌─────────────────────────────────────────────────────────────────────┐
│                          cmd/pilot/main.go                          │
│  Parses CLI flags, loads config, wires components, starts adapters  │
└─────┬─────────────┬──────────────┬────────────────┬─────────────────┘
      │             │              │                │
      ▼             ▼              ▼                ▼
┌──────────┐  ┌──────────┐  ┌────────────┐  ┌──────────────┐
│ GitHub   │  │ Telegram │  │  Executor  │  │  Autopilot   │
│ Poller   │  │ Bot      │  │            │  │  Controller  │
│          │  │          │  │ ┌────────┐ │  │              │
│ Issues → │  │ Msgs →   │  │ │Runner  │ │  │ PR lifecycle │
│ Executor │  │ Executor │  │ │Backend │ │  │ CI monitor   │
│          │  │          │  │ │Quality │ │  │ Auto-merger  │
└──────────┘  └──────────┘  │ │Alerts  │ │  │ Feedback loop│
┌──────────┐  ┌──────────┐  │ └────────┘ │  └───────┬──────┘
│ Discord  │  │ Plane    │  └──────┬─────┘          │
│ Bot      │  │ Poller   │         │                │
│          │  │          │         │                │
│ Cmds →   │  │ Issues → │         │                │
│ Executor │  │ Executor │         │                │
└──────────┘  └──────────┘         │                │
                                   │                │
                  ┌────────────────┴────────────────┤
                  │                                 │
                  ▼                                 ▼
            ┌──────────┐                      ┌──────────┐
            │  Memory  │                      │  Alerts  │
            │ (SQLite) │                      │  Engine  │
            │          │                      │          │
            │ Patterns │                      │ Slack    │
            │ History  │                      │ Telegram │
            └──────────┘                      └──────────┘
Data Flow Through Key Structs
Task (input)                    ExecutionResult (output)
├─ Title, Body                  ├─ Success bool
├─ ProjectPath                  ├─ CommitSHA
├─ BranchName                   ├─ BranchName
├─ IssueNumber                  ├─ PRNumber, PRURL
├─ Images []string              ├─ TokensIn, TokensOut
└─ Labels []string              ├─ Cost
                                └─ Duration
     │                               │
     │       runner.Execute()        │
     └──────────────────────────────▶│
                                     │
                                     ▼
                            PRState (autopilot)
                            ├─ PRNumber
                            ├─ HeadSHA (refreshed from API)
                            ├─ BranchName
                            ├─ Stage (PRStage enum)
                            ├─ CIStatus
                            └─ MergeAttempts
Context Intelligence
Pilot's context intelligence layer provides structured codebase context. When Pilot detects a .agent/ directory in the project, it activates the context engine by injecting a session initialization command into the prompt.
Pilot Prompt Construction:
┌────────────────────────┐
│ System instructions    │ ← Model routing, task constraints
├────────────────────────┤
│ Issue title + body     │ ← The actual work to do
├────────────────────────┤
│ "Start my Navigator    │ ← Activates context engine
│  session."             │
├────────────────────────┤
│ Branch + PR metadata   │ ← Git context
├────────────────────────┤
│ Budget constraints     │ ← Token/cost limits
└────────────────────────┘
Context engine phases detected in the execution stream:
| Phase | Signal | Description |
|---|---|---|
| navigator | Navigator Session Started | Context engine loaded, docs indexed |
| task-mode | TASK MODE ACTIVATED | Task planning begins |
| research | PHASE: → RESEARCH | Codebase exploration |
| implementing | PHASE: → IMPL | Writing code |
| verifying | PHASE: → VERIFY | Running tests, checking output |
| complete | Completion signal | Task finished |
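Phase detection of this kind amounts to substring matching on each output line. The sketch below is an assumption-laden illustration: `DetectPhase` and `phaseKeywords` are hypothetical names, the PHASE lines are matched on their keyword only so the separator glyph does not matter, and the "complete" phase is omitted because its exact signal text is not given above.

```go
package main

import (
	"fmt"
	"strings"
)

// phaseKeywords maps the keyword on a "PHASE:" line to a phase name.
var phaseKeywords = []struct{ keyword, phase string }{
	{"RESEARCH", "research"},
	{"IMPL", "implementing"},
	{"VERIFY", "verifying"},
}

// DetectPhase maps one line of output to a phase name, or "" if none matches.
func DetectPhase(line string) string {
	switch {
	case strings.Contains(line, "Navigator Session Started"):
		return "navigator"
	case strings.Contains(line, "TASK MODE ACTIVATED"):
		return "task-mode"
	case strings.Contains(line, "PHASE:"):
		for _, k := range phaseKeywords {
			if strings.Contains(line, k.keyword) {
				return k.phase
			}
		}
	}
	return ""
}

func main() {
	for _, line := range []string{
		"Navigator Session Started",
		"PHASE: → IMPL",
		"reading file main.go",
	} {
		fmt.Printf("%q => %q\n", line, DetectPhase(line))
	}
}
```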
Context intelligence is Pilot's core value proposition. The prompt injection (Start my Navigator session.) must never be removed: it is what gives Pilot deep codebase understanding rather than generic code generation.
CI Feedback Loop
When CI fails on a Pilot PR, the feedback loop activates:
  CI Failure Detected
           │
           ▼
┌─────────────────────┐
│ Close failed PR     │ ← Unblocks sequential poller
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Create fix issue    │ ← Title: "Fix CI: {original title}"
│ with metadata       │ ← Body: <!-- autopilot-meta branch:pilot/GH-123 -->
│                     │ ← Labels: pilot, autopilot-fix
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Poller picks up     │ ← parseAutopilotBranch() reads metadata
│ fix issue           │ ← Checks out SAME branch, not new one
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ Execute fix         │ ← Claude sees CI error logs
│ on original branch  │ ← Pushes fix commits
└──────────┬──────────┘
           │
┌──────────▼──────────┐
│ New PR from same    │ ← Same branch, fresh PR
│ branch              │ ← Re-enters autopilot pipeline
└─────────────────────┘
The branch metadata comment (<!-- autopilot-meta branch:X -->) ensures the fix happens on the original branch rather than creating a new one.
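Reading the branch back out of that comment is a small regular-expression job. The sketch below shows one way it could be done; `ParseBranch` is an illustrative stand-in written for this page, not the source of `parseAutopilotBranch()`, and the tolerance for extra whitespace is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// metaRe captures the branch name from the hidden HTML comment.
var metaRe = regexp.MustCompile(`<!--\s*autopilot-meta\s+branch:(\S+)\s*-->`)

// ParseBranch returns the branch recorded in an issue body, and false
// when the body carries no autopilot metadata comment.
func ParseBranch(body string) (string, bool) {
	m := metaRe.FindStringSubmatch(body)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	body := "CI failed on lint.\n<!-- autopilot-meta branch:pilot/GH-123 -->"
	branch, ok := ParseBranch(body)
	fmt.Println(branch, ok)
}
```

Because the metadata lives in an HTML comment, it stays invisible in the rendered issue while remaining machine-readable for the poller.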
Budget Enforcement
Budget limits are enforced at two levels:
| Level | Limits | Action on Exceed |
|---|---|---|
| Per-task | Token count, duration | Stop current execution |
| Daily/Monthly | USD spending cap | Warn at 80%, pause new tasks, or stop all |
Actions escalate through three levels:
- warn: Notify but continue execution
- pause: Finish current task, block new ones
- stop: Terminate immediately
Budget status is checked before each task dispatch and during execution via the alert bridge.
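The escalation can be sketched as a pure function from spending to action. This is one possible reading of the table, not Pilot's budget code: the `Limits` struct, the `Evaluate` function, and the idea that the action taken at the cap is a configured choice between pause and stop are assumptions. Only the 80% warning threshold and the three action names come from the text.

```go
package main

import "fmt"

// Action is the escalation level chosen for the current spending.
type Action int

const (
	None Action = iota
	Warn
	Pause
	Stop
)

// Limits is a hypothetical budget configuration.
type Limits struct {
	CapUSD   float64 // spending cap for the period
	WarnAt   float64 // fraction of the cap that triggers a warning, e.g. 0.8
	OnExceed Action  // Pause or Stop once the cap is reached (assumed configurable)
}

// Evaluate returns the action for the amount spent so far.
func Evaluate(spentUSD float64, l Limits) Action {
	switch {
	case spentUSD >= l.CapUSD:
		return l.OnExceed
	case spentUSD >= l.CapUSD*l.WarnAt:
		return Warn
	default:
		return None
	}
}

func main() {
	l := Limits{CapUSD: 10, WarnAt: 0.8, OnExceed: Pause}
	for _, spent := range []float64{5, 9, 12} {
		fmt.Println(spent, Evaluate(spent, l))
	}
}
```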
Configuration Hierarchy
# ~/.pilot/config.yaml
gateway: # HTTP + WebSocket server binding
adapters: # Input sources (telegram, github, slack, linear, jira, discord, plane)
  github:
    polling:
      interval: 30s
      label: "pilot"
  discord:
    bot_token: ""
    guild_id: ""
    channel_ids: []
  plane:
    api_url: ""
    api_token: ""
    project_board: ""
orchestrator: # Execution mode, autopilot config
  execution_mode: sequential # sequential | parallel | auto
  autopilot:
    environment: stage # dev | stage | prod
    auto_merge: true
    ci_poll_interval: 30s
    max_failures: 3 # Circuit breaker threshold
executor: # Backend, model routing, decomposition
  backend: claude-code
  model_routing:
    trivial: claude-haiku-4-5-20251001
    simple: claude-sonnet-4-6
    medium: claude-sonnet-4-6
    complex: claude-opus-4-6
memory: # SQLite persistence
quality: # Gate configuration (build, test, lint, coverage)
alerts: # Event routing to notification channels
budget: # Cost controls and limits
projects: # Multi-project configuration
Database Schema
Pilot persists execution history and learned patterns in SQLite (WAL mode for concurrency):
-- Task execution history
executions (id, task_id, project_path, status, started_at,
            completed_at, duration_ms, output, error, commit_sha, pr_url)
-- Cross-project patterns (learned behaviors)
cross_patterns (id, title, description, type, scope,
                confidence, occurrences, is_anti_pattern)
-- Task queue for dispatcher serialization
task_queue (id, project_path, task_json, status,
            created_at, started_at, completed_at)
What's Next
This architecture supports the following planned documentation pages:
- Model Routing: Deep dive into complexity classification and model selection
- Security Model: Sandbox mode, webhook validation, token handling, trust boundaries