Multi-CLI Parallelism & Conflict Prevention

Executive Summary

Critical finding: In any shared-main scenario without filesystem isolation, the last write wins. Both agents read identical file state, make independent decisions, and write back to the same paths, so the earlier write is silently lost and no merge conflict is raised. No surveyed practitioner advocates shared-main-plus-locks; every surveyed production deployment uses one worktree per agent.^[2]

The evidence across practitioner deployments, academic datasets, and open-source tooling has converged on a single isolation primitive: one task = one branch = one worktree = one agent. Branch-per-issue without filesystem isolation fails for the same reason as shared-main — all agents share one working directory, so a checkout by any agent changes what all others are reading and writing, silently corrupting their context regardless of branch-level isolation.^[2]^[14] Git worktrees solve this with separate checked-out copies sharing a single .git object store: each agent gets an independent working directory, index, and branch reference, with disk overhead of approximately 5 GB per worktree on a 2 GB codebase.^[2] Claude Code added native worktree support in v2.1.50 via the --worktree flag, which auto-creates named worktrees at .claude/worktrees/<name>/.^[1]^[15]

The AgenticFlict dataset — 107,026 simulated AI-generated pull requests across 59,000+ repositories — provides the most rigorous measurement of conflict rates at scale: 27.67% overall conflict rate, with per-agent variability ranging from 15.24% (GitHub Copilot) to 31.85% (OpenAI Codex).^[4] Each conflicting PR averaged 4.36 affected files and 11+ conflict regions. The near-linear relationship between PR size and conflict rate is the sharpest implication: small PRs conflict at ~9.9%, while large PRs conflict at ~32–33%.^[4] The practical corollary is that task scoping is a coordination mechanism, not just a code quality practice — keeping each agent's changeset small and file-exclusive is the single highest-leverage intervention available before any coordination infrastructure is deployed.

Textual merge conflicts, however, are only the detectable subset of coordination failures. The Semantic Consensus Framework research identifies Semantic Intent Divergence (SID) — agents developing conflicting interpretations of shared objectives even when their file edits do not textually conflict.^[5] Across 600 controlled experimental runs on AutoGen, CrewAI, and LangGraph, baseline workflow failure rates ranged from 41% to 86.7% depending on the framework. Ungoverned execution achieved just 0.2% completion; the Semantic Consensus Framework reached 100% by capturing agent intentions as a directed graph before execution.^[5] The root cause attribution is the most operationally significant finding in the corpus: 79% of multi-agent failures stem from specification and coordination gaps, not model capability limitations.^[5] Better models do not solve the coordination problem.

Worktrees isolate the filesystem only. Teams running 4+ parallel agents on real infrastructure collide on five additional resource categories: ports, SQLite database files, PostgreSQL schemas, Docker daemons, and Redis key namespaces.^[12] Each must be explicitly partitioned per agent. Failures on shared ports or databases produce symptoms indistinguishable from code bugs, so teams spend debugging cycles on the code when attributing the failure to the coordination layer would resolve it quickly. The incident.io case study documents one successful configuration: 4–5 simultaneous agents, each assigned a domain with zero file overlap (UI, build tools, tests, backend).^[13] Three independent sources converge on 5–8 concurrent agents as the practical ceiling, limited not by compute, disk, or coordination overhead but by human review capacity.^[2]^[3]^[12]

For issue tracking and work assignment, multiple independent implementations have settled on local SQLite over GitHub Issues. The structural argument is latency and rate limits: GitHub's API returns results in 100–500 ms with a ceiling of 5,000 authenticated requests per hour, while SQLite queries resolve in microseconds with no rate limit.^[6] The behavioral argument is atomicity: GitHub Issues has no native atomic claim operation, which exposes work assignment to time-of-check-to-time-of-use (TOCTOU) races when multiple agents query the same backlog simultaneously. SQLite's UNIQUE constraint solves this with a single INSERT: if two agents attempt to claim the same issue, exactly one succeeds and the other's INSERT is rejected with a constraint violation (reported to the agent as CONFLICT), with no polling required.^[7]^[11] Both Beads and parallel-cc independently converged on this two-layer pattern: SQL enforces atomicity, while lane JSON files on the filesystem provide fast local reads for agents checking their own state.

Dead-agent recovery is a first-class concern in any multi-agent system. Pessimistic locking without a lease or timeout fails categorically: a dead agent holding a lock blocks the locked work indefinitely. Heartbeat-based reclamation handles this without human intervention: each active agent updates a timestamp, and a configurable stale timeout releases uncompleted work back to the queue when the heartbeat stops.^[9]^[11] The Kleisli.IO analysis reframes the entire coordination problem: "Multi-session AI agent workflows are concurrent processes with private state, independent failure modes, and shared mutable resources." Treating them as distributed systems problems motivates lock-free primitives (atomic SQL claims, event sourcing, CRDTs, stigmergic traces) over approaches borrowed from single-process concurrency.^[9]

The recce.hq 4-gate architecture documents the most concretely measured quality gate configuration in the corpus: implementing four sequential gates increased weekly commits from 20–50 to 100–200+ while maintaining code quality standards.^[8] The architecture distinguishes hard (binary) gates from soft (LLM-readable) gates. Instruction files — AGENTS.md and CLAUDE.md — are Gate 1 and soft: they define file ownership, prohibited zones, and build commands, but agents can rationalize exceptions to text rules. Pre-commit hooks (Biome linting) are Gate 2 and hard: binary pass/fail, no exceptions rationalized. Pre-push hooks (typecheck, full test suite, security scans) are Gate 3 and hard. Human review is Gate 4 and soft, reserved for logical and business logic concerns that binary checks cannot evaluate.^[8] The architectural principle is direct: "Hooks are binary — agents cannot rationalize exceptions." An agent can decide to ignore a CLAUDE.md rule; it cannot decide to make a failing test suite pass. Instruction files are necessary documentation but require binary enforcement to function as real coordination gates.

A coordination failure mode absent from the merge-conflict literature deserves explicit attention: agent "dementia," documented by Beads, describes agents whose context window exhausts within approximately 10 minutes, causing them to forget their current scope and re-explore the codebase — overlapping with other agents' claimed work without any file-level collision detectable by git.^[6] Fine-grained task scoping addresses this directly: keeping agents early in their context windows, where decision quality and scope adherence are highest, is a coordination benefit independent of any quality benefit.

Practitioners building multi-CLI systems should sequence their investments in this order: First, establish filesystem isolation via worktrees and explicit resource partitioning (ports, databases, namespaces) per agent — without this, no higher-level coordination is reliable. Second, implement atomic SQLite-based work assignment with heartbeat reclamation to handle the full crash-and-recovery lifecycle without human intervention. Third, add binary pre-push gates (typecheck, tests) as the merge quality layer — instruction files alone are insufficient. Fourth, enforce small-task scoping as a structural rule, not a guideline: empirically, tasks that touch fewer files produce conflicts at roughly one-third the rate of large tasks, and small tasks also protect agent context coherence. The ceiling of 5–8 concurrent agents is not a technical limit but a human review limit — teams that automate their review tier (Generator-Verifier pattern, automated PR checks) can extend it further, but teams that skip the infrastructure to reach that ceiling faster will spend the savings on debugging invisible coordination failures.

Section 1: Isolation Architecture — The Worktree-Per-CLI Consensus

The dominant pattern across practitioner sources and production deployments is unambiguous: one task = one branch = one worktree = one agent. Running multiple Claude Code sessions against the same working directory without filesystem isolation causes agents to overwrite each other's edits and corrupt each other's context — described by one practitioner as "all hell breaks loose."^[14]^[11] No surveyed source advocates shared-main with file locks as a preferred approach.

Isolation Strategy Comparison

Strategy	Filesystem Isolation	Branch Isolation	Disk Overhead	Setup Complexity	Recommended By
Worktree-per-CLI^[1]	Yes (separate checkout)	Yes (independent branches)	~5 GB per worktree on 2 GB codebase^[2]	Low (1 command)	appxlab.io^[2], shareuhack.com^[3], mindstudio.ai^[12], claudefa.st^[1], Dev.to^[14], recce.hq^[8]
Branch-per-issue, shared checkout	No	Yes	None	None	Not recommended — shared working directory causes context corruption^[2]^[14]
Shared-main + locks	No	No	None	High (lock management)	0 sources — explicit anti-pattern^[2]
Full clone per agent	Yes	Yes	Doubles codebase per agent	Low	(standard git alternative; not discussed by surveyed practitioners — see git-worktree docs)

The branch-per-issue, shared checkout approach fails because all agents share the same working directory: a checkout by one agent changes what all other agents are reading and writing, causing silent context corruption regardless of branch-level isolation.^[2]^[14]

Section 2: Technical Foundations of Git Worktree Isolation

Git worktrees create separate checked-out copies that share a single .git object store. Each agent has an independent working directory, index, and branch reference, while disk usage remains lower than full clones because objects are shared.^[1]^[2]

Native Claude Code Support (v2.1.50+)

Claude Code added native worktree support in v2.1.50 via the --worktree flag. The flag auto-creates named worktrees at .claude/worktrees/<name>/, each on a new branch named worktree-<name> created from the default remote branch.^[1]^[15]^[3]

Worktree Lifecycle Commands

Operation	Native Claude Code (v2.1.50+)	Manual (pre-v2.1.50 / non-CC)	Source
Create named worktree	`claude --worktree feature-auth`	`git worktree add ../project-feature-a -b feature-a`	^[1]^[14]
Create auto-named worktree	`claude --worktree`	—	^[3]
Launch agent in worktree	Built-in	`cd ../project-feature-a && claude`	^[14]
Cleanup	Auto (no changes) / persists (modified)	`git worktree remove ../project-feature-a`	^[1]^[15]
Batch parallel launch	`/batch` command	—	^[15]
Subagent isolation	`isolation: "worktree"` frontmatter	—	^[15]

Cleanup Behavior

Worktrees with no changes are automatically deleted upon session completion. Modified worktrees persist for human review. Teams add .claude/worktrees/ to .gitignore to prevent version control pollution.^[1]^[15] See Section 8 for heartbeat-based stale detection and automatic work reclamation patterns.
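For tools without native worktree support, the manual lifecycle can be scripted. The following is a minimal sketch, not any surveyed tool's implementation: the helper names and the sibling-directory layout (`../<repo>-<task>`, modeled on the manual commands in the table) are assumptions for illustration.

```python
import subprocess
from pathlib import Path


def worktree_path(repo: Path, task: str) -> Path:
    """Sibling directory for the task, e.g. ../project-feature-a (assumed layout)."""
    return repo.parent / f"{repo.name}-{task}"


def worktree_add_command(repo: Path, task: str) -> list[str]:
    """argv for `git worktree add <dir> -b <branch>`: one task, one branch, one directory."""
    return ["git", "-C", str(repo), "worktree", "add",
            str(worktree_path(repo, task)), "-b", task]


def worktree_remove_command(repo: Path, task: str) -> list[str]:
    """argv for `git worktree remove <dir>`, used once the branch is merged or abandoned."""
    return ["git", "-C", str(repo), "worktree", "remove",
            str(worktree_path(repo, task))]


def create_worktrees(repo: Path, tasks: list[str]) -> None:
    """Create one worktree per task; raises CalledProcessError if git fails."""
    for task in tasks:
        subprocess.run(worktree_add_command(repo, task), check=True)
```

Keeping the argv construction separate from `subprocess.run` makes the mapping from task to branch and directory easy to inspect before anything touches the repository.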

Section 3: Shared Resources Not Isolated by Worktrees

Worktrees isolate the filesystem only. Teams running 4+ parallel agents frequently collide on other shared resources. mindstudio.ai documents five resource categories requiring explicit per-agent isolation beyond worktrees.^[12]
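Per-agent partitioning of this kind can be derived mechanically from an agent index. The sketch below is illustrative only: the `AgentResources` type, the `resources_for` helper, and the PostgreSQL schema naming are assumptions, while the port, SQLite file, and Redis prefix conventions follow the examples given for this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentResources:
    port: int           # dev-server port owned by this agent
    sqlite_path: str    # private SQLite test database file
    redis_prefix: str   # key namespace inside a shared Redis instance
    pg_schema: str      # hypothetical per-agent PostgreSQL schema name


def resources_for(agent_index: int, base_port: int = 8000) -> AgentResources:
    """Derive non-overlapping resources for agent 1, 2, 3, ..."""
    if agent_index < 1:
        raise ValueError("agent_index starts at 1")
    return AgentResources(
        port=base_port + agent_index - 1,
        sqlite_path=f"test_agent{agent_index}.db",
        redis_prefix=f"agent{agent_index}:",
        pg_schema=f"agent{agent_index}",
    )
```

Each agent's launcher would export these values (for example as environment variables) before starting the session, so no two agents bind the same port or write the same database file.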

Resource Isolation Requirements per Agent

Resource	Isolation Mechanism	Example
Ports^[12]	Per-agent port assignment	Agent 1: 8000, Agent 2: 8001, Agent 3: 8002
SQLite databases^[12]	Per-agent DB file	`test_agent1.db`, `test_agent2.db`
PostgreSQL^[12]	Per-agent schema or database branch	Neon database branching
Docker daemons^[12]	Per-agent daemon or container namespace	—
Redis^[12]	Per-agent key prefix	`agent1:`, `agent2:`
API keys / secrets^[12]	Per-agent credentials or sandboxed accounts	E2B sandbox integration^[11]

Section 4: Concurrent Edit Conflicts — Scale and Measurement

The AgenticFlict dataset provides the most rigorous large-scale measurement of AI-generated merge conflicts to date: 107,026 simulated AI-generated pull requests across 59,000+ repositories, revealing a 27.67% overall conflict rate.^[4]

AgenticFlict Dataset: Key Statistics

Metric	Value	Source
Total simulated PRs analyzed	107,026	^[4]
Repositories sampled	59,000+	^[4]
Overall conflict rate	27.67%	^[4]
Avg. affected files per conflicting PR	4.36	^[4]
Avg. conflict regions per conflicting PR	11+	^[4]
GitHub Copilot conflict rate	15.24%	^[4]
OpenAI Codex conflict rate	31.85%	^[4]
Small PR conflict rate	~9.9%	^[4]
Large PR conflict rate	~32–33%	^[4]

The ~2× difference in conflict rates between AI systems (Copilot 15.24% vs. Codex 31.85%) suggests that the choice of coding agent affects coordination overhead as well as code quality.^[4] The near-linear relationship between PR size and conflict rate provides a strong empirical argument for small, scoped tasks: small PRs conflict at roughly one-third the rate of large ones.
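The two headline ratios can be checked directly from the reported rates. This is plain arithmetic on the figures quoted from the dataset; the variable names are illustrative.

```python
# Conflict rates (percent) as reported for the AgenticFlict dataset.
COPILOT, CODEX = 15.24, 31.85
SMALL_PR, LARGE_PR_LOW, LARGE_PR_HIGH = 9.9, 32.0, 33.0

# How much more often Codex PRs conflict than Copilot PRs.
agent_ratio = CODEX / COPILOT

# Small-PR conflict rate as a fraction of the large-PR rate (range).
size_ratio_range = (SMALL_PR / LARGE_PR_HIGH, SMALL_PR / LARGE_PR_LOW)

print(f"Codex vs. Copilot: {agent_ratio:.2f}x")
print(f"small vs. large PRs: {size_ratio_range[0]:.2f} to {size_ratio_range[1]:.2f}")
```

The first ratio comes out at about 2.09 and the second at about 0.30 to 0.31, matching the "~2×" and "roughly one-third" characterizations.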

File Claim Coordination: Pre-Write vs Post-Write Detection

The parallel-cc project implements file claim coordination as a pre-write registry — agents declare file ownership before writing, preventing concurrent writes from being attempted at all.^[11] This operates upstream of git's post-write conflict detection. Additionally, parallel-cc uses AST analysis — not just line-level diffs — for merge conflict detection, understanding code structure rather than text changes.^[11]
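A pre-write claim registry can be sketched with SQLite as the arbiter. This is an assumption-laden illustration, not parallel-cc's actual schema or API: the `file_claims` table and the `claim_files` helper are hypothetical, and the all-or-nothing behavior is a design choice made for this example.

```python
import sqlite3


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical file-claim registry."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_claims ("
        " path TEXT PRIMARY KEY,"      # one owner per file path
        " agent_id TEXT NOT NULL)"
    )
    return conn


def claim_files(conn: sqlite3.Connection, agent_id: str, paths: list[str]) -> bool:
    """Claim every path or none: a single conflict rolls back the whole batch."""
    try:
        with conn:  # one transaction: commit on success, rollback on error
            conn.executemany(
                "INSERT INTO file_claims (path, agent_id) VALUES (?, ?)",
                [(p, agent_id) for p in paths],
            )
        return True
    except sqlite3.IntegrityError:
        # Some path is already owned by another agent; nothing was claimed.
        return False
```

An agent would call `claim_files` before its first write and narrow or change its task if the call returns `False`, which moves conflict detection ahead of the edit rather than after it.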

Section 5: Semantic Intent Divergence — Beyond Textual Conflicts

The Semantic Consensus paper identifies a failure mode orthogonal to git merge conflicts: Semantic Intent Divergence (SID), where agents develop conflicting interpretations of shared objectives even when their file edits do not textually conflict.^[5] Across 600 experimental runs on AutoGen, CrewAI, and LangGraph (controlled simulations in an academic setting, not live production deployments), workflow failure rates ranged from 41% for the best-performing framework to 86.7% for the worst.^[5] No evidence of the Semantic Consensus Framework being deployed in production CLI-scale workflows appears in the research corpus.

Semantic Consensus Framework (SCF): Experimental Results

Approach	Workflow Completion Rate	Notes
Ungoverned execution	0.2%	Baseline — no coordination
Next-best baseline	25.1%	—
Semantic Consensus Framework (SCF)	100%	4,200-line Python middleware

Source: Semantic Consensus paper, 600 controlled experimental runs. Per-framework baseline failure rates: AutoGen, CrewAI, and LangGraph showed 41–86.7% workflow failure rates (13–59% completion) without coordination, depending on framework — context that makes the SCF’s 100% completion result meaningful. (Academic research context; no production CLI-scale deployments confirmed.)^[5]

SCF Root Cause Attribution

Of all multi-agent failures analyzed, 79% stem from specification and coordination issues, not model capability limitations.^[5] The SCF conflict detection achieves 65.2% recall with 27.9% precision, deliberately biased toward conservative blocking to prevent downstream cascade failures.^[5]
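To make the recall and precision figures concrete, the sketch below works through a hypothetical batch of 100 true conflicts. The batch size and the function name are assumptions for illustration; only the two rates come from the paper.

```python
def flagged_breakdown(true_conflicts: float, recall: float, precision: float):
    """Split detector output into caught, falsely flagged, and missed conflicts."""
    true_positives = true_conflicts * recall        # real conflicts caught
    flagged = true_positives / precision            # everything the detector blocks
    false_positives = flagged - true_positives      # non-conflicts blocked anyway
    missed = true_conflicts - true_positives        # real conflicts that slip through
    return true_positives, false_positives, missed


tp, fp, missed = flagged_breakdown(100, recall=0.652, precision=0.279)
print(f"caught={tp:.1f} false_alarms={fp:.1f} missed={missed:.1f}")
```

Under these assumptions the detector catches about 65 of the 100 conflicts, misses about 35, and raises roughly 2.6 false alarms for every true detection, which is what a bias toward conservative blocking looks like in practice.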

SCF Architecture Components

Component	Function
Process Context Layer^[5]	Ingests BPMN workflow definitions to establish process boundaries
Semantic Intent Graph^[5]	Captures agent intentions as a directed graph before execution
Conflict Detection Engine^[5]	Identifies contradictory intents, resource contention, causal violations
Consensus Resolution Protocol^[5]	Priority order: policy authority → capability authority → temporal priority
Drift Monitor^[5]	Detects gradual semantic divergence in long-running workflows
Process-Aware Governance^[5]	Framework-agnostic via adapter interfaces

Section 6: Issue Tracker Architecture — SQLite vs GitHub Issues vs Hybrid

Multiple independent implementations have converged on the same conclusion: agent-native issue tracking belongs in local SQLite, not GitHub Issues. The argument is both structural (network latency, API rate limits) and behavioral (agents query work at high frequency; GitHub Issues is not optimized for programmatic access patterns).^[6]^[11]

Tracker Architecture Comparison

Attribute	SQLite Local (Beads)	GitHub Issues	Hybrid (SQLite + JSONL)
Query latency	Microseconds	100–500 ms (network)	Microseconds (local path)
Rate limits	None	5,000 req/hr (Auth), 60 req/hr (Unauth) (standard published limits; see GitHub REST API documentation)	None (local operations)
Offline operation	Yes	No	Yes
Atomic claim support	Yes (UNIQUE constraint)	No native atomic operations	Yes (SQL layer)
Multi-agent concurrent writes	WAL mode + UNIQUE constraints	Race conditions on concurrent updates	WAL mode; Dolt server for high concurrency
Distributed collaboration	No native sync	Yes (central server)	Yes (git-tracked JSONL)^[6]
Production evidence	Beads^[6], parallel-cc^[11]	Not recommended for agents^[6]	Beads hybrid model^[6]

Beads Hybrid Architecture (SQLite + JSONL)

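Beads implements a two-store architecture that separates performance from collaboration: a local SQLite store serves queries in microseconds, while git-tracked JSONL files carry state between collaborators.^[6]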
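The relationship between the two stores can be sketched as a local SQLite database in WAL mode plus a JSONL export that git can track. The `issues` schema and both helper functions are hypothetical and do not reflect Beads' actual data model.

```python
import json
import sqlite3


def open_store(db_path: str) -> sqlite3.Connection:
    """Open the local store; WAL mode lets readers proceed while a writer commits."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS issues ("
        " id TEXT PRIMARY KEY, title TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def export_jsonl(conn: sqlite3.Connection, out_path: str) -> int:
    """Write one JSON object per issue, in a stable order, for git-friendly diffs."""
    rows = conn.execute(
        "SELECT id, title, status FROM issues ORDER BY id"
    ).fetchall()
    with open(out_path, "w", encoding="utf-8") as fh:
        for issue_id, title, status in rows:
            record = {"id": issue_id, "title": title, "status": status}
            fh.write(json.dumps(record, sort_keys=True) + "\n")
    return len(rows)
```

Sorting rows and keys keeps the exported file stable between runs, so a one-issue change produces a one-line diff in version control.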

High-Concurrency: Dolt Server Mode

When concurrent writes from multiple agents exceed what standard WAL mode handles, Beads recommends Dolt server mode.^[7]

Section 7: Atomic Claim Patterns and Race Condition Prevention

Two independent implementations — Beads and parallel-cc — converged on the same minimal coordination primitive: the atomic claim. The pattern uses SQLite's UNIQUE constraint as the synchronization mechanism, providing TOCTOU-safe work assignment without polling or locking.^[7]^[11]

Atomic Claim Protocol

Step	Operation	Mechanism
1	Agent queries available work	`bd ready --json` or equivalent
2	Agent attempts atomic claim	`INSERT INTO claims (issue_id, agent_id) VALUES (?, ?)`
3a (success)	SQLite UNIQUE constraint passes	Agent owns the issue; proceeds with work
3b (conflict)	SQLite UNIQUE constraint rejects duplicate	Agent receives CONFLICT; picks a different issue
4	Agent queries who holds conflicted issue	SELECT on claims table

Two-Layer Claim Architecture

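The production implementation uses a two-layer design separating atomicity from read performance: the SQL layer enforces atomic claims, while lane JSON files on the filesystem serve fast local reads.^[7]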

This design means agents reading their own state can use filesystem reads (fast), while coordination safety is enforced by the SQL layer (atomic).^[7]
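The claim protocol and the two-layer split can be sketched together. This is a minimal illustration under stated assumptions: the `claims` table mirrors the INSERT shown in the protocol table, but the lane-file name and layout (`<agent_id>.json` in a lane directory) and all function names are hypothetical.

```python
import json
import sqlite3
from pathlib import Path
from typing import Optional


def init_claims(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claims ("
        " issue_id TEXT NOT NULL UNIQUE,"   # UNIQUE is the synchronization point
        " agent_id TEXT NOT NULL)"
    )


def try_claim(conn: sqlite3.Connection, issue_id: str, agent_id: str,
              lane_dir: Path) -> bool:
    """Step 2 and 3: attempt the INSERT; exactly one agent per issue can succeed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO claims (issue_id, agent_id) VALUES (?, ?)",
                (issue_id, agent_id),
            )
    except sqlite3.IntegrityError:
        return False  # step 3b: someone else holds the issue
    # Layer 2: cache the agent's own state for fast filesystem reads.
    lane_dir.mkdir(parents=True, exist_ok=True)
    lane_file = lane_dir / f"{agent_id}.json"
    lane_file.write_text(json.dumps({"agent_id": agent_id, "issue_id": issue_id}))
    return True


def holder_of(conn: sqlite3.Connection, issue_id: str) -> Optional[str]:
    """Step 4: look up who currently holds an issue, if anyone."""
    row = conn.execute(
        "SELECT agent_id FROM claims WHERE issue_id = ?", (issue_id,)
    ).fetchone()
    return row[0] if row else None
```

The lane file is written only after the INSERT commits, so the filesystem layer never reports a claim that the SQL layer did not grant.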

Section 8: Lock-Free Coordination from Real Multi-Agent Systems

The Kleisli.IO analysis frames the core insight: "Multi-session AI agent workflows are concurrent processes with private state, independent failure modes, and shared mutable resources."^[9] This reframe — treating multi-agent AI as a distributed systems problem — motivates lock-free approaches over pessimistic locking, which is inappropriate for agents that may crash, get rate-limited, or run for unpredictable durations.
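One lock-free answer to crashed agents is heartbeat-based reclamation: a claim stays valid only while its owner keeps refreshing a timestamp. The sketch below assumes a hypothetical `claims` table with a `heartbeat` column and takes the current time as a parameter so the logic is easy to test; a real deployment would pass `time.time()` and tune the timeout.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS claims (
    issue_id  TEXT PRIMARY KEY,
    agent_id  TEXT NOT NULL,
    heartbeat REAL NOT NULL   -- seconds since epoch of the last sign of life
)
"""


def heartbeat(conn: sqlite3.Connection, agent_id: str, now: float) -> None:
    """Called periodically by a live agent to keep its claims fresh."""
    with conn:
        conn.execute(
            "UPDATE claims SET heartbeat = ? WHERE agent_id = ?", (now, agent_id)
        )


def reclaim_stale(conn: sqlite3.Connection, now: float,
                  stale_after_s: float) -> list[str]:
    """Release claims whose owner has been silent longer than the timeout."""
    cutoff = now - stale_after_s
    with conn:
        stale = [row[0] for row in conn.execute(
            "SELECT issue_id FROM claims WHERE heartbeat < ? ORDER BY issue_id",
            (cutoff,),
        )]
        conn.execute("DELETE FROM claims WHERE heartbeat < ?", (cutoff,))
    return stale
```

Because a released issue simply becomes claimable again, a dead agent delays its own work by at most the timeout window and never blocks other agents.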

Lock-Free Coordination Mechanisms

Mechanism	Implementation	Trade-offs	Source
Atomic SQL claim	SQLite UNIQUE INSERT	Fast, no polling; requires shared DB file access	^[7]^[11]
Event sourcing	Append-only JSONL logs	Full audit trail; logs grow without compaction	^[9]
CRDTs	Conflict-free replicated data types	Convergent merges; overhead for solo devs; no arbitrary analytics	^[9]
Stigmergic coordination	Sessions leave traces for subsequent sessions	Indirect; no direct communication needed	^[9]
Heartbeat + stale timeout	Timestamp update + configurable reclaim window	Automatic recovery; requires tuning timeout window	^[11]
File claim registry	Pre-write ownership declaration	Prevents conflicts before they occur; requires cooperative agents	^[11]

Three Distributed Coordination Query Types

The Kleisli framework identifies three fundamental coordination query patterns that any multi-agent system must address:^[9]

Cross-Session Knowledge Transfer via Stigmergy

The kli system validated stigmergic coordination in production: Session 1 discovered an "alist/plist mismatch that silently broke WebSocket communication." Subsequent sessions inherited the diagnostic reasoning, not just the fixed code. This demonstrates that lock-free systems can transfer analytical context across sessions without direct inter-session communication.^[9]
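Stigmergic transfer of this kind can be approximated with an append-only JSONL log that later sessions read before starting work. The event fields (`session`, `kind`, `note`) and both helpers are assumptions for illustration; the source does not describe kli's actual event format.

```python
import json
from pathlib import Path


def append_event(log: Path, session: str, kind: str, note: str) -> None:
    """Append one trace; existing lines are never rewritten."""
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"session": session, "kind": kind, "note": note}) + "\n")


def read_traces(log: Path, kind: str) -> list[dict]:
    """Return earlier sessions' traces of one kind, oldest first."""
    if not log.exists():
        return []
    with log.open(encoding="utf-8") as fh:
        events = [json.loads(line) for line in fh if line.strip()]
    return [event for event in events if event["kind"] == kind]
```

A session that records its diagnostic reasoning as a `diagnosis` event leaves it available to every later session, with no direct communication between the two.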

Section 9: Pre-Push and Merge Gate Architecture

The recce.hq 4-gate architecture documents a shipped configuration that increased weekly commits from 20–50 to 100–200+ while maintaining code quality standards.^[8] The architecture is additive — each gate layer handles a distinct class of failure.
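A hard (binary) gate reduces to a script that runs each check and exits non-zero on the first failure. The sketch below is a generic illustration, not recce.hq's hook: the `run_gate` helper and the commands in `EXAMPLE_CHECKS` are assumptions and would be replaced by a project's own type checker and test runner.

```python
import subprocess
import sys

# Hypothetical check commands; substitute the project's real tooling.
EXAMPLE_CHECKS = [
    ["npx", "tsc", "--noEmit"],   # type checking
    ["npm", "test"],              # full test suite
]


def run_gate(checks: list[list[str]]) -> int:
    """Run checks in order; return 0 only if every one exits with status 0."""
    for argv in checks:
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(f"gate failed: {' '.join(argv)}", file=sys.stderr)
            return 1  # binary outcome: no partial credit, no exceptions
    return 0
```

Installed as a pre-push hook, the script would end with `sys.exit(run_gate(EXAMPLE_CHECKS))`; git aborts the push whenever the hook exits non-zero.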

4-Gate Architecture: Shipped Configuration

Gate	Type	Mechanism	What It Blocks
Gate 1: Instruction Files^[8]	Soft (LLM-read)	`AGENTS.md` (universal) + `CLAUDE.md` (tool-specific)	Misaligned agent behavior; defines prohibited zones and state file rules
Gate 2: Pre-commit^[8]	Hard (binary)	Biome linting; missing type annotations, unused variables, `any` types	Style drift; agents cannot rationalize exceptions
Gate 3: Pre-push^[8]	Hard (binary)	Type checking (mypy/tsc), full test suite, security scans	Type errors, test failures, security issues leaving local development
Gate 4: Human Review^[8]	Soft (judgment)	Multi-agent cross-review: Claude (dev), Copilot (mechanics), humans (edge cases)	Logical errors, edge cases, business logic misalignment

Generator-Verifier at the Merge Gate Level

The Anthropic coordination patterns document describes the Generator-Verifier pattern applied at the PR level: one agent produces output, another validates against explicit criteria.^[10] Applied to merge gates, this means a dedicated verification agent runs against every PR before merge is allowed. Weakness: vague verification criteria break the pattern — explicit, measurable criteria are required.^[10]

Measured Impact of Gate Architecture

Metric	Before Gates	After 4-Gate Architecture	Source
Weekly commits	20–50	100–200+	^[8]
Change set size	Large, hard to review	Smaller, more reviewable (60–90% context savings reported)	^[8]

Section 10: Production Concurrency Limits and Evidence

Three independent sources converge on a practical ceiling of 5–8 concurrent agents on the same codebase. The limiting factor is not compute, disk, or coordination overhead — it is human review capacity.^[2]^[3]^[12]

Concurrent Agent Ceiling: Multi-Source Summary

Source	Cited Ceiling	Primary Bottleneck
appxlab.io^[2]	5–7 agents	Review overhead + resource contention
shareuhack.com^[3]	5–7 agents	"Productive ceiling" (practical, not technical)
mindstudio.ai^[12]	5–8 agents	Review capacity, not technical limits

Production Case Studies

Team	Simultaneous Agents	Domain Partitioning	Outcome	Source
incident.io	4–5 agents	UI, build tools, tests, backend — "each in its own file domain with zero overlap"	Successful parallel development	^[13]
Boris Cherny (Claude Code creator)	10–15 sessions	5 terminal sessions × 5 git checkouts + 5–10 browser/mobile sessions	Sustained operation; "less intervention = faster results"	^[3]^[13]

Note on the Cherny anomaly: Cherny's 10–15 simultaneous sessions appear to contradict the 5–8 ceiling. The resolution: Cherny's sessions include browser/mobile sessions on independent tasks, whereas the 5–8 ceiling refers specifically to simultaneous code-writing agents on the same codebase requiring human review and merge.^[3]^[13]

Resource Budget for Concurrent Worktrees

Each worktree consumes approximately 5 GB on a 2 GB codebase.^[2] Six concurrent agents require 30+ GB of disk. This is a non-trivial infrastructure constraint for on-device development.

Practitioner Skepticism: Real Costs

One practitioner (Dev.to) reports that parallel development "sucks up tokens" to the point of exceeding Claude Pro subscription limits.^[14] The same practitioner describes the human coordination overhead as "endlessly ping-ponging between rooms" and states that parallel development has not become their default workflow despite theoretical benefits.^[14]

Section 11: Agent Coordination Models and Orchestration Patterns

Anthropic documents five production multi-agent patterns with distinct coordination architectures.^[10] The Agent Teams pattern maps directly to multiple concurrent Claude Code CLIs on the same repository.

Anthropic's Five Production Coordination Patterns

Pattern	Best For	Maps to Multi-CLI	Known Weakness
Generator-Verifier^[10]	Quality-critical output with clear evaluation standards	PR merge gates	Vague verification criteria break the pattern
Orchestrator-Subagent^[10]	Clear task decomposition with bounded subtasks	One orchestrator CLI dispatches workers	Orchestrator becomes information bottleneck
Agent Teams^[10]	Parallel workloads, independent features	Direct match — multiple CLIs per worktree	Requires careful partitioning to avoid conflicts
Message Bus^[10]	Growing agent ecosystems	Publish/subscribe event coordination	Debugging complex event cascades is difficult
Shared State^[10]	Collaborative research; no single point of failure	Shared SQLite tracker with atomic claims	Reactive loops and duplicate work require explicit termination

Anthropic recommendation: Start with Orchestrator-Subagent — handles the widest range of tasks with minimal overhead. Evolve to other patterns as limitations emerge.^[10]

Three Practical Coordination Models (mindstudio.ai)

Model	Structure	Best Fit	Merge Strategy
Fully Independent Agents^[12]	Separate features, no dependencies	Frontend UI + backend API worked in parallel	Merge in any order
Operator + Workers^[12]	One orchestrator, specialized sub-agents	Divide-and-conquer problems	Operator coordinates merge order
Split-and-Merge^[12]	Single complex task distributed across agents	Large refactors	Explicit merge strategy required before split

Task Decomposition as Conflict Prevention

appxlab.io documents a three-principle decomposition checklist that measurably reduces coordination conflicts:^[2]

appxlab.io also reports that "incorporating architectural documentation into agent context produces measurable gains in functional correctness" — agents given explicit scope boundaries produce fewer coordination failures than agents given only task descriptions.^[2]

PR Merge Strategy for Parallel Branches

appxlab.io recommends: open draft PRs immediately to surface scope overlap early; merge in dependency order; use lead-agent orchestration for speed (requires clean decomposition) or human review gates for production safety.^[2]
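Merging in dependency order is a topological sort over the branches. The sketch below uses Python's standard `graphlib`; the branch names and the `merge_order` helper are hypothetical, and the dependency map is something the team or lead agent would have to supply.

```python
from graphlib import TopologicalSorter


def merge_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return branches so that every branch follows the branches it depends on.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical example: the UI branch needs the API branch, which needs the schema.
branches = {
    "feature-ui": {"feature-api"},
    "feature-api": {"feature-schema"},
    "feature-schema": set(),
}
print(merge_order(branches))
```

A cycle in the map is itself a useful signal: it means the decomposition was not clean and the tasks should be re-split before any branch is merged.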

Section 12: AGENTS.md and CLAUDE.md as Coordination Contracts

Four independent sources converge on instruction files as the first layer of coordination infrastructure — establishing explicit rules for file ownership, prohibited zones, build commands, and code conventions before code is written.^[8]^[2]^[3]^[12]

Three-Layer CLAUDE.md Architecture

Layer	Path	Scope	Size Guideline	Shared Across Worktrees?
Global^[3]	`~/.claude/CLAUDE.md`	Personal preferences, developer identity	Under 50 lines	Yes (all projects)
Project^[3]	`./CLAUDE.md`	Team standards, file ownership boundaries	Under 200 lines	Yes (committed to git)
Task^[3]	`.claude/rules/`	Granular rules for specific file paths	—	Yes (git-tracked)

A critical property: all worktrees share the auto-memory directory, enabling cross-session learning without explicit synchronization.^[3]^[13]

Coordination Contract Contents (AGENTS.md Pattern)

recce.hq documents the AGENTS.md coordination contract as Gate 1 in their shipped architecture, with content organized around explicit boundaries:^[8]

Section 13: Practitioner Tooling Survey

Three dedicated tooling systems address multi-CLI coordination with distinct architectural philosophies: parallel-cc (filesystem + SQLite), Beads (issue tracker + atomic claims), and kli (event sourcing + CRDTs).

Tool Comparison: Capability Matrix

Capability	parallel-cc^[11]	Beads^[6]^[7]	kli^[9]
Auto worktree creation	Yes (zero config)	No (separate concern)	No
Session tracking	SQLite + heartbeat	SQLite (per-agent assignee)	Append-only JSONL
Dead-agent recovery	Heartbeat timeout + auto-reclaim	Configurable stale timeout	Session stops producing events; manual reclaim
Atomic work assignment	Yes (file claim + SQL)	Yes (UNIQUE constraint)	Indirect (event sourcing)
Conflict detection method	AST analysis + file claims	TOCTOU-safe SQL	CRDT merge
Git integration	Worktrees + Git Live mode (auto PR)	JSONL git sync	Plain files under version control
Cloud/sandbox support	E2B sandbox (1-hour sessions)	No	No
External dependencies	Node.js 20+, gtr, jq	Python, SQLite; Dolt (optional)	None (plain files)
Distribution model	Open source (GitHub)	Open source (GitHub)	Open source (described in blog)

parallel-cc: Shipped Feature Summary

Installation: ./scripts/install.sh --all, creates ~/.parallel-cc/ database directory. Requirements: Node.js 20+, gtr, jq, git.^[11] Key features beyond coordination:

Beads: Agent Context Window Management

Beads addresses a coordination problem not explicitly covered by worktree isolation: agent "dementia" — context window exhaustion causing agents to forget their task state every ~10 minutes.^[6] Fine-grained Beads issues keep agents early in their context windows, where decision quality is highest. This is a coordination problem because an agent that forgets its scope will re-explore and potentially overlap with other agents' work.^[6]

kli: No-Dependency Architecture

The kli system's defining characteristic is zero external dependencies: "plain files under version control."^[9] This makes it reproducible and auditable but limits query expressiveness compared to SQLite-backed systems. CRDTs ensure that any two sessions can merge their state without coordination, at the cost of additional per-operation overhead that is hard to justify for solo developers.^[9]
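The coordination-free merge property can be illustrated with one of the simplest CRDTs, a last-writer-wins map. This is a generic sketch and not kli's data model, which the source does not specify: the `(timestamp, session, value)` entry shape and the `merge_lww` function are assumptions.

```python
# Each key maps to (timestamp, session_id, value); the larger tuple wins,
# so ties on timestamp are broken deterministically by session id.
Entry = tuple[int, str, str]


def merge_lww(a: dict[str, Entry], b: dict[str, Entry]) -> dict[str, Entry]:
    """Merge two replicas; the result does not depend on argument order."""
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry > merged[key]:
            merged[key] = entry
    return merged


session_1 = {"ISSUE-1": (1, "s1", "open"), "ISSUE-2": (5, "s1", "done")}
session_2 = {"ISSUE-1": (3, "s2", "in_progress"), "ISSUE-3": (2, "s2", "open")}

# Either session can merge the other's state and both reach the same result.
print(merge_lww(session_1, session_2) == merge_lww(session_2, session_1))
```

The per-operation overhead is visible even here: every value carries a timestamp and a session id solely so that merges can be resolved without a coordinator.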