Contract-Based Autonomous Coordination

When contracts are explicit and acceptance criteria are measurable, agents can work with minimal oversight. Your job shifts from constant monitoring to clear communication upfront. You define the boundary once, with precision. Then agents work autonomously. When they're done, hooks notify you. You review. You integrate.

That's the path to creative independence.

By the end of this lesson, you'll understand how to write contracts so clear that 5-10 agents can work in parallel without asking for clarification. You'll configure hooks that let you track progress without constant checking. And you'll experience the trust that comes from boundaries so well-defined that you can step back and focus on strategy instead of supervision.

From Scaling Problems to Contract Solutions

In Lesson 4, you identified what breaks at scale:

❌ Vague integration points → merge conflicts during integration
❌ Implicit dependencies → agents block waiting for each other
❌ Ad-hoc communication → coordination overhead kills productivity

Contracts solve all of these:

✅ Explicit integration contracts → clean merges (agents know exact boundaries)
✅ Documented dependencies → autonomous work (no blocking)
✅ Specifications as communication → async coordination (no meetings)

This lesson teaches you HOW to write contracts that enable 5-7 agent coordination without synchronous meetings or constant oversight.

Why Contracts Enable Autonomy

Let's start with the honest truth: at Lesson 5 scale (5 agents), constant monitoring doesn't scale. At 7-9 agents, it's impossible.

The Monitoring Trap

Picture yourself in Lesson 3. You have agents building features in parallel. Here's what your workflow looked like:

Agent 1 is 30 minutes into implementation. Are they on track? Better check.
Agent 2 hit an unexpected design question. Are they blocked? Better check.
Agent 3 is finished. Ready to integrate? Better review their code.
Agent 4... what's Agent 4 doing? Better check that too.

At 4 agents, you're constantly jumping between contexts. At 10 agents, you're overwhelmed. At 15 agents, it's impossible.

The Contract Solution

Now imagine instead:

You write ONE integration contract upfront. It defines:

What Feature A provides: "User API with endpoints [list]"
What Feature A depends on: "Requires user database from Feature B"
Acceptance criteria: "A is complete when: endpoints respond in <100ms, all test cases pass, error handling covers 5 scenarios"

Then you let agents work. No checking in. No status meetings. No clarifying questions.

When Feature A is done, a hook fires: "Feature A complete. Acceptance criteria met. Ready for integration."

You review the contract. Did they meet it? Yes? Integrate. No? Return with specific feedback.

That's autonomous coordination. Contracts + Hooks = Trust Without Micromanagement.

Why This Works

The magic is in explicitness. When a contract is vague—"Build a good user API"—agents must guess your intent. They'll ask. You'll clarify. They'll redesign. You'll iterate.

But when a contract is precise—"User API with endpoints [specific list], responses under 100ms, 5 error scenarios handled"—there's no ambiguity. Agents implement to specification. Done.

Vague contracts = coordination overhead. Clear contracts = autonomous work.

Here's the data: Teams using explicit contracts report 60% reduction in sync meetings and 40% faster integration cycles. Why? Because the contract did the communicating, not you.

Writing Integration Contracts

Let's build your contract-writing skill. A good contract answers four questions:

What does this feature provide? (APIs, data models, services)
What does this feature depend on? (inputs, external systems, other features)
How will it integrate? (APIs, data flow, shared state)
How will we know it's done? (acceptance criteria)

Contract Structure: The Template

Here's the structure you'll use. Think of it as a legally binding agreement between agent and human orchestrator:

# Feature [Name] Integration Contract

## Provides

### APIs
- [Endpoint path]: [Description]
  - Input: [Schema]
  - Output: [Schema]
  - Error cases: [List]

### Data Models
- [Model name]: [Columns/fields]

### External Services
- [Service name]: [What it does]

## Depends On

### Required Features
- [Feature name] → [What we need from it]

### External Services
- [Service name] → [How we use it]

### Data Requirements
- [Database/service] → [Specific tables/endpoints]

## Integration Points

### How We Connect
- [Feature A] calls [Feature B's endpoint] to [accomplish task]
- [Feature A] reads from [Feature B's data model]
- Conflict resolution: [If contention, do this]

### Performance Targets
- [Operation] must complete in [X]ms
- Concurrency limit: [N] simultaneous requests
- Data consistency model: [eventual/strong]

## Acceptance Criteria

- [ ] All APIs respond in &lt; [X]ms under load
- [ ] All [N] test cases pass
- [ ] Error handling covers [list of scenarios]
- [ ] Integration test with Feature [X] succeeds
- [ ] Code review: [specific checklist items]
- [ ] Documentation: [README updated with examples]

This template is concrete. It's measurable. An agent can't wiggle out of it with interpretation—the contract is the specification.

Defining Acceptance Criteria

The difference between vague and autonomous is acceptance criteria. Vague criteria require human judgment. Measurable criteria enable autonomous verification.

The Vagueness Problem

Bad acceptance criteria:

"The API should be fast"
"Error handling should be robust"
"The code should be clean"

These require judgment. "Fast" to you might be "slow" to an agent. "Robust" is subjective. "Clean" is in the eye of the reviewer. Vague criteria = coordination overhead.

The Measurable Solution

Good acceptance criteria:

"All endpoints respond in <100ms at p95 latency"
"Error handling covers: invalid input, database failures, timeout scenarios [5 specific cases]"
"Code passes linter, has >80% test coverage, includes docstrings on all public functions"

These are testable. An agent can verify them. A human can verify them. No ambiguity.

Code Example 2: Acceptance Criteria Table

Here's a concrete example. For the Product Catalog feature, here are the acceptance criteria:

Category	Criterion	How to Verify	Pass/Fail
Functionality	All 3 endpoints implemented	curl each endpoint	[ ]
Functionality	Pagination works: 1 ≤ page ≤ max	Test with page=1, page=100, page=999	[ ]
Functionality	Stock decrement prevents overselling	Concurrent PATCH requests, verify no negative stock	[ ]
Performance	GET requests < 100ms (p95)	Load test: 100 req/s for 1 min, measure p95	[ ]
Performance	PATCH requests < 200ms	Load test: 50 req/s for 1 min, measure p95	[ ]
Error Handling	Invalid product ID returns 404	GET /api/products/invalid-uuid → 404	[ ]
Error Handling	Malformed request returns 400	GET /api/products?page=abc → 400	[ ]
Error Handling	Stock insufficient returns 400	PATCH with delta > current stock → 400	[ ]
Integration	Feature 3 can read products	Feature 3 calls `GET /products/{id}`, verifies response	[ ]
Integration	Feature 3 can decrement stock	Feature 3 calls PATCH, verifies stock updates	[ ]
Code Quality	No SQL injection vulnerabilities	Code review + manual injection test	[ ]
Code Quality	All public functions documented	Check docstrings in code	[ ]
Code Quality	Test coverage >= 80%	Run test suite, check coverage report	[ ]

This table is the contract's teeth. When all boxes are checked, Feature 2 is done. No further discussion.

Agents build to this. You verify against this. Integration happens. Clean.

Configuring Completion Hooks

With specifications and contracts in place, you still need one more piece: How do you know when an agent is done?

In Lesson 5, you monitored constantly. But at scale, that's unsustainable. Instead, you use hooks: scripts that fire automatically when certain events occur, notifying the orchestrator that work is complete.

Hook Basics: What Triggers, What Fires

A hook has two parts:

Trigger: When something happens (e.g., agent finishes implementation, tests pass)
Action: What to do about it (e.g., write to a status file, send notification)

Claude Code supports these hook triggers:

on-session-complete: Agent session ends (implementation finished or error occurred)
after-tool-use: After an agent uses a specific tool (e.g., after test command completes)
on-timeout: After X minutes of inactivity

For orchestration, we care about: Did the agent finish? Did they meet acceptance criteria? Are they ready for integration?

Code Example 3: Completion Hook Script

Here's a hook script that fires when Feature 2 finishes implementation:

#!/bin/bash
# File: .claude/hooks/notify-orchestrator.sh
# Purpose: Notify orchestrator when a feature is complete
# Trigger: on-session-complete (fires when agent session ends)

set -e

FEATURE_ID=$1  # e.g., "feature-002-catalog"
SESSION_ID=$2  # Unique session identifier
STATUS_DIR="./.orchestrator-status"

# Create status directory if it doesn't exist
mkdir -p "$STATUS_DIR"

# Cross-platform timestamp (works on macOS, Linux, Windows Git Bash)
if date --version >/dev/null 2>&1; then
  # GNU date (Linux)
  TIMESTAMP=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
else
  # BSD date (macOS)
  TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
fi

# Check acceptance criteria
if [ -f "$FEATURE_ID/ACCEPTANCE_CRITERIA.checked" ]; then
  CRITERIA_MET="true"
else
  CRITERIA_MET="false"
fi

# Write completion record (safer than heredoc with variable substitution)
cat > "$STATUS_DIR/$FEATURE_ID.json" <<EOF
{
  "feature_id": "$FEATURE_ID",
  "session_id": "$SESSION_ID",
  "completed_at": "$TIMESTAMP",
  "status": "complete",
  "acceptance_criteria_verified": $CRITERIA_MET
}
EOF

echo "✓ Feature $FEATURE_ID completion notified at $TIMESTAMP"
echo "  Session ID: $SESSION_ID"

# Use jq if available, otherwise parse manually
if command -v jq >/dev/null 2>&1; then
  echo "  Status: $(jq -r '.status' "$STATUS_DIR/$FEATURE_ID.json")"
else
  echo "  Status: complete"
fi

# Write to shared orchestrator log with file locking (prevents race conditions)
# Use flock on Linux, or >> without locking on macOS (acceptable for low contention)
LOG_FILE=".orchestrator-status.log"
LOG_ENTRY="$TIMESTAMP: $FEATURE_ID complete (session: $SESSION_ID)"

if command -v flock >/dev/null 2>&1; then
  # Linux: use flock for safe concurrent writes
  (
    flock -x 200
    echo "$LOG_ENTRY" >> "$LOG_FILE"
  ) 200>"$LOG_FILE.lock"
else
  # macOS/BSD: simple append (low risk of corruption with short writes)
  echo "$LOG_ENTRY" >> "$LOG_FILE"
fi

This hook is simple but powerful. When an agent finishes, the hook:

Creates a JSON record with the feature ID and session ID
Records the completion timestamp
Appends to a shared orchestrator log that all sessions can read

Now the orchestrator (human or automated) can poll a single file to see what's done, what's pending, what failed.

Code Example 4: Orchestrator Polling Script

But hooks only notify when something happens. What if you want to check progress? Here's the orchestrator-side script that polls the status:

#!/bin/bash
# File: scripts/orchestrator-poll-status.sh
# Purpose: Check which features are complete, what's pending, what failed
# Usage: ./scripts/orchestrator-poll-status.sh

STATUS_DIR=".orchestrator-status"
STATUS_LOG=".orchestrator-status.log"

echo "=== ORCHESTRATOR STATUS REPORT ==="
echo "Checked at: $(date)"
echo ""

# Count completion records
COMPLETE_COUNT=$(ls -1 "$STATUS_DIR"/*.json 2>/dev/null | wc -l)
echo "Features Complete: $COMPLETE_COUNT"
echo ""

# Check if jq is available
HAS_JQ=false
if command -v jq >/dev/null 2>&1; then
  HAS_JQ=true
fi

# List each feature's status
echo "=== FEATURE STATUS ==="
for status_file in "$STATUS_DIR"/*.json; do
  if [ -f "$status_file" ]; then
    if [ "$HAS_JQ" = true ]; then
      # Use jq for clean parsing
      feature=$(jq -r '.feature_id' "$status_file")
      completed=$(jq -r '.completed_at' "$status_file")
      session=$(jq -r '.session_id' "$status_file")
      criteria=$(jq -r '.acceptance_criteria_verified' "$status_file")
    else
      # Fallback: use grep and sed (works without jq)
      feature=$(grep -o '"feature_id"[[:space:]]*:[[:space:]]*"[^"]*"' "$status_file" | cut -d'"' -f4)
      completed=$(grep -o '"completed_at"[[:space:]]*:[[:space:]]*"[^"]*"' "$status_file" | cut -d'"' -f4)
      session=$(grep -o '"session_id"[[:space:]]*:[[:space:]]*"[^"]*"' "$status_file" | cut -d'"' -f4)
      criteria=$(grep -o '"acceptance_criteria_verified"[[:space:]]*:[[:space:]]*[a-z]*' "$status_file" | awk '{print $NF}')
    fi

    echo "✓ $feature"
    echo "  Completed: $completed"
    echo "  Session: $session"
    echo "  Criteria Met: $criteria"
    echo ""
  fi
done

# Show recent log entries (last 10)
echo "=== RECENT ACTIVITY ==="
tail -10 "$STATUS_LOG" 2>/dev/null || echo "No activity log yet"

echo ""
echo "=== ALL_COMPLETE CONDITION ==="
# Check if all expected features are complete
EXPECTED_FEATURES=("feature-001-auth" "feature-002-catalog" "feature-003-cart" "feature-004-payments" "feature-005-orders")
MISSING=()

for expected in "${EXPECTED_FEATURES[@]}"; do
  if [ ! -f "$STATUS_DIR/$expected.json" ]; then
    MISSING+=("$expected")
  fi
done

if [ ${#MISSING[@]} -eq 0 ]; then
  echo "✓ ALL_COMPLETE: All features have completed and reported status"
  echo "✓ Ready for integration phase"
else
  echo "⏳ PENDING: Waiting on $(printf '%s, ' "${MISSING[@]}" | sed 's/, $//')"
fi

Run this script every minute. It tells you:

Which features are done
Which are still working
When you're ready for integration

No manual checking. No status meetings. Just one command.

Platform Compatibility Note: Both scripts include cross-platform compatibility:

Timestamp generation works on Linux (GNU date) and macOS (BSD date)

File locking uses flock on Linux, simple append on macOS (safe for low contention)

JSON parsing checks for jq availability and provides grep/sed fallback

Scripts work on Windows Git Bash without modification

Exercises: Build Your Contract Intuition

You've learned the concepts. Now let's practice. Do these exercises in order. Each one builds your contract-writing skill.

Exercise 1: Write a 3-Feature Contract

You're given a 3-feature system:

System: Task Management App

Feature A: User authentication (login, register, password reset)
Feature B: Task CRUD (create, read, update, delete tasks; assign to users)
Feature C: Task notifications (email alerts when task assigned or due)

Your task: Write contract.md defining the integration contracts for Feature B (Task CRUD).

What to include:

Provides: List all endpoints (at least 4: create, read, update, delete)
Depends on: List dependencies on Feature A (authentication), data models
Integration points: How Feature B connects to Feature C
Acceptance criteria: 8-10 specific, testable criteria

Success: Your contract is unambiguous. An agent could implement it without asking you a single question.

Hint: Think like a lawyer writing a contract. Leave no room for interpretation.

Exercise 2: Define Acceptance Criteria

Using your Feature B contract from Exercise 1, write a detailed acceptance criteria table.

Format:

Category	Criterion	How to Verify	Pass/Fail

What to include:

Functionality criteria (all 4 endpoints work)
Performance criteria (response times)
Error handling criteria (5+ specific error scenarios)
Integration criteria (Feature B works with Feature A and C)
Code quality criteria (testing, documentation)

Success: A QA engineer could use your table to verify the feature without asking you what "complete" means.

Exercise 3: Configure Hooks in a Test Worktree

In a test git worktree, configure completion hooks that fire when an "implementation" finishes.

Steps:

Create directory .claude/hooks/
Create hook script notify-orchestrator.sh (based on Code Example 3)
Create status directory .orchestrator-status/
Create orchestrator polling script scripts/orchestrator-poll-status.sh
Simulate 3 agents finishing: Write 3 JSON files to .orchestrator-status/
Run polling script, verify it reports all 3 complete

Success: Polling script shows all 3 features complete and ready for integration.

Exercise 4: Simulate Parallel Coordination

This is the capstone exercise. You're coordinating 3 agents building Features A, B, C.

Setup:

Create 3 git worktrees: worktree-feature-a, worktree-feature-b, worktree-feature-c
In each worktree, create a placeholder contract.md file
Configure hooks in each worktree

Execution:

In each worktree, run the hook script manually (simulating feature completion)
Use the orchestrator polling script to track progress
Verify that all 3 features show as complete

Success: You see all 3 features complete in the status report, and you haven't checked any of them individually.

The insight: At 3 agents, this feels overkill. But at 15 agents, this is essential. You're building the muscle now.

Try With AI

You've learned to write contracts and configure hooks. Now let's refine your intuition with AI as your thinking partner.

Tool: Claude Code CLI (or your preferred AI companion if you've set one up)

Setup: Open 3 parallel Claude Code sessions (in 3 terminal windows or tmux splits). In each, you'll ask different questions about contracts.

Prompt 1: Contract Quality Check

In Session 1, paste your Feature B contract from Exercise 1 and ask:

Review my integration contract for the Task CRUD feature.

Looking for:
1. Are integration points clear and unambiguous?
2. Are acceptance criteria testable (not subjective)?
3. Did I miss any integration points with Feature A or C?
4. What would an agent find confusing or unclear?

Here's my contract:
[PASTE YOUR CONTRACT HERE]

Expected outcome: AI identifies 2-4 specific gaps. You refine your contract. This is how you develop contract intuition.

Prompt 2: Scaling Hooks

In Session 2, ask about scaling to 10 features:

I currently use hooks and status files to track 3 feature completions.

How would I configure this system to track 10 parallel feature sessions?

Specifically:
1. Would status files scale, or do I need a different mechanism?
2. How would I handle feature dependencies (Feature B can't integrate until Feature A is complete)?
3. What would break if I tried to scale to 15 features?

Here's my current orchestrator polling script:
[PASTE YOUR SCRIPT]

Expected outcome: AI suggests improvements (e.g., database instead of files, dependency resolution logic). You see the path to larger scale.

Prompt 3: Autonomy Through Criteria

In Session 3, ask about the philosophy:

I'm trying to understand why acceptance criteria matter for autonomous work.

Here are two versions of acceptance criteria for the same feature:

VERSION A (vague):
- The API should be fast
- Error handling should be robust
- Code should be clean

VERSION B (specific):
- All endpoints respond in &lt;100ms (p95)
- Error handling covers 5 scenarios: invalid input, timeout, rate limit, database error, malformed JSON
- Code passes linter, >80% test coverage, docstrings on all public functions

Explain: Why does VERSION B enable autonomous work while VERSION A requires constant oversight?

Expected outcome: AI articulates the principle clearly: Vague criteria require judgment; specific criteria enable verification. This insight unlocks your understanding of why contracts matter.

Real-World Context

Let's ground this in how real organizations use contracts for distributed coordination.

How Anthropic Uses Contracts for Distributed Teams

At Anthropic, distributed teams building Claude use contracts extensively. When one team builds a feature that another team depends on, they don't rely on meetings. Instead:

Team A writes a contract: "We'll provide endpoint X with response schema Y"
Team B writes acceptance criteria: "Endpoint X must respond in <50ms, handle 1000 req/sec"
Both teams agree on the contract (30-min async discussion in GitHub)
Team A implements
When done, Team A pings Team B (no meetings, just a Slack notification)
Team B integrates and verifies against the contract
Done

This happens dozens of times per day across Anthropic. No coordination meetings. No synchronous overhead. Just contracts.

The result: Large distributed teams ship autonomously. The contract is the communication.

How Vercel Coordinates 50+ Engineers

Vercel's Next.js team operates similarly. When you have 50 engineers working on a framework, synchronous coordination is impossible. So they use:

Contracts: Each feature (routing, rendering, data fetching) has an explicit contract with other features
Acceptance criteria: Measurable, testable conditions for "done"
Hooks: CI/CD system automatically notifies when features are complete
Async integration: Features merge when they meet contracts, no integration meetings

The result: 50 engineers ship features in parallel, integrating hundreds of times per week, with minimal sync meetings.

How Solo Founders Use Contracts to Coordinate AI Agents

You (the solo founder) don't have a team. But you're about to coordinate 5-10 AI agents building features. Same principle applies.

You write a contract: "Build a cart feature that calls the product API I defined." AI agents read the contract and implement. Hooks notify you when done. You integrate.

The difference: With clear contracts, you can coordinate 10 AI agents as efficiently as Vercel coordinates 50 humans. Decomposition + Contracts = Unlimited Scale.

🤔 Pause and Reflect: You've learned that contracts reduce coordination overhead. Ask yourself: If contracts cut coordination time in half, what would you do with that freed time? Strategy? Innovation? Delegation? What becomes possible when you're not constantly checking on agents?

From Scaling Problems to Contract Solutions​

Why Contracts Enable Autonomy​

The Monitoring Trap​

The Contract Solution​

Why This Works​

Writing Integration Contracts​

Contract Structure: The Template​

Defining Acceptance Criteria​

The Vagueness Problem​

The Measurable Solution​

Code Example 2: Acceptance Criteria Table​

Configuring Completion Hooks​

Hook Basics: What Triggers, What Fires​

Code Example 3: Completion Hook Script​

Code Example 4: Orchestrator Polling Script​

Exercises: Build Your Contract Intuition​

Exercise 1: Write a 3-Feature Contract​

Exercise 2: Define Acceptance Criteria​

Exercise 3: Configure Hooks in a Test Worktree​

Exercise 4: Simulate Parallel Coordination​

Try With AI​

Prompt 1: Contract Quality Check​

Prompt 2: Scaling Hooks​

Prompt 3: Autonomy Through Criteria​

Real-World Context​

How Anthropic Uses Contracts for Distributed Teams​

How Vercel Coordinates 50+ Engineers​

How Solo Founders Use Contracts to Coordinate AI Agents​