Tasks Phase - Atomic Work Units and Checkpoints

You now have:

  • ✅ A clear specification
  • ✅ A detailed implementation plan
  • ✅ Documented architecture decisions (ADRs)

Next: Break the plan into atomic work units (tasks) that you'll implement.

This lesson teaches the checkpoint pattern, the critical workflow practice that keeps YOU in control. The pattern is:

Agent: "Here's Phase 1 code"
You: "Review... looks good!"
You: "Commit to git"
You: "Tell me what's next"
Agent: "Phase 2"

NOT:

Agent: "Here's everything" (no human control)

The difference is huge. Checkpoints keep you in control and catch issues early.


What Are Tasks?

A task is a unit of work that:

  • Takes 1-2 hours to complete
  • Has a single, clear acceptance criterion
  • Declares its dependencies on other tasks explicitly
  • Can be reviewed and approved individually

Task Properties

Size: 1-2 hours

  • Too small (a few minutes) = too many micro-tasks to track and review
  • Too large (8+ hours) = hard to review, hard to fix if wrong

Criterion: Single, testable

  • "Write add operation" ✅
  • "Write add operation and all tests" ❌ (two things)
  • "Write something" ❌ (untestable)

Independence: Can be reviewed individually

  • Needs no other tasks completed first, or
  • Clearly declares which specific tasks it depends on
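
For example, a well-formed task entry in tasks.md might look like this (a sketch; the exact format your /sp.tasks run generates may differ, and the file name tests/test_add.py is illustrative):

Task 2: Implement add()
  • Size: ~30 minutes
  • Depends on: Task 1 (RED test for add)
  • Acceptance criterion: all tests in tests/test_add.py pass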

💬 AI Colearning Prompt

"Why are tasks sized at 1-2 hours instead of larger chunks like 'full day' or smaller chunks like '15 minutes'? What's the advantage of this granularity for checkpoint-driven development?"


The Checkpoint Pattern (CRITICAL)

This is the most important concept in this lesson. The checkpoint pattern is how you maintain control of the workflow.

Pattern Definition

Loop:
1. Agent: "I've completed Phase X"
2. Human: "Review the work"
3. Human: "APPROVE" → Commit to git
4. Human: "Tell me next step"

Why Checkpoints Matter

Without Checkpoints (dangerous):

You: "Build my calculator"
Agent: "Done! 5000 lines of code, 47 files. All automated. You're welcome."
You: "Uh... wait, I need to review this..."
Agent: "Too late, already committed and deployed!"

With Checkpoints (controlled):

You: "Start implementation"
Agent: "Phase 1 (Core Operations) complete. 200 lines, ready for review."
You: "Read code... looks good. Commits. What's next?"
Agent: "Phase 2 (Tests) starting"
You: "Review tests... found a bug in edge case handling"
You: "Tell agent, agent fixes, re-reviews, commits"
Agent: "Phase 3..."

Your Role in Each Checkpoint

Step 1: Human Reviews

  • Read the generated code/tests
  • Ask: "Does this match the spec?"
  • Ask: "Are there bugs or edge cases missed?"
  • Ask: "Is the code understandable?"

Step 2: Human Decides

  • Approve ("Looks good, commit")
  • Reject ("Fix this issue")
  • Request clarification ("Explain this code")

Step 3: Human Directs

  • "What's next?"
  • You initiate next phase
  • Agent doesn't autonomously continue
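
Concretely, one checkpoint cycle might look like this at the command line (a sketch assuming a Python project tested with pytest; adapt the commands to your stack):

git diff            # Step 1: read exactly what the agent changed
pytest              # confirm the tests pass before approving
git add -A          # Step 2: approve by staging the work
git commit -m "Phase 1: core operations (reviewed and approved)"
# Step 3: direct the agent yourself: "What's next?"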

Generating Your Tasks

Step 1: Run /sp.tasks

In Claude Code, from your calculator-project directory:

/sp.tasks

My calculator specification is at specs/calculator/spec.md
My implementation plan is at specs/calculator/plan.md

Please decompose the plan into atomic work units (tasks), each ≤ 2 hours,
testable, reversible, and with clear dependencies.

Use a TDD approach: for each operation (add, subtract, etc.),
1️⃣ Write RED tests → 2️⃣ Implement → 3️⃣ Refactor.
Pause after each group for human review before committing.

Also:
- Use Context7 MCP server for documentation lookups.
- Prefer CLI automation where possible.
- Ensure easy rollback and traceability.

Step 2: Review Generated Tasks

The tasks.md should show:

  • Task 1: [Description] - 1-2 hours - Depends on: Nothing
  • Task 2: [Description] - 1.5 hours - Depends on: Task 1
  • Task 3: [Description] - 2 hours - Depends on: Task 1, Task 2
  • ...

Understanding Your Task Breakdown (15 minutes)

Review your tasks and verify:

Dependency Graph

Here's how your calculator tasks depend on each other:

TDD Workflow: 🔴 RED (test) → 🟢 GREEN (implement) → 🔵 REFACTOR/DOCS

Task 1:  🔴 Write RED test: add()
    ↓
Task 2:  🟢 Implement add()
    ↓
Task 3:  🔴 Write RED test: subtract()
    ↓
Task 4:  🟢 Implement subtract()
    ↓
Task 5:  🔴 Write RED test: multiply()
    ↓
Task 6:  🟢 Implement multiply()
    ↓
Task 7:  🔴 Write RED test: divide() + error cases
    ↓
Task 8:  🟢 Implement divide() + error handling
    ↓
Task 9:  🔴 Write RED test: power() + edge cases
    ↓
Task 10: 🟢 Implement power() + edge case handling
    ↓
Task 11: 🔵 Write documentation + finalize

Pattern: Each operation follows RED → GREEN cycle
Tests MUST exist before implementation

Legend:

  • 🔴 Red tasks = Write failing tests first (TDD)
  • 🟢 Green tasks = Implement code to make tests pass
  • 🔵 Blue tasks = Documentation and polish

Key Insight: Tests MUST exist before implementation. You cannot implement Task 2 (add function) without Task 1 (add tests) being complete. This is the TDD (Test-Driven Development) pattern.
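
To make the RED step concrete, here is a minimal sketch of what Task 1 might produce, assuming a Python project using pytest (the file and module names, tests/test_add.py and calculator, are illustrative):

# tests/test_add.py - Task 1: RED tests, written BEFORE add() exists
import pytest

from calculator import add  # this import fails until Task 2 implements add()

def test_add_positive_numbers():
    assert add(5, 3) == 8.0

def test_add_negative_numbers():
    assert add(-2, 5) == 3.0

def test_add_rejects_non_numeric_input():
    with pytest.raises(TypeError):
        add("5", 3)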

Lineage Traceability

Pick one task. Can you trace it back?

Specification: "Calculator must add two numbers"

Plan: "Phase 1: Core Operations - Implement basic arithmetic"

Task 1.1: "Implement add(a, b) returning float, handling negative inputs"

Acceptance Criterion: "add(5, 3) = 8.0, add(-2, 5) = 3.0, add('5', 3) raises TypeError"

If you can trace this lineage, your tasks are well-connected to your specification.
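
A minimal implementation satisfying that acceptance criterion might look like this (a sketch in Python; in practice the agent generates the code and you review it at the checkpoint):

# calculator.py - Task 2 (GREEN): make the Task 1 tests pass
def add(a, b):
    """Add two numbers and return a float; non-numeric input raises TypeError."""
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("add() requires numeric arguments")
    return float(a + b)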

🎓 Expert Insight

In AI-native development, the checkpoint pattern transforms risk management. Without checkpoints, AI generates 5000 lines of code and you review it all at once (high risk: bugs are expensive to fix). With checkpoints, AI generates 200 lines, you review immediately, catch issues early (low risk: bugs are cheap to fix). Professional teams NEVER skip checkpoints—the cost of catching bugs late (in production) is 100x the cost of catching them at checkpoint review.

🤝 Practice Exercise

Ask your AI: "I've generated tasks for my calculator implementation. Can you review specs/calculator/tasks.md and tell me: (1) Are task sizes appropriate (1-2 hours, single testable criterion)? (2) Is the dependency graph correct (tests before implementation, TDD pattern followed)? (3) Can I trace Task 2 (implement add) back through the plan to the specification? (4) Are there tasks that are too large or too small? Then suggest improvements to the task breakdown."

Expected Outcome: Your AI should validate task granularity (e.g., "Implement all operations" = too large → split into per-operation tasks), confirm TDD pattern (tests exist before implementation), verify lineage traceability (specification → plan → task), and suggest optimal sizing for checkpoint-driven development.


Commit Your Tasks

Commit the generated tasks to git:

/sp.git_commit_pr commit the current work in the same branch
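
If you prefer plain git over the slash command (assuming the standard git CLI), the equivalent is roughly:

git add specs/calculator/tasks.md
git commit -m "Add task breakdown for calculator (atomic, TDD-ordered)"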

Common Mistakes

Mistake 1: Tasks Too Large (8+ Hours)

The Error: "Task: Implement entire calculator (8-16 hours)"

Why It's Wrong: Large tasks hide complexity, delay feedback, and make checkpoints meaningless.

The Fix: Break into atomic units (1-2 hours each):

  • ❌ Large: "Implement all operations"
  • ✅ Atomic: "Implement add()" (30 min), "Implement multiply()" (30 min), "Implement divide() with error handling" (1 hour)

Mistake 2: Ignoring Dependencies

The Error: Planning to implement functions before writing their tests

Why It's Wrong: Tasks have natural dependencies. In this TDD workflow, implementation tasks depend on their RED tests existing first.

The Fix: Map dependencies explicitly:

  • Task 1: Write RED test for add() → Task 2: Implement add() (depends on Task 1)
  • Task 3: Write RED test for divide() → Task 4: Implement divide() (depends on Task 3)

Try With AI

Ready to validate your task breakdown and prepare for implementation? Test your tasks:

🔍 Explore Task Atomicity:

"Review my task breakdown at specs/calculator/tasks.md. For each task, evaluate: (1) Is it atomic (does ONE thing with ONE acceptance criterion)? (2) Is it sized right (1-2 hours, not days or minutes)? (3) Can it be tested independently? Identify any tasks that are too large (need splitting) or too small (should be combined)."

🎯 Practice Dependency Analysis:

"Analyze the dependencies in my task list. Are they correct and logical? What's the critical path (minimum sequence to reach 'done')? Which tasks could run in parallel? If I had 3 developers, how would you distribute these tasks? Draw me a dependency graph showing which tasks block others."

🧪 Test Checkpoint Readiness:

"I'm about to implement Task 1: [describe your first task]. Walk me through the checkpoint pattern: (1) What should AI generate? (2) What should I review for (not just 'does it work')? (3) What makes a good commit message? (4) How do I know I'm ready for the next task? Give me a checklist for each checkpoint phase."

🚀 Apply to Your Project:

"I need to break down [describe your project] into atomic tasks. Help me apply the task decomposition principles: Show me how to decompose ONE complex feature (like 'user authentication') into 5-8 atomic tasks with clear dependencies and acceptance criteria. Explain your reasoning for each split."