CCA-F Study Day 5/20: Task Decomposition & Exam Scenarios

Domain 1: Agentic Architecture & Orchestration (~25-27% of exam)

📌 Today's Focus

Today is the capstone day for Domain 1 — the largest-weighted domain on the entire exam. You're taking everything from Days 1-4 (agentic loop, effort levels, orchestration patterns, hooks) and synthesizing it into how to design complete agent systems for real scenarios. The exam tests this through two main scenarios: Customer Support Resolution Agent and Multi-Agent Research System. You need to know the correct architectural decisions for each — and more importantly, you need to recognize the plausible-but-wrong answers the exam presents.

📚 Core Concepts

1. Task Decomposition — Breaking Problems into Agent-Sized Work

Task decomposition is the art of splitting complex objectives into units that a single agent (or agent turn) can handle within its context window, tool set, and authority boundary.

The Three Dimensions of Decomposition:

Complexity scope: Can a single agent handle this in one session? If not, split into subagents.
Authority boundary: Does this require different permission levels? Separate by privilege.
Domain expertise: Does this require different tool sets? Route to specialist agents.

Key principles:

Each subtask should be completable within a single agent's context window
Subtasks should have clear success/failure criteria (not vague "do your best")
Interfaces between subtasks pass artifacts, not raw conversation history
Error boundaries prevent one subtask's failure from cascading to others
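The principles above can be sketched as a small data structure. This is only an illustration under stated assumptions: `Artifact`, `Subtask`, and `is_done` are hypothetical names chosen for this example, not part of any SDK or of the exam material.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    """Structured hand-off between subtasks (not raw conversation history)."""
    producer: str
    payload: dict


@dataclass
class Subtask:
    """One agent-sized unit of work with an explicit success check."""
    name: str
    instructions: str
    required_tools: list[str]
    # Explicit, checkable success criterion instead of "do your best".
    is_done: Callable[[Artifact], bool]
    inputs: list[Artifact] = field(default_factory=list)


# Hypothetical example: a literature review subtask that succeeds only
# if it returns at least three summarized papers.
literature = Subtask(
    name="literature_review",
    instructions="Summarize peer-reviewed papers on the assigned topic.",
    required_tools=["search_papers", "read_pdf"],
    is_done=lambda a: len(a.payload.get("papers", [])) >= 3,
)

result = Artifact(producer="literature_review", payload={"papers": ["p1", "p2", "p3"]})
print(literature.is_done(result))  # True
```

Keeping the success check on the subtask itself means the coordinator can decide pass or fail without reading the subagent's transcript.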

2. Context Isolation — What Each Agent Sees vs. What It Doesn't

This is a critical exam concept. Each agent operates within its own context window. When you spawn a subagent, it gets:

✅ Its own system prompt and tools
✅ Specific context you pass to it (task description, relevant data)
❌ NOT the full conversation history of the parent
❌ NOT other subagents' outputs (unless explicitly passed)

Why this matters: Context isolation prevents reasoning contamination. A verifier agent that can see the generator's reasoning will be biased toward confirming it. Separate sessions = independent evaluation.

    # Context isolation in a hub-and-spoke pattern
    coordinator_context = {
        "task": "Research quantum computing applications in drug discovery",
        "subtasks": ["literature_review", "patent_analysis", "expert_interviews"]
    }

    # Each subagent gets ONLY its specific task + minimal context
    literature_agent_context = {
        "task": "Find peer-reviewed papers on quantum computing in molecular simulation",
        "constraints": "Published 2022-2026, focus on protein folding applications",
        "output_format": "structured_summary"
    }

    # The literature agent does NOT see patent_analysis or expert_interviews tasks
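One way to make this isolation hard to violate by accident is to build every subagent context from an explicit allow-list. The sketch below assumes a hypothetical helper, `build_subagent_context`, and a made-up `parent_state` dictionary; neither comes from a real SDK.

```python
def build_subagent_context(parent_state: dict, subtask: str, allowed_keys: set[str]) -> dict:
    """Copy only allow-listed fields from the parent into a fresh context."""
    context = {"task": subtask}
    for key in allowed_keys:
        if key in parent_state:
            context[key] = parent_state[key]
    return context


parent_state = {
    "conversation_history": ["...many turns..."],
    "generator_reasoning": "I chose option B because ...",
    "draft_answer": "Option B",
    "constraints": "Published 2022-2026",
}

# The verifier receives the draft and the constraints, but not the
# generator's reasoning or the parent's conversation history.
verifier_context = build_subagent_context(
    parent_state,
    subtask="Independently check the draft answer against the constraints",
    allowed_keys={"draft_answer", "constraints"},
)
print(sorted(verifier_context))  # ['constraints', 'draft_answer', 'task']
```

An allow-list fails closed: a new field added to the parent state later is not shared unless someone deliberately lists it.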

3. Error Boundaries — How Failures Propagate (or Don't)

In multi-agent systems, you need to decide: when one agent fails, what happens to the others?

Three error propagation strategies:

Strategy	Behavior	Use When
Isolated	Failure stays within the failing agent	Independent tasks (fan-out research)
Circuit Breaker	After N failures, stop the entire pipeline	Sequential pipelines where downstream depends on upstream
Graceful Degradation	Continue with partial results, flag gaps	Research/analysis where some data is better than none

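    import logging

    logger = logging.getLogger(__name__)


    class SystemHaltError(Exception):
        """Raised when the circuit breaker stops the whole pipeline."""


    class ErrorBoundary:
        def __init__(self, strategy="isolated", max_failures=3):
            self.strategy = strategy
            self.failure_count = 0
            self.max_failures = max_failures

        def handle_failure(self, agent_id, error):
            self.failure_count += 1

            if self.strategy == "circuit_breaker" and self.failure_count >= self.max_failures:
                raise SystemHaltError(
                    f"Circuit breaker tripped after {self.max_failures} failures"
                )

            if self.strategy == "graceful_degradation":
                # Assumes agent errors carry a `category`; fall back to the class name.
                category = getattr(error, "category", type(error).__name__)
                return {"partial": True, "agent_id": agent_id, "error": category}

            # "isolated" (or a circuit breaker below its threshold):
            # log and continue, other agents unaffected
            logger.warning("Agent %s failed: %s", agent_id, error)
            return None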

4. Information Provenance Tracking

When multiple agents contribute to a final output, you need to track where each piece of information came from. This is tested in the Multi-Agent Research scenario.

    # Each subagent result includes provenance metadata
    research_result = {
        "findings": [...],
        "sources": [
            {"agent": "literature_agent", "source_type": "peer_reviewed", "confidence": "high"},
            {"agent": "web_agent", "source_type": "news_article", "confidence": "medium"}
        ],
        "timestamp": "2026-05-20T10:30:00Z",
        "session_id": "sess_abc123"
    }
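A fan-in step that preserves provenance might look roughly like the following. This is a sketch under assumptions: the function name `aggregate_with_provenance`, the per-agent result shape, and the `missing_agents` field are all illustrative choices, not a prescribed format.

```python
def aggregate_with_provenance(results: dict) -> dict:
    """Merge per-agent results, tagging every finding with its origin.

    `results` maps agent name -> result dict, or None if that agent failed.
    """
    findings, gaps = [], []
    for agent, result in results.items():
        if result is None:
            gaps.append(agent)  # record the gap instead of hiding it
            continue
        for item in result["findings"]:
            findings.append({
                "claim": item,
                "agent": agent,
                "source_type": result["source_type"],
            })
    return {"findings": findings, "missing_agents": gaps}


report = aggregate_with_provenance({
    "literature_agent": {"findings": ["claim A"], "source_type": "peer_reviewed"},
    "web_agent": {"findings": ["claim B", "claim C"], "source_type": "news_article"},
    "patent_agent": None,  # simulated failure
})
print(len(report["findings"]), report["missing_agents"])  # 3 ['patent_agent']
```

Because each finding carries its own agent and source type, the final report can cite or down-weight individual claims rather than treating the merged output as a single undifferentiated block.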

🎭 Exam Scenarios — How to Approach Them

Scenario 1: Customer Support Resolution Agent

This scenario tests your ability to design an agent that handles customer tickets with:

Tool use (CRM lookup, order history, knowledge base search)
Escalation logic (WHEN to hand off to a human)
Hooks (compliance enforcement, PII filtering)
MCP integration (connecting to external systems)

Correct Escalation Design:

    # Escalation is based on STRUCTURED CRITERIA, never sentiment.
    # EscalationResult and route_to_team are assumed to be defined elsewhere.
    escalation_triggers = {
        "policy_gap": lambda ctx: ctx.knowledge_base_result.get("coverage") == "none",
        "multi_system_failure": lambda ctx: ctx.failed_tool_count >= 2,
        "complexity_threshold": lambda ctx: ctx.required_tools > 5,
        "authority_boundary": lambda ctx: ctx.requested_action in ["refund_over_500", "account_deletion"],
    }


    def should_escalate(context):
        """Programmatic escalation check, called via hook rather than prompt instruction."""
        # The first trigger that fires wins; any single criterion is enough.
        for trigger_name, check in escalation_triggers.items():
            if check(context):
                return EscalationResult(
                    trigger=trigger_name,
                    context=context.summary,
                    recommended_team=route_to_team(trigger_name)
                )
        return None

Scenario 2: Multi-Agent Research System

This scenario tests your ability to design a hub-and-spoke system where:

A coordinator plans research tasks
Specialist subagents execute in parallel (fan-out)
Results are aggregated with provenance (fan-in)
Error handling prevents one failed search from killing the whole system

Correct Architecture:

    import asyncio

    # Multi-Agent Research System Architecture
    # The agent classes, AgentError, ErrorBoundary, and the plan_research,
    # aggregate_with_provenance, and synthesize methods are assumed to be
    # defined elsewhere.
    class ResearchCoordinator:
        def __init__(self):
            self.subagents = {
                "literature": LiteratureAgent(tools=["search_papers", "read_pdf"]),
                "patents": PatentAgent(tools=["search_patents", "extract_claims"]),
                "web": WebResearchAgent(tools=["web_search", "fetch_page"]),
            }
            self.error_boundary = ErrorBoundary(strategy="graceful_degradation")

        async def research(self, query):
            # 1. PLAN: Coordinator decomposes the query
            plan = await self.plan_research(query)

            # 2. FAN-OUT: Dispatch to subagents in parallel
            tasks = []
            for task in plan.subtasks:
                agent = self.subagents[task.specialist]
                tasks.append(self.run_with_boundary(agent, task))
            # return_exceptions=True keeps one unexpected error from cancelling
            # the rest; such errors come back as exception objects in `results`.
            results = await asyncio.gather(*tasks, return_exceptions=True)

            # 3. FAN-IN: Aggregate results, track provenance
            aggregated = self.aggregate_with_provenance(results)

            # 4. SYNTHESIZE: Coordinator creates final report
            return await self.synthesize(aggregated)

        async def run_with_boundary(self, agent, task):
            try:
                return await agent.execute(task)
            except AgentError as e:
                return self.error_boundary.handle_failure(agent.id, e)
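To see the fan-out and error-boundary behavior without any real agents, the pattern can be reduced to a runnable toy. Everything here is a stand-in: `stub_agent` simulates a specialist subagent and the `RuntimeError` simulates a failed search backend.

```python
import asyncio


async def stub_agent(name: str, fail: bool = False) -> dict:
    """Stand-in for a specialist subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if fail:
        raise RuntimeError(f"{name} search backend unavailable")
    return {"agent": name, "findings": [f"{name} finding"]}


async def run_with_boundary(name: str, fail: bool = False) -> dict:
    """Convert a failure into a flagged partial result (graceful degradation)."""
    try:
        return await stub_agent(name, fail)
    except RuntimeError as exc:
        return {"agent": name, "partial": True, "error": str(exc)}


async def research() -> list:
    # Fan-out: all three subagents run concurrently; gather keeps input order.
    return await asyncio.gather(
        run_with_boundary("literature"),
        run_with_boundary("patents", fail=True),
        run_with_boundary("web"),
    )


results = asyncio.run(research())
succeeded = [r["agent"] for r in results if not r.get("partial")]
failed = [r["agent"] for r in results if r.get("partial")]
print(succeeded, failed)  # ['literature', 'web'] ['patents']
```

The failed patent search is reported as a gap while the other two results survive, which is the behavior the scenario asks for.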

🚨 Anti-Patterns & Exam Traps

The exam will present these as plausible answers. They are ALL WRONG:

❌ Wrong Answer	✅ Correct Approach	Why It's Wrong
Escalate based on customer sentiment	Escalate based on structured criteria (policy gap, complexity threshold)	Sentiment is unreliable; a calm customer with a complex issue needs escalation more than an angry customer with a simple fix
Use self-reported confidence for escalation	Use programmatic checks against defined criteria	A model's self-reported confidence is poorly calibrated, so it is not a reliable decision signal
Share full conversation history with all subagents	Pass only relevant context via structured artifacts	Full history wastes context window space and introduces reasoning bias
Same session for research and verification	Separate sessions for generation and verification	Reasoning context bias — the verifier confirms rather than critically evaluates
Retry failed agent infinitely	Use circuit breaker pattern with max failure count	Infinite retries waste budget and may indicate a systemic issue
Route based on keywords in user message	Route based on task decomposition + tool requirements	Keywords are fragile; the same keyword can mean different things in context
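The unbounded-retry anti-pattern in the table above contrasts with a bounded retry that gives up and surfaces the failure. The sketch below is illustrative only: `RateLimited` and `call_with_retry` are hypothetical names, and the `sleep` parameter is injected so the example runs instantly.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when a tool returns HTTP 429."""


def call_with_retry(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying on RateLimited with exponential backoff.

    Gives up after max_attempts and re-raises, so the caller can count
    the failure toward an escalation or circuit-breaker check.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...


# Usage: a fake tool that is rate limited twice, then succeeds.
calls = {"n": 0}


def flaky_lookup():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited("429 Too Many Requests")
    return {"customer": "CUST-00001"}


delays = []
profile = call_with_retry(flaky_lookup, sleep=delays.append)
print(profile, delays)  # {'customer': 'CUST-00001'} [0.5, 1.0]
```

Re-raising after the final attempt matters: swallowing the error would hide exactly the signal that programmatic escalation criteria depend on.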

💻 Full Code Example: Customer Support Agent

    import anthropic
    import json

    # 1. Tool definitions (4-5 per agent, well-described)
    support_tools = [
        {
            "name": "lookup_customer",
            "description": "Retrieve customer profile by email or ID. Returns: name, plan tier, "
                           "account age, past tickets. Use when identifying the customer.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "identifier": {"type": "string", "description": "Email or ID (format: CUST-XXXXX)"}
                },
                "required": ["identifier"]
            }
        },
        {
            "name": "search_knowledge_base",
            "description": "Search internal KB for policy articles. Returns: matching articles "
                           "with relevance scores. Use for policy/procedure questions.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "category": {"type": "string", "enum": ["billing", "technical", "account", "shipping"]}
                },
                "required": ["query"]
            }
        },
        {
            "name": "check_order_status",
            "description": "Get current order status. Returns: status, items, tracking, delivery ETA.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "Order ID (format: ORD-XXXXX)"}
                },
                "required": ["order_id"]
            }
        },
        {
            "name": "escalate_to_human",
            "description": "Transfer to human agent. Use ONLY when programmatic criteria are met.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "reason": {"type": "string", "enum": ["policy_gap", "multi_tool_failure",
                                                          "authority_required", "complexity_exceeded"]},
                    "context_summary": {"type": "string"},
                    "priority": {"type": "string", "enum": ["normal", "high", "urgent"]}
                },
                "required": ["reason", "context_summary", "priority"]
            }
        }
    ]

    # 2. Hook for compliance (DETERMINISTIC)
    # Note: process_refund is not one of the four tools above; it is assumed
    # to be registered elsewhere.
    async def compliance_hook(event_data):
        tool_name = event_data.get("tool_name")
        tool_input = event_data.get("tool_input", {})

        if tool_name == "process_refund":
            amount = tool_input.get("amount", 0)
            if amount > 500:
                return {
                    "hookSpecificOutput": {
                        "hookEventName": "PreToolUse",
                        "permissionDecision": "deny",
                        "permissionDecisionReason": f"Refund ${amount} exceeds limit. Escalate."
                    }
                }
        return {}

    # 3. Escalation logic (PROGRAMMATIC)
    def check_escalation(agent_context):
        criteria = {
            "policy_gap": agent_context.kb_returned_no_results,
            "multi_tool_failure": agent_context.consecutive_failures >= 2,
            "authority_required": agent_context.action_needs_approval,
            "complexity_exceeded": len(agent_context.tools_used) > 4,
        }
        # Any single triggered criterion is enough to escalate.
        for reason, triggered in criteria.items():
            if triggered:
                return {"escalate": True, "reason": reason}
        return {"escalate": False}

🎬 Video to Watch

Claude Code Advanced Patterns: Subagents, MCP, and Scaling to Real Codebases (Anthropic Webinar, March 2026)

Focus on: subagent spawning patterns, context passing between coordinator and subagents, and error handling across agent boundaries.

Also read: How We Built Our Multi-Agent Research System — Anthropic's own implementation of the exact Multi-Agent Research scenario the exam tests.

📖 Reading

Primary: Building Effective Agents — Focus on orchestration pattern selection criteria
Secondary: Harness Design for Long-Running Apps — Anthropic's initializer agent decomposing specs into task lists
Exam Guide: Scenario descriptions

🛠️ Hands-On Exercise (25 minutes)

Design a Customer Support Escalation System:

Write Python for a support agent with 4 tools
Implement check_escalation() with 3+ programmatic criteria (NO sentiment)
Add a PreToolUse hook blocking one dangerous operation
Handle: what happens when lookup_customer returns a 429 rate limit?

Bonus: Sketch a second agent as a "response quality checker" that reviews replies before sending. Remember: separate session!

📝 Quick Quiz

Q1: A customer support agent needs to decide whether to escalate. Which approach is correct?
A) System prompt: "If the customer seems frustrated, escalate to a human"
B) Ask Claude: "On a scale of 1-10, how confident are you?"
C) Programmatic hook checking structured criteria: KB returned no results OR 2+ tool failures
D) Sentiment analysis on the customer's message to detect anger

Q2: How should a coordinator pass information to subagents?
A) Share the coordinator's full conversation history
B) Pass only the specific subtask description and constraints via structured artifact
C) Let subagents access a shared memory store with coordinator's reasoning
D) Use the same session ID so subagents inherit context automatically

Q3: In a Generator-Verifier pattern, what is the critical requirement?
A) The verifier must use a different model than the generator
B) The verifier must run in a separate session to avoid reasoning context bias
C) The verifier must access the generator's chain-of-thought
D) The verifier must use a higher effort level

Answers: Q1: C | Q2: B | Q3: B

Q1: Programmatic checks with structured criteria. A/D are sentiment-based (anti-pattern). B uses self-reported confidence (uncalibrated).
Q2: Only relevant context via artifacts. A wastes context and introduces bias. C couples subagents to the coordinator's reasoning, which defeats isolation. D is incorrect: subagents get their own sessions.
Q3: Separate sessions prevent reasoning context bias. C is exactly wrong — seeing reasoning causes confirmation bias.

👀 Tomorrow's Preview

Tomorrow starts Domain 2: Tool Design & MCP Integration — the tool use lifecycle, description best practices, and why 4-5 tools per agent is the sweet spot.