CCA-F Study Day 5/20: Task Decomposition & Exam Scenarios
Domain 1: Agentic Architecture & Orchestration (~25-27% of exam)
Today's Focus
Today is the capstone day for Domain 1, the largest-weighted domain on the entire exam. You're taking everything from Days 1-4 (agentic loop, effort levels, orchestration patterns, hooks) and synthesizing it into how to design complete agent systems for real scenarios. The exam tests this through two main scenarios: Customer Support Resolution Agent and Multi-Agent Research System. You need to know the correct architectural decisions for each and, more importantly, you need to recognize the plausible-but-wrong answers the exam presents.
Core Concepts
1. Task Decomposition: Breaking Problems into Agent-Sized Work
Task decomposition is the art of splitting complex objectives into units that a single agent (or agent turn) can handle within its context window, tool set, and authority boundary.
The Three Dimensions of Decomposition:
- Complexity scope: Can a single agent handle this in one session? If not, split into subagents.
- Authority boundary: Does this require different permission levels? Separate by privilege.
- Domain expertise: Does this require different tool sets? Route to specialist agents.
Key principles:
- Each subtask should be completable within a single agent's context window
- Subtasks should have clear success/failure criteria (not vague "do your best")
- Interfaces between subtasks pass artifacts, not raw conversation history
- Error boundaries prevent one subtask's failure from cascading to others
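The principles above can be made concrete with a small sketch. It is illustrative only: `Artifact`, `Subtask`, and `run_subtask` are hypothetical names rather than part of any SDK, and `worker` stands in for whatever actually invokes the agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Artifact:
    """Structured hand-off between subtasks (hypothetical shape)."""
    producer: str
    payload: dict


@dataclass
class Subtask:
    """One agent-sized unit of work with an explicit pass/fail check."""
    name: str
    instructions: str
    success_check: Callable[[dict], bool]
    inputs: List[Artifact] = field(default_factory=list)


def run_subtask(subtask: Subtask, worker: Callable) -> Optional[Artifact]:
    """Run one subtask and return an Artifact on success, None on failure.

    Exceptions are caught here, so this function acts as the error
    boundary: a failing subtask cannot raise into its siblings.
    """
    try:
        # The worker receives artifact payloads, not conversation history.
        output = worker(subtask.instructions, [a.payload for a in subtask.inputs])
    except Exception:
        return None
    if not subtask.success_check(output):
        return None
    return Artifact(producer=subtask.name, payload=output)


# Stand-in for a real agent call (assumption for the example).
def fake_worker(instructions, inputs):
    return {"papers": ["paper-1", "paper-2"]}


review = Subtask(
    name="literature_review",
    instructions="Find peer-reviewed papers on quantum computing in molecular simulation",
    success_check=lambda out: len(out.get("papers", [])) >= 1,
)
artifact = run_subtask(review, fake_worker)
print(artifact.producer)  # literature_review
```

The success check is a plain function, so "done" is decided by code rather than by the agent's own judgment.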
2. Context Isolation: What Each Agent Sees vs. What It Doesn't
This is a critical exam concept. Each agent operates within its own context window. When you spawn a subagent, it gets:
- ✅ Its own system prompt and tools
- ✅ Specific context you pass to it (task description, relevant data)
- ❌ NOT the full conversation history of the parent
- ❌ NOT other subagents' outputs (unless explicitly passed)
Why this matters: Context isolation prevents reasoning contamination. A verifier agent that can see the generator's reasoning will be biased toward confirming it. Separate sessions = independent evaluation.
```python
# Context isolation in a hub-and-spoke pattern
coordinator_context = {
    "task": "Research quantum computing applications in drug discovery",
    "subtasks": ["literature_review", "patent_analysis", "expert_interviews"],
}

# Each subagent gets ONLY its specific task + minimal context
literature_agent_context = {
    "task": "Find peer-reviewed papers on quantum computing in molecular simulation",
    "constraints": "Published 2022-2026, focus on protein folding applications",
    "output_format": "structured_summary",
}
# The literature agent does NOT see patent_analysis or expert_interviews tasks
```
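One way to enforce this isolation in code is an allowlist: the coordinator copies only named fields into the subagent's context, so everything else (including the generator's reasoning) is dropped by default. This is a minimal sketch; `build_subagent_context` and the field names are assumptions made for illustration.

```python
def build_subagent_context(parent_state: dict, allowed_keys: set) -> dict:
    """Copy only allowlisted fields from the parent state into a new dict."""
    return {key: parent_state[key] for key in allowed_keys if key in parent_state}


# Hypothetical state held by a generator agent after it finishes.
generator_state = {
    "draft_answer": "Quantum annealing speeds up ligand docking.",
    "sources": ["paper-1"],
    "reasoning": "I assumed paper-1 generalizes to all ligands...",
    "conversation_history": ["turn 1", "turn 2"],
}

# The verifier sees the claim and its sources, never the reasoning behind it.
verifier_context = build_subagent_context(generator_state, {"draft_answer", "sources"})
print(sorted(verifier_context))  # ['draft_answer', 'sources']
```

An allowlist fails safe: a new field added to the parent state stays hidden from subagents until someone deliberately exposes it.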
3. Error Boundaries: How Failures Propagate (or Don't)
In multi-agent systems, you need to decide: when one agent fails, what happens to the others?
Three error propagation strategies:
| Strategy | Behavior | Use When |
|---|---|---|
| Isolated | Failure stays within the failing agent | Independent tasks (fan-out research) |
| Circuit Breaker | After N failures, stop the entire pipeline | Sequential pipelines where downstream depends on upstream |
| Graceful Degradation | Continue with partial results, flag gaps | Research/analysis where some data is better than none |
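```python
import logging

logger = logging.getLogger(__name__)


class SystemHaltError(Exception):
    """Raised when the circuit breaker stops the whole pipeline."""


def log_failure(agent_id, error):
    """Minimal stand-in logger: record the failure and move on."""
    logger.warning("Agent %s failed: %s", agent_id, error)


class ErrorBoundary:
    def __init__(self, strategy="isolated", max_failures=3):
        self.strategy = strategy
        self.failure_count = 0
        self.max_failures = max_failures

    def handle_failure(self, agent_id, error):
        self.failure_count += 1

        if self.strategy == "circuit_breaker" and self.failure_count >= self.max_failures:
            raise SystemHaltError(
                f"Circuit breaker tripped after {self.max_failures} failures"
            )

        if self.strategy == "graceful_degradation":
            # Assumes the error object exposes a `category` attribute
            return {"partial": True, "agent_id": agent_id, "error": error.category}

        # "isolated": log and continue, other agents unaffected
        log_failure(agent_id, error)
        return None
```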
4. Information Provenance Tracking
When multiple agents contribute to a final output, you need to track where each piece of information came from. This is tested in the Multi-Agent Research scenario.
```python
# Each subagent result includes provenance metadata
research_result = {
    "findings": [...],
    "sources": [
        {"agent": "literature_agent", "source_type": "peer_reviewed", "confidence": "high"},
        {"agent": "web_agent", "source_type": "news_article", "confidence": "medium"},
    ],
    "timestamp": "2026-05-20T10:30:00Z",
    "session_id": "sess_abc123",
}
```
Exam Scenarios: How to Approach Them
Scenario 1: Customer Support Resolution Agent
This scenario tests your ability to design an agent that handles customer tickets with:
- Tool use (CRM lookup, order history, knowledge base search)
- Escalation logic (WHEN to hand off to a human)
- Hooks (compliance enforcement, PII filtering)
- MCP integration (connecting to external systems)
Correct Escalation Design:
```python
# Escalation is based on STRUCTURED CRITERIA, never sentiment
escalation_triggers = {
    "policy_gap": lambda ctx: ctx.knowledge_base_result.get("coverage") == "none",
    "multi_system_failure": lambda ctx: ctx.failed_tool_count >= 2,
    "complexity_threshold": lambda ctx: ctx.required_tools > 5,
    "authority_boundary": lambda ctx: ctx.requested_action in ["refund_over_500", "account_deletion"],
}


def should_escalate(context):
    """Programmatic escalation check, called via hook, not prompt instruction."""
    for trigger_name, check in escalation_triggers.items():
        if check(context):
            return EscalationResult(
                trigger=trigger_name,
                context=context.summary,
                recommended_team=route_to_team(trigger_name),
            )
    return None
```
Scenario 2: Multi-Agent Research System
This scenario tests your ability to design a hub-and-spoke system where:
- A coordinator plans research tasks
- Specialist subagents execute in parallel (fan-out)
- Results are aggregated with provenance (fan-in)
- Error handling prevents one failed search from killing the whole system
Correct Architecture:
```python
import asyncio


# Multi-Agent Research System Architecture
class ResearchCoordinator:
    def __init__(self):
        self.subagents = {
            "literature": LiteratureAgent(tools=["search_papers", "read_pdf"]),
            "patents": PatentAgent(tools=["search_patents", "extract_claims"]),
            "web": WebResearchAgent(tools=["web_search", "fetch_page"]),
        }
        self.error_boundary = ErrorBoundary(strategy="graceful_degradation")

    async def research(self, query):
        # 1. PLAN: Coordinator decomposes the query
        plan = await self.plan_research(query)

        # 2. FAN-OUT: Dispatch to subagents in parallel
        tasks = []
        for task in plan.subtasks:
            agent = self.subagents[task.specialist]
            tasks.append(self.run_with_boundary(agent, task))
        results = await asyncio.gather(*tasks, return_exceptions=True)

        # 3. FAN-IN: Aggregate results, track provenance
        aggregated = self.aggregate_with_provenance(results)

        # 4. SYNTHESIZE: Coordinator creates final report
        return await self.synthesize(aggregated)

    async def run_with_boundary(self, agent, task):
        try:
            return await agent.execute(task)
        except AgentError as e:
            return self.error_boundary.handle_failure(agent.id, e)
```
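The fan-in step is where provenance and graceful degradation meet. The sketch below shows one possible shape for an `aggregate_with_provenance` helper, written as a standalone function. The input shapes are assumptions: successful results are dicts with `agent`, `source_type`, and `findings` keys, and failures arrive as `None`, as a raised exception captured by `asyncio.gather`, or as the `{"partial": True, ...}` marker returned by the `ErrorBoundary` shown earlier.

```python
def aggregate_with_provenance(results):
    """Merge subagent results, tagging each finding with its origin.

    Failed subagents are recorded as gaps rather than silently dropped,
    so the final report can flag what is missing.
    """
    findings = []
    gaps = []
    for result in results:
        if result is None:
            # Isolated failure: already logged by the error boundary.
            continue
        if isinstance(result, Exception):
            # Unhandled exception captured by gather(return_exceptions=True).
            gaps.append({"agent": "unknown", "error": type(result).__name__})
            continue
        if result.get("partial"):
            # Graceful-degradation marker from the error boundary.
            gaps.append({"agent": result["agent_id"], "error": result["error"]})
            continue
        for text in result["findings"]:
            findings.append({
                "text": text,
                "agent": result["agent"],
                "source_type": result["source_type"],
            })
    return {"findings": findings, "gaps": gaps, "complete": not gaps}


# Hypothetical results from three subagents: one success, two failures.
results = [
    {"agent": "literature_agent", "source_type": "peer_reviewed",
     "findings": ["Finding A", "Finding B"]},
    {"partial": True, "agent_id": "patent_agent", "error": "timeout"},
    None,
]
report = aggregate_with_provenance(results)
print(len(report["findings"]), report["complete"])  # 2 False
```

Every finding keeps its own `agent` and `source_type`, so the synthesis step can weigh a peer-reviewed claim differently from a news article.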
Anti-Patterns & Exam Traps
The exam will present these as plausible answers. They are ALL WRONG:
| ❌ Wrong Answer | ✅ Correct Approach | Why It's Wrong |
|---|---|---|
| Escalate based on customer sentiment | Escalate based on structured criteria (policy gap, complexity threshold) | Sentiment is unreliable; a calm customer with a complex issue needs escalation more than an angry customer with a simple fix |
| Use self-reported confidence for escalation | Use programmatic checks against defined criteria | Model confidence scores are uncalibrated and not reliable decision signals |
| Share full conversation history with all subagents | Pass only relevant context via structured artifacts | Full history wastes context window space and introduces reasoning bias |
| Same session for research and verification | Separate sessions for generation and verification | Reasoning context bias: the verifier confirms rather than critically evaluates |
| Retry failed agent infinitely | Use circuit breaker pattern with max failure count | Infinite retries waste budget and may indicate a systemic issue |
| Route based on keywords in user message | Route based on task decomposition + tool requirements | Keywords are fragile; the same keyword can mean different things in context |
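The bounded-retry row can be sketched as follows. This is an illustrative helper, not a library API: `call_with_retry` and `RetryBudgetExceeded` are hypothetical names, and the exponential backoff schedule is one reasonable choice among several. The `sleep` parameter is injectable so the example runs instantly.

```python
import time


class RetryBudgetExceeded(Exception):
    """Raised when every allowed attempt has failed."""


def call_with_retry(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn() up to max_attempts times with exponential backoff.

    After the final failure the caller gets an exception instead of
    another retry, so it can escalate or degrade gracefully.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                # Wait 0.5s, 1s, 2s, ... between attempts.
                sleep(base_delay * 2 ** (attempt - 1))
    raise RetryBudgetExceeded(f"gave up after {max_attempts} attempts") from last_error


# Hypothetical flaky tool: fails twice, then succeeds.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("upstream timeout")
    return {"customer": "CUST-00001"}


result = call_with_retry(flaky_lookup, sleep=lambda seconds: None)
print(result, calls["count"])  # {'customer': 'CUST-00001'} 3
```

The point for the exam is the cap: once the budget is spent, the failure counts toward a structured escalation criterion instead of burning more calls.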
Full Code Example: Customer Support Agent
```python
import anthropic
import json

# 1. Tool definitions (4-5 per agent, well-described)
support_tools = [
    {
        "name": "lookup_customer",
        "description": "Retrieve customer profile by email or ID. Returns: name, plan tier, "
                       "account age, past tickets. Use when identifying the customer.",
        "input_schema": {
            "type": "object",
            "properties": {
                "identifier": {"type": "string", "description": "Email or ID (format: CUST-XXXXX)"}
            },
            "required": ["identifier"],
        },
    },
    {
        "name": "search_knowledge_base",
        "description": "Search internal KB for policy articles. Returns: matching articles "
                       "with relevance scores. Use for policy/procedure questions.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "category": {"type": "string", "enum": ["billing", "technical", "account", "shipping"]},
            },
            "required": ["query"],
        },
    },
    {
        "name": "check_order_status",
        "description": "Get current order status. Returns: status, items, tracking, delivery ETA.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order ID (format: ORD-XXXXX)"}
            },
            "required": ["order_id"],
        },
    },
    {
        "name": "escalate_to_human",
        "description": "Transfer to human agent. Use ONLY when programmatic criteria are met.",
        "input_schema": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "enum": ["policy_gap", "multi_tool_failure", "authority_required", "complexity_exceeded"],
                },
                "context_summary": {"type": "string"},
                "priority": {"type": "string", "enum": ["normal", "high", "urgent"]},
            },
            "required": ["reason", "context_summary", "priority"],
        },
    },
]


# 2. Hook for compliance (DETERMINISTIC)
async def compliance_hook(event_data):
    tool_name = event_data.get("tool_name")
    tool_input = event_data.get("tool_input", {})
    if tool_name == "process_refund":
        amount = tool_input.get("amount", 0)
        if amount > 500:
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": f"Refund ${amount} exceeds limit. Escalate.",
                }
            }
    return {}


# 3. Escalation logic (PROGRAMMATIC)
def check_escalation(agent_context):
    criteria = {
        "policy_gap": agent_context.kb_returned_no_results,
        "multi_tool_failure": agent_context.consecutive_failures >= 2,
        "authority_required": agent_context.action_needs_approval,
        "complexity_exceeded": len(agent_context.tools_used) > 4,
    }
    for reason, triggered in criteria.items():
        if triggered:
            return {"escalate": True, "reason": reason}
    return {"escalate": False}
```
Video to Watch
Claude Code Advanced Patterns: Subagents, MCP, and Scaling to Real Codebases (Anthropic Webinar, March 2026)
Focus on: subagent spawning patterns, context passing between coordinator and subagents, and error handling across agent boundaries.
Also read: How We Built Our Multi-Agent Research System, Anthropic's own implementation of the exact Multi-Agent Research scenario the exam tests.
Reading
- Primary: Building Effective Agents. Focus on orchestration pattern selection criteria
- Secondary: Harness Design for Long-Running Apps. Anthropic's initializer agent decomposing specs into task lists
- Exam Guide: Scenario descriptions
Hands-On Exercise (25 minutes)
Design a Customer Support Escalation System:
- Write Python for a support agent with 4 tools
- Implement `check_escalation()` with 3+ programmatic criteria (NO sentiment)
- Add a PreToolUse hook blocking one dangerous operation
- Handle: what happens when `lookup_customer` returns a 429 rate limit?
Bonus: Sketch a second agent as a "response quality checker" that reviews replies before sending. Remember: separate session!
Quick Quiz
Q1: A customer support agent needs to decide whether to escalate. Which approach is correct?
A) System prompt: "If the customer seems frustrated, escalate to a human"
B) Ask Claude: "On a scale of 1-10, how confident are you?"
C) Programmatic hook checking: KB returned no results AND 2+ tool failures
D) Sentiment analysis on the customer's message to detect anger
Q2: How should a coordinator pass information to subagents?
A) Share the coordinator's full conversation history
B) Pass only the specific subtask description and constraints via structured artifact
C) Let subagents access a shared memory store with coordinator's reasoning
D) Use the same session ID so subagents inherit context automatically
Q3: In a Generator-Verifier pattern, what is the critical requirement?
A) The verifier must use a different model than the generator
B) The verifier must run in a separate session to avoid reasoning context bias
C) The verifier must access the generator's chain-of-thought
D) The verifier must use a higher effort level
Answers: Q1: C | Q2: B | Q3: B
Q1: Programmatic checks with structured criteria. A/D are sentiment-based (anti-pattern). B uses self-reported confidence (uncalibrated).
Q2: Only relevant context via artifacts. A wastes context + introduces bias. C creates coupling. D is incorrect: subagents get their own sessions.
Q3: Separate sessions prevent reasoning context bias. C is exactly wrong: seeing reasoning causes confirmation bias.
Tomorrow's Preview
Tomorrow starts Domain 2: Tool Design & MCP Integration, covering the tool use lifecycle, description best practices, and why 4-5 tools per agent is the sweet spot.