CCA-F Study Day 6/20: Tool Use Lifecycle & Description Best Practices

Domain 2: Tool Design & MCP Integration (~18-20% of exam)

📌 Today's Focus

Welcome to Domain 2! You've finished the hardest domain (Domain 1, 25% of the exam) — congratulations. Domain 2 covers ~18-20% of the exam and tests how well you understand the mechanics of tool use: how Claude decides which tool to call, what a good tool definition looks like, and how to handle errors when tools fail. Today's material is the bedrock for everything else in this domain (MCP, built-in tools, tool distribution) — nail this and the rest clicks into place.

📚 Core Concepts

1. The Tool Use Lifecycle (6 Steps)

This is the fundamental flow the exam tests. Memorize it:

1. **Define tools**: Pass a `tools` array in your API call; each tool has a `name`, a `description`, and an `input_schema`.
2. **Claude decides**: Based on the user's request and your tool descriptions, Claude decides whether to invoke a tool (or not).
3. **Response with `stop_reason: "tool_use"`**: The response contains a `tool_use` content block with `id`, `name`, and `input`.
4. **Your code executes the tool**: You run the actual operation (API call, DB query, etc.).
5. **Send the result back**: Append a user message containing a `tool_result` content block whose `tool_use_id` matches the `id` from step 3.
6. **Claude processes the result**: It either calls another tool (`stop_reason: "tool_use"`) or produces a final answer (`stop_reason: "end_turn"`).

Critical exam insight: Claude does NOT execute your tools. Your code executes tools. Claude only requests tool calls. This is a common trick answer — the exam will present options that imply Claude "runs" the function directly.
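The division of labor in steps 3 to 5 can be sketched offline, with plain dictionaries standing in for API objects. This is a minimal illustration, not SDK code: `lookup_order`, `TOOL_REGISTRY`, and the example `id` are hypothetical, while the `tool_use` and `tool_result` field names follow the lifecycle above.

```python
import json

def lookup_order(order_id: str) -> dict:
    """Hypothetical tool implementation; a real one would query a database."""
    return {"order_id": order_id, "status": "shipped"}

# Maps tool names (as declared to Claude) to the local functions that run them.
TOOL_REGISTRY = {"get_customer_order": lookup_order}

def handle_tool_use(block: dict) -> dict:
    """Execute one tool_use block locally and build the matching tool_result."""
    func = TOOL_REGISTRY[block["name"]]
    output = func(**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],  # must echo the id from the tool_use block
        "content": json.dumps(output),
    }

# Shape of a tool_use content block as it appears in Claude's response (step 3).
example_block = {
    "type": "tool_use",
    "id": "toolu_example_123",  # illustrative id, not a real one
    "name": "get_customer_order",
    "input": {"order_id": "ORD-12345"},
}

result = handle_tool_use(example_block)
print(result)
```

Note that nothing in `handle_tool_use` involves Claude: the model only produced the request dictionary, and the application ran the function.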

2. Tool Description Best Practices

Tool descriptions are the single most important factor in whether Claude picks the right tool. The exam tests whether you know what makes a description effective vs. ineffective.

The 5 Rules of Good Tool Descriptions:

| # | Rule | Example |
|---|------|---------|
| 1 | Use `verb_noun` naming | `search_database`, not `db` or `database_stuff` |
| 2 | Say when to use it | "Use when the customer asks about an existing order" |
| 3 | Document what it returns | "Returns order status, items array, and shipping info" |
| 4 | Constrain every parameter | `"format": "date-time"`, `"enum": ["asc","desc"]`, `"maxLength": 100` |
| 5 | Mark required fields explicitly | `"required": ["order_id", "customer_id"]` |
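The five rules can be approximated as a mechanical check. The sketch below is a heuristic linter, not an official validator: `lint_tool` and both sample tools are hypothetical, the name check only tests the `word_word` shape (it cannot tell whether the first word is a verb), and the "use when" / "returns" checks are simple substring matches.

```python
import re

def lint_tool(tool: dict) -> list[str]:
    """Return a list of human-readable problems found in one tool definition."""
    problems = []
    name = tool.get("name", "")
    description = tool.get("description", "").lower()
    schema = tool.get("input_schema", {})
    properties = schema.get("properties", {})

    # Rule 1: at least two lowercase words joined by underscores.
    if not re.fullmatch(r"[a-z]+(_[a-z0-9]+)+", name):
        problems.append(f"name {name!r} is not in verb_noun form")
    # Rule 2: the description should say when to use the tool.
    if "use when" not in description:
        problems.append("description does not say when to use the tool")
    # Rule 3: the description should document the return value.
    if "returns" not in description:
        problems.append("description does not say what the tool returns")
    # Rule 4: a parameter with only a type has no description or constraints.
    for param, spec in properties.items():
        if set(spec) <= {"type"}:
            problems.append(f"parameter {param!r} has no description or constraints")
    # Rule 5: required fields must be listed explicitly.
    if properties and not schema.get("required"):
        problems.append("schema has no 'required' list")
    return problems

good_tool = {
    "name": "search_database",
    "description": "Search products by keyword. Returns matching rows. "
                   "Use when the user asks about products.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Keyword", "maxLength": 100}
        },
        "required": ["query"],
    },
}
bad_tool = {
    "name": "db",
    "description": "Does database operations",
    "input_schema": {"type": "object", "properties": {"q": {"type": "string"}}},
}

print(lint_tool(good_tool))
print(lint_tool(bad_tool))
```

A check like this catches omissions but not quality: a description can contain "use when" and still be too vague to distinguish two similar tools.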

3. The 4-5 Tool Rule

Keep to 4-5 tools per agent for optimal selection accuracy. This is a hard exam fact. When you give an agent 18+ tools, Claude's ability to pick the right one degrades. The solution architecture for large tool sets is:

- Split tools across specialist subagents (each with 4-5 tools)
- Use a coordinator agent that routes to the right specialist
- Or use ToolSearch for dynamic, on-demand tool discovery
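The subagent split can be sketched as a grouping step over a tool catalog. Everything here is an assumption for illustration: the `domain` tag is bookkeeping in your own catalog (not an API field), the tool names are made up, and in a real system the coordinator would be an agent choosing the domain rather than a dictionary lookup.

```python
from collections import defaultdict

MAX_TOOLS_PER_AGENT = 5  # upper end of the 4-5 tool guideline

# Hypothetical catalog: "domain" is our own routing tag, not an API field.
CATALOG = [
    {"name": "get_customer_order", "domain": "orders"},
    {"name": "cancel_order", "domain": "orders"},
    {"name": "issue_refund", "domain": "billing"},
    {"name": "get_invoice", "domain": "billing"},
    {"name": "track_shipment", "domain": "shipping"},
    {"name": "update_address", "domain": "shipping"},
]

def group_by_domain(catalog: list[dict]) -> dict[str, list[str]]:
    """Group tool names by domain; each group becomes one specialist's tool set."""
    groups = defaultdict(list)
    for tool in catalog:
        groups[tool["domain"]].append(tool["name"])
    return dict(groups)

def oversized(groups: dict[str, list[str]]) -> list[str]:
    """Return the domains whose specialist would exceed the tool limit."""
    return [d for d, names in groups.items() if len(names) > MAX_TOOLS_PER_AGENT]

def tools_for(domain: str, groups: dict[str, list[str]]) -> list[str]:
    """Return the tool names the specialist for `domain` should receive."""
    if domain not in groups:
        raise KeyError(f"no specialist registered for domain {domain!r}")
    return groups[domain]

groups = group_by_domain(CATALOG)
print(groups)
print(oversized(groups))
```

Running `oversized` in a test suite is one way to notice when a specialist has quietly grown past the limit and needs splitting again.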

4. The `tool_choice` Parameter

Controls whether and how Claude uses tools:

| Value | Behavior | Use When |
|-------|----------|----------|
| `{"type": "auto"}` | Claude decides whether to call a tool (default when tools are provided) | Normal operation |
| `{"type": "any"}` | Claude MUST call one of the provided tools | You know a tool call is needed |
| `{"type": "tool", "name": "X"}` | Claude MUST call tool X specifically | Forcing structured output (Day 15 topic) |
| `{"type": "none"}` | Claude cannot call any tools | Generating a text-only response despite tools being defined |
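A small helper makes the four shapes harder to get wrong. This is a sketch: `build_tool_choice` is a hypothetical function name, and the dictionaries it returns are the values from the table above, intended to be passed as the `tool_choice` argument of a request.

```python
from typing import Optional

def build_tool_choice(mode: str, tool_name: Optional[str] = None) -> dict:
    """Build a tool_choice value, rejecting combinations that make no sense."""
    if mode in ("auto", "any", "none"):
        if tool_name is not None:
            raise ValueError(f"tool_name is only valid with mode 'tool', not {mode!r}")
        return {"type": mode}
    if mode == "tool":
        if not tool_name:
            raise ValueError("mode 'tool' requires tool_name")
        return {"type": "tool", "name": tool_name}
    raise ValueError(f"unknown tool_choice mode: {mode!r}")

print(build_tool_choice("auto"))
print(build_tool_choice("tool", "issue_refund"))
```

The returned value would be supplied alongside `tools`, for example `client.messages.create(..., tools=tools, tool_choice=build_tool_choice("any"))`.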

5. Parallel Tool Execution Rules

Claude can request multiple tool calls in a single response. Which of those calls run concurrently and which run sequentially depends on the tool type:

- Read-only tools (Read, Glob, Grep, MCP tools marked read-only): Run concurrently
- State-modifying tools (Edit, Write, Bash): Run sequentially to avoid conflicts
- Custom tools: Default to sequential. Set `readOnlyHint` in annotations to enable parallel execution
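One way to apply these rules in your own executor is to batch consecutive read-only calls and run everything else one at a time. The sketch below is an assumption about how such a scheduler could look, not how any Anthropic product implements it: `run_tool_calls`, `fake_execute`, and the tool names `read_file` and `write_file` are hypothetical, and the caller supplies the set of read-only tool names.

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool_calls(calls: list[dict], execute, read_only_tools: set[str]) -> list:
    """Run consecutive read-only calls concurrently; run other calls sequentially.

    Results are returned in the same order as `calls`.
    """
    results = [None] * len(calls)
    batch = []  # indices of consecutive read-only calls not yet executed

    def flush(pool):
        # Submit the whole batch, then wait for every call in it to finish.
        futures = [(i, pool.submit(execute, calls[i]["name"], calls[i]["input"]))
                   for i in batch]
        for i, future in futures:
            results[i] = future.result()
        batch.clear()

    with ThreadPoolExecutor(max_workers=4) as pool:
        for i, call in enumerate(calls):
            if call["name"] in read_only_tools:
                batch.append(i)
            else:
                flush(pool)  # finish pending reads before mutating state
                results[i] = execute(call["name"], call["input"])
        flush(pool)
    return results

log = []

def fake_execute(name: str, tool_input: dict) -> str:
    """Stand-in executor that records the order in which calls start."""
    log.append(name)
    return f"{name}:{tool_input['path']}"

calls = [
    {"name": "read_file", "input": {"path": "a"}},
    {"name": "read_file", "input": {"path": "b"}},
    {"name": "write_file", "input": {"path": "c"}},
    {"name": "read_file", "input": {"path": "d"}},
]
results = run_tool_calls(calls, fake_execute, read_only_tools={"read_file"})
print(results)
```

Flushing before each state-modifying call preserves the relative order of reads and writes, so a read requested after a write never sees stale state.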

⚠️ Anti-Patterns & Exam Traps

| ❌ Wrong Answer (Exam Trap) | ✅ Correct Approach | Why It's Wrong |
|----------------------------|--------------------|----------------|
| Generic error message: `"Operation failed"` | Include `isError`, `errorCategory`, `isRetryable`, context | Claude can't reason about recovery without knowing what failed and whether retry is viable |
| Return empty results as success when a tool fails | Explicitly distinguish "no data found" from "access failed" | Claude will tell the user "no results" when actually the system was down |
| 18+ tools on a single agent | 4-5 tools per agent, distributed via subagents | Tool selection accuracy degrades sharply with overload |
| Vague descriptions: `"Does database operations"` | Specific: when to use, what it returns, parameter constraints | Claude can't differentiate tools with ambiguous descriptions |
| Missing parameter descriptions in schemas | Document every param with type, format, constraints | Claude guesses formats without guidance, causing validation failures |
| Claude executes the tool directly | Claude requests a tool call; your code executes it | Fundamental misunderstanding of the lifecycle; this IS a trick answer |

💻 Code Examples

Complete Tool Definition (Production Quality)

import anthropic
import json

client = anthropic.Anthropic()

# Production-quality tool definitions
tools = [
    {
        "name": "get_customer_order",
        "description": "Retrieve order details by order ID. Returns order status, "
                       "items list, shipping address, and tracking info. "
                       "Use when a customer asks about an existing order's status or contents.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "Order ID in format ORD-XXXXX (5 digits)",
                    "pattern": "^ORD-\\d{5}$"
                }
            },
            "required": ["order_id"]
        }
    },
    {
        "name": "issue_refund",
        "description": "Process a refund for an order. Returns refund_id and "
                       "estimated processing time. Use when a customer requests "
                       "a refund AND the order is confirmed eligible (delivered or "
                       "within 30-day window). Requires order_id and reason.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "Order ID in format ORD-XXXXX"
                },
                "reason": {
                    "type": "string",
                    "enum": ["defective", "wrong_item", "not_received", "changed_mind"],
                    "description": "Reason category for the refund"
                },
                "amount_cents": {
                    "type": "integer",
                    "description": "Refund amount in cents (partial refunds). Omit for full refund.",
                    "minimum": 1
                }
            },
            "required": ["order_id", "reason"]
        }
    },
    {
        "name": "escalate_to_human",
        "description": "Transfer the conversation to a human agent. Use when: "
                       "(1) policy gap detected, (2) customer requests human, or "
                       "(3) task exceeds complexity threshold. "
                       "Returns queue position.",
        "input_schema": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "description": "Brief explanation of why escalation is needed"
                },
                "priority": {
                    "type": "string",
                    "enum": ["low", "medium", "high", "critical"],
                    "description": "Escalation priority based on issue severity"
                },
                "context_summary": {
                    "type": "string",
                    "description": "Summary of conversation for the human agent (max 500 chars)",
                    "maxLength": 500
                }
            },
            "required": ["reason", "priority"]
        }
    },
    {
        "name": "check_inventory",
        "description": "Check real-time stock level for a product SKU. Returns "
                       "quantity_available, warehouse_location, and restock_date "
                       "if currently out of stock. Use before promising availability.",
        "input_schema": {
            "type": "object",
            "properties": {
                "sku": {
                    "type": "string",
                    "description": "Product SKU (e.g., PROD-AB-12345)"
                },
                "warehouse_region": {
                    "type": "string",
                    "enum": ["us-east", "us-west", "eu", "apac"],
                    "description": "Which warehouse region to check (defaults to nearest)"
                }
            },
            "required": ["sku"]
        }
    }
]

Structured Error Response Pattern

# Assumes `db`, AuthenticationError, RateLimitError, and ValidationError are
# defined by your application; TimeoutError is the Python built-in.
def execute_tool(tool_name: str, tool_input: dict) -> dict:
    """Execute tool and return structured result."""
    try:
        if tool_name == "get_customer_order":
            order = db.get_order(tool_input["order_id"])
            if order is None:
                # ✅ "Not found" is NOT an error; it's a valid empty result
                return {
                    "found": False,
                    "message": f"No order found with ID {tool_input['order_id']}",
                    "suggestion": "Verify the order ID format (ORD-XXXXX)"
                }
            return {"found": True, "order": order.to_dict()}

        # Unknown tool name: report it instead of silently returning None
        return {
            "is_error": True,
            "errorCategory": "unknown_tool",
            "isRetryable": False,
            "context": f"No handler registered for tool {tool_name}"
        }

    except AuthenticationError:
        # ✅ Auth failure IS an error, structured with a category
        return {
            "is_error": True,
            "errorCategory": "authentication",
            "isRetryable": False,
            "context": "API credentials expired",
            "suggestion": "Ask user to re-authenticate"
        }
    except RateLimitError as e:
        # ✅ Rate limit: retryable, with timing info
        return {
            "is_error": True,
            "errorCategory": "rate_limit",
            "isRetryable": True,
            "retryAfterSeconds": e.retry_after,
            "context": "Order service rate limit reached"
        }
    except TimeoutError:
        # ✅ Timeout: retryable
        return {
            "is_error": True,
            "errorCategory": "timeout",
            "isRetryable": True,
            "context": "Order service did not respond within 5s"
        }
    except ValidationError as e:
        # ✅ Validation: retryable, with fix instructions
        return {
            "is_error": True,
            "errorCategory": "validation",
            "isRetryable": True,
            "context": str(e),
            "suggestion": f"Fix input: {e.field} must be {e.constraint}"
        }

Full Agentic Loop with Tool Execution

def run_agent(user_message: str) -> str:
    """Complete agentic loop with proper tool handling."""
    messages = [{"role": "user", "content": user_message}]

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        tools=tools,
        messages=messages
    )

    # Loop while Claude wants to use tools
    while response.stop_reason == "tool_use":
        # Collect ALL tool calls from this response
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                    "is_error": result.get("is_error", False)
                })

        # Append assistant response + all tool results
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

        # Continue the loop
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )

    # Extract final text response
    return "".join(block.text for block in response.content if block.type == "text")

📖 Reading

Primary: Writing Effective Tools for AI Agents — Using AI Agents (Anthropic Engineering Blog, Sep 2025) — This is THE definitive guide on tool design from Anthropic's own engineering team. Covers how tool quality directly impacts agent performance and how to evaluate/improve tools.
Reference: How to Implement Tool Use (Official Docs) — Step-by-step lifecycle implementation guide.
Advanced: Introducing Advanced Tool Use on the Claude Developer Platform (Nov 2025) — Covers scaling to hundreds/thousands of tools, ToolSearch, and programmatic tool calling.

🎬 Video Course

Building with the Claude API (Anthropic Skilljar) — This official Anthropic video course has a dedicated module on tool integration and architectural patterns. Watch the tool use module — it walks through the full lifecycle with code demos. Free to access with a Skilljar account.

🛠️ Hands-On Exercise (20–30 min)

Build a 4-tool customer support agent from scratch:

1. Define 4 tools: `get_customer_order`, `issue_refund`, `escalate_to_human`, `check_inventory`
2. Implement the `execute_tool` function with mock data (return fake but realistic responses)
3. Implement structured responses for at least 3 non-happy paths: auth failure and rate limit (both errors), and not found (a valid empty result, not an error)
4. Write the agentic loop that handles multiple sequential tool calls
5. Test with these prompts:
   - "What's the status of order ORD-12345?"
   - "I want a refund for ORD-99999" (non-existent order; tests your not-found handling)
   - "I need to talk to a human" (tests escalation)

Success criteria: Claude picks the right tool every time, errors are structured (never generic), and the loop terminates cleanly on end_turn.
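The "structured, never generic" criterion can be checked offline before wiring anything to the API. The following is a hypothetical helper, assuming error payloads use the same keys as the Structured Error Response Pattern above (`is_error`, `errorCategory`, `isRetryable`, `context`); payloads without `is_error` set, such as a not-found result, are treated as non-errors and pass.

```python
REQUIRED_ERROR_KEYS = {"is_error", "errorCategory", "isRetryable", "context"}

def error_problems(payload: dict) -> list[str]:
    """Return what is missing from an error payload; empty list means it passes."""
    if not payload.get("is_error"):
        return []  # not an error payload, nothing to check
    problems = [f"missing key: {key}"
                for key in sorted(REQUIRED_ERROR_KEYS - payload.keys())]
    if "isRetryable" in payload and not isinstance(payload["isRetryable"], bool):
        problems.append("isRetryable must be a boolean")
    return problems

generic = {"is_error": True, "message": "Operation failed"}
structured = {
    "is_error": True,
    "errorCategory": "rate_limit",
    "isRetryable": True,
    "retryAfterSeconds": 30,
    "context": "Order service rate limit reached",
}
not_found = {"found": False, "message": "No order found with ID ORD-99999"}

print(error_problems(generic))
print(error_problems(structured))
print(error_problems(not_found))
```

Calling `error_problems` on every value your mock `execute_tool` can return is a quick way to confirm none of the failure paths fall back to a generic message.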

📝 Quick Quiz

Q1: A developer is building a customer support agent and defines 22 tools covering order management, account settings, shipping, billing, promotions, and FAQ lookup. Users report that Claude frequently calls the wrong tool. What is the BEST architectural fix?

A) Add more detailed descriptions to each tool
B) Use tool_choice: {"type": "any"} to force tool selection
C) Split tools across specialist subagents with 4-5 tools each, coordinated by a hub agent
D) Increase max_tokens to give Claude more room to reason about tool selection

Q2: A tool call to an external API returns an HTTP 429 (rate limited). What should the tool result contain?

A) {"content": "Operation failed", "is_error": true}
B) {"content": ""} (empty string, let Claude figure it out)
C) {"content": "{\"is_error\": true, \"errorCategory\": \"rate_limit\", \"isRetryable\": true, \"retryAfterSeconds\": 30}", "is_error": true}
D) Throw an exception and let the SDK handle it

Q3: In the tool use lifecycle, which component executes the actual tool operation?

A) Claude's inference engine processes the tool call internally
B) The Anthropic API server executes the tool and returns results
C) Your application code receives the tool_use request and executes the operation
D) The MCP server automatically intercepts and runs the tool

Answers

Q1: C — The 4-5 tool rule. When tool count exceeds this threshold, selection accuracy degrades. The architectural fix is hub-and-spoke with specialist subagents. Adding descriptions (A) helps but doesn't solve overload. Forcing selection (B) doesn't fix the underlying confusion. Max tokens (D) is irrelevant to tool selection logic.

Q2: C — Structured error with category, retryability, and timing. Option A is the classic anti-pattern (generic message). Option B silently suppresses the error. Option D bypasses Claude's ability to reason about recovery.

Q3: C — YOUR code executes the tool. Claude only produces a tool_use content block (a request). The API (B) doesn't execute custom tools. Claude's engine (A) can't run external operations. MCP servers (D) only apply in MCP context, not the base tool use lifecycle.

👀 Tomorrow's Preview

Day 7 goes deeper on structured error responses — you'll learn the full error taxonomy, how to design error schemas that enable Claude to self-recover, and the critical distinction between "no data found" vs. "access failed" that the exam loves to test.