CCA-F Study Day 12/20: CI/CD Integration & Batch Processing

Domain 3: Claude Code Configuration & Workflows (~20% of exam)


📌 Today's Focus

Yesterday you explored custom commands, skills, and plan mode. Today we tackle the production deployment side of Claude Code: how to integrate it into CI/CD pipelines, run it headlessly, process work in batches, and implement multi-pass review patterns with session isolation. This maps directly to Exam Scenario 5: "Claude Code for CI/CD", which specifically tests -p flag usage, structured output, batch API, and multi-pass review. If this scenario appears on your exam, today's material is your lifeline.

📚 Core Concepts

1. The -p Flag: Non-Interactive (Headless) Mode

The -p (or --print) flag is the gateway to CI/CD integration. It runs Claude Code in non-interactive mode: it processes a prompt, writes the result to stdout, and exits. No user interaction, no approval prompts, no interactive UI.

# Basic non-interactive usage
claude -p "Review this PR for security issues" --output-format json

# Pipe content in
cat error.log | claude -p "Diagnose the root cause of these errors"

# With structured JSON output for downstream parsing
claude -p "Analyze test coverage" --output-format json > analysis.json

Key flags for CI/CD:

  • -p / --print: Non-interactive mode (required for CI/CD)
  • --output-format json: Structured JSON output for pipeline parsing
  • --output-format stream-json: Streaming JSON for real-time processing
  • --allowedTools: Restrict which tools the agent can use (security!)
  • --model: Specify which Claude model to use
  • --max-turns: Limit iteration count in pipelines

Why this matters: The exam will present scenarios where you need to choose between interactive and non-interactive modes. The answer is ALWAYS -p for CI/CD; interactive mode in automation is an anti-pattern.
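
The flags listed above compose into a single command line. As a minimal sketch (the helper name build_claude_command and its parameters are illustrative and not part of Claude Code), a pipeline script might assemble the argument list like this before passing it to subprocess.run:

```python
from typing import Optional, Sequence


def build_claude_command(
    prompt: str,
    output_format: str = "json",
    allowed_tools: Optional[Sequence[str]] = None,
    model: Optional[str] = None,
    max_turns: Optional[int] = None,
) -> list[str]:
    """Assemble a non-interactive `claude` invocation as an argv list."""
    # -p is what makes the run non-interactive.
    cmd = ["claude", "-p", prompt, "--output-format", output_format]
    if allowed_tools:
        # Restrict the agent to an explicit tool list.
        cmd += ["--allowedTools", ",".join(allowed_tools)]
    if model:
        cmd += ["--model", model]
    if max_turns is not None:
        cmd += ["--max-turns", str(max_turns)]
    return cmd
```

Building an argv list rather than a shell string avoids quoting problems when the prompt itself contains quotes or newlines.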

2. CI/CD Pipeline Patterns

Pattern A: GitHub Actions PR Review

# .github/workflows/ai-review.yml
name: AI Code Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Claude Code
        # The claude CLI is not preinstalled; this assumes Node.js/npm is available on the runner
        run: npm install -g @anthropic-ai/claude-code
      - name: AI Security Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          claude -p "Review the changes in this PR. Focus on security, 
          performance, and code quality. Output as structured JSON with 
          fields: issues (array), severity (high/medium/low), suggestion." \
          --output-format json > review.json
      
      - name: Parse and Comment
        run: |
          # Parse structured output and post as PR comment
          python scripts/post_review.py review.json

Pattern B: Fan-Out Migration (Parallelism)

# Process multiple files in parallel with isolated agents
find src -name "*.js" | xargs -P 4 -I {} claude -p "Convert {} to TypeScript"

# Each file gets its own Claude instance (session isolation!)
# -P 4 = 4 parallel workers
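
The same fan-out can be driven from Python when the pipeline needs to collect each file's output. The following is a rough sketch, not a reference implementation: fan_out and convert_file are hypothetical names, and the worker is injectable so the orchestration can be exercised without invoking the CLI.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def convert_file(path: str) -> str:
    """Run one isolated `claude -p` process for a single file."""
    done = subprocess.run(
        ["claude", "-p", f"Convert {path} to TypeScript"],
        capture_output=True, text=True, check=True,
    )
    return done.stdout


def fan_out(paths: Iterable[str],
            worker: Callable[[str], str] = convert_file,
            max_workers: int = 4) -> dict[str, str]:
    """Process files in parallel; results are keyed by path."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(paths, pool.map(worker, paths)))
```

Each worker call starts its own process, so the session isolation of the xargs version is preserved; max_workers plays the role of -P 4.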

Pattern C: Pipeline Chaining

# Sequential pipeline: output of one becomes input of next
cat data.json | claude -p "Extract entities" | claude -p "Validate and enrich"

# Each pipe creates a SEPARATE session (critical for exam!)
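
For pipelines with more than two stages, the chaining can be expressed as a loop. This is a sketch under the assumption that each stage reads its input on stdin, as in the shell example; chain and run_stage are illustrative names.

```python
import subprocess
from typing import Callable, Sequence


def run_stage(prompt: str, stdin_text: str) -> str:
    """One pipeline stage = one fresh `claude -p` process."""
    done = subprocess.run(
        ["claude", "-p", prompt],
        input=stdin_text, capture_output=True, text=True, check=True,
    )
    return done.stdout


def chain(prompts: Sequence[str], initial_input: str,
          stage: Callable[[str, str], str] = run_stage) -> str:
    """Feed each stage's stdout to the next stage's stdin."""
    data = initial_input
    for prompt in prompts:
        # Only the text output crosses the boundary between stages.
        data = stage(prompt, data)
    return data
```

Because only text is passed forward, no stage can see the context of the stage before it, which is the property the exam cares about.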

Pattern D: Pre-commit Hooks

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: ai-lint
        name: AI Code Lint
        entry: claude -p "Check this file for common bugs and style issues. Report only critical issues." --output-format json
        language: system
        types: [python]

3. Session Isolation: The Generator-Reviewer Pattern in CI

This is a critical exam concept that combines Domain 1 (multi-agent patterns) with Domain 3 (CI/CD). The exam WILL test whether you understand why generator and reviewer must be in separate sessions.

# ❌ ANTI-PATTERN: Same session reviews its own output
claude -p "Write a function to parse CSV, then review it for bugs"
# The reviewer has the generator's reasoning context = BIAS

# ✅ CORRECT: Separate sessions for generator and reviewer
# Step 1: Generate (Session A)
claude -p "Write a function to parse CSV files. Output only code." \
  --output-format json > generated_code.json

# Step 2: Review (Session B, completely fresh context!)
cat generated_code.json | claude -p "Review this code for bugs, 
  security issues, and edge cases. You did NOT write this code." \
  --output-format json > review.json

# Step 3: Parse results
python ci/parse_review.py review.json

Why separate sessions? When the same session generates AND reviews, the model has access to its own reasoning chain, so it is biased toward confirming its own decisions ("reasoning context bias"). A fresh session sees only the output (no justifications, no "I chose X because Y" context), so it evaluates the code on its merits.
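
One way to keep the boundary clean is to forward only the generated artifact to the reviewer. The sketch below assumes the generator's JSON output carries the final text in a result field (as the complete example later in this guide does); build_review_prompt and the extra keys in the usage note are hypothetical.

```python
REVIEW_INSTRUCTIONS = (
    "Review this code for bugs, security issues, and edge cases. "
    "You did NOT write this code."
)


def build_review_prompt(generation: dict) -> str:
    """Return a reviewer prompt that contains only the generated artifact."""
    # Only the final output crosses the session boundary. Session metadata
    # and any other fields on the generator's response are dropped.
    code = generation["result"].strip()
    return f"{REVIEW_INSTRUCTIONS}\n\n{code}"
```

For example, calling it with {"result": "def f(): ...", "session_id": "abc"} yields a prompt that mentions the function but not the session id.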

4. The Message Batches API

The Message Batches API processes large volumes of requests asynchronously at a 50% cost reduction. This is different from the -p flag: it's a server-side API for bulk processing.

import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()

# Create a batch of requests.
# Note: the API cannot read local files, so real prompts must include the file contents inline.
message_batch = client.messages.batches.create(
    requests=[
        Request(
            custom_id="review-file-1",
            params=MessageCreateParamsNonStreaming(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[{"role": "user", "content": "Review auth.py for security issues"}],
            ),
        ),
        Request(
            custom_id="review-file-2",
            params=MessageCreateParamsNonStreaming(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[{"role": "user", "content": "Review db.py for SQL injection"}],
            ),
        ),
        # ... up to 100,000 requests per batch!
    ]
)

print(f"Batch ID: {message_batch.id}")
print(f"Status: {message_batch.processing_status}")  # "in_progress"

Batch API Key Facts (exam-relevant):

  • 50% cost reduction vs standard API pricing
  • Up to 100,000 requests per batch (or 256 MB, whichever comes first)
  • Most batches complete in under 1 hour
  • Results available for 29 days after creation
  • Each request is processed independently (isolation!)
  • Supports vision, tool use, system messages, multi-turn conversations
  • Does NOT support: streaming, fast mode, threads, max_tokens=0
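
When a job exceeds the per-batch limits listed above, the requests have to be split across several batches. A minimal sketch, assuming requests are plain dicts and approximating payload size by their JSON encoding (chunk_requests is an illustrative helper, not an SDK function):

```python
import json
from typing import Iterator

MAX_REQUESTS = 100_000            # per-batch request limit from the list above
MAX_BYTES = 256 * 1024 * 1024     # per-batch size limit from the list above


def chunk_requests(requests: list[dict],
                   max_requests: int = MAX_REQUESTS,
                   max_bytes: int = MAX_BYTES) -> Iterator[list[dict]]:
    """Yield groups of requests that each stay within both batch limits."""
    chunk: list[dict] = []
    chunk_bytes = 0
    for request in requests:
        # Approximation: size of the request when serialized as JSON.
        size = len(json.dumps(request).encode("utf-8"))
        if chunk and (len(chunk) >= max_requests or chunk_bytes + size > max_bytes):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(request)
        chunk_bytes += size
    if chunk:
        yield chunk
```

Each yielded chunk would then be submitted as its own batch; the size estimate is conservative guidance only, since the exact accounting is done server-side.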

Polling for Results:

# Poll until complete
import time

while True:
    batch = client.messages.batches.retrieve(message_batch.id)
    if batch.processing_status == "ended":
        break
    time.sleep(10)

# Retrieve results
for result in client.messages.batches.results(message_batch.id):
    if result.result.type == "succeeded":
        print(f"{result.custom_id}: {result.result.message.content[0].text}")
    elif result.result.type == "errored":
        print(f"{result.custom_id}: ERROR - {result.result.error}")
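
The bare while True loop above never gives up. A slightly more defensive variant adds a deadline; this is a sketch in which wait_for_batch is a hypothetical helper and the retrieve callable, sleep, and clock are injected so the loop can be tested without network access.

```python
import time
from typing import Any, Callable


def wait_for_batch(retrieve: Callable[[str], Any], batch_id: str,
                   poll_seconds: float = 10.0,
                   timeout_seconds: float = 24 * 3600,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> Any:
    """Poll until processing_status is "ended" or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        batch = retrieve(batch_id)
        if batch.processing_status == "ended":
            return batch
        if clock() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {batch.processing_status}")
        sleep(poll_seconds)
```

With the client from the earlier example, a call would look like wait_for_batch(client.messages.batches.retrieve, message_batch.id).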

5. Multi-Pass Code Review Pattern

This combines batch processing with session isolation for comprehensive code review:

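# Multi-pass review: each pass is a SEPARATE session with a DIFFERENT focus
import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()

passes = [
    {"id": "security", "prompt": "Review ONLY for security vulnerabilities: injection, auth bypass, data exposure."},
    {"id": "performance", "prompt": "Review ONLY for performance issues: O(n²) algorithms, memory leaks, unnecessary I/O."},
    {"id": "correctness", "prompt": "Review ONLY for logic errors: off-by-one, null handling, race conditions."},
]

# Read the diff once; every pass reviews the same code
with open("pr_diff.patch") as f:
    code = f.read()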

requests = []
for p in passes:
    requests.append(Request(
        custom_id=p["id"],
        params=MessageCreateParamsNonStreaming(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            messages=[{"role": "user", "content": f"{p['prompt']}\n\n```\n{code}\n```"}],
        ),
    ))

# Submit as batch (50% cost savings!)
batch = client.messages.batches.create(requests=requests)
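
Once the batch has ended, the per-pass reviews need to be matched back to their focus areas. The sketch below assumes result entries shaped like those in the polling example (custom_id, result.type, result.message.content[0].text); collect_pass_reports is an illustrative name.

```python
from typing import Any, Iterable


def collect_pass_reports(results: Iterable[Any]) -> dict[str, str]:
    """Map each pass id (custom_id) to its review text or a failure note."""
    reports: dict[str, str] = {}
    for entry in results:
        if entry.result.type == "succeeded":
            reports[entry.custom_id] = entry.result.message.content[0].text
        else:
            # Keep failed passes visible instead of silently dropping them.
            reports[entry.custom_id] = f"[no review: {entry.result.type}]"
    return reports
```

Keying on custom_id rather than position matters because results should not be assumed to arrive in submission order.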

🚫 Anti-Patterns & Exam Traps

| ❌ Anti-Pattern | ✅ Correct Approach | Why It's Wrong |
|---|---|---|
| Using interactive mode in CI/CD | Use -p flag for non-interactive execution | Interactive mode blocks on user approval prompts, so the pipeline hangs |
| Same-session self-review | Separate sessions for generator and reviewer | Reasoning context bias: the model confirms its own decisions |
| Processing files sequentially when order doesn't matter | Fan-out with xargs -P or Batch API | Wastes time; parallel processing is faster and each file is independent |
| Parsing natural language output in CI | Use --output-format json for structured output | Natural language is ambiguous and breaks pipeline parsing |
| Running without --allowedTools in CI | Explicitly restrict tools in automated environments | Security risk: the agent could execute arbitrary commands |
| Using Batch API for time-sensitive operations | Use synchronous API for immediate responses | Batch API is async, with no guaranteed completion time (up to 24h) |
| Streaming in batch requests | Use non-streaming params in batch | Batch results come as a file, not a stream; stream: true is not supported |

💻 Code Example: Complete CI/CD Pipeline

#!/usr/bin/env python3
"""Complete CI/CD integration: generate, review (separate session), gate."""
import json
import subprocess
import sys

def run_claude(prompt: str, allowed_tools: list[str] | None = None) -> dict:
    """Run Claude Code in non-interactive mode, return structured output."""
    cmd = ["claude", "-p", prompt, "--output-format", "json"]
    if allowed_tools:
        cmd.extend(["--allowedTools", ",".join(allowed_tools)])

    # check=True raises CalledProcessError if claude exits non-zero, so a
    # failed run stops the pipeline instead of surfacing as a JSON parse error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

def main():
    # STEP 1: Generate (Session A)
    generation = run_claude(
        "Implement the function described in TODO.md. Output only the code.",
        allowed_tools=["Read", "Write"]
    )
    
    # STEP 2: Review (Session B: separate session = no reasoning bias!)
    review = run_claude(
        f"Review this code for bugs and security issues. "
        f"Return JSON with fields: pass (bool), issues (array of strings).\n\n"
        f"Code:\n{generation['result']}",
        allowed_tools=["Read"]  # Read-only for reviewer!
    )
    
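    # STEP 3: Gate the pipeline (fail closed if the verdict is not valid JSON)
    try:
        review_result = json.loads(review["result"])
    except json.JSONDecodeError:
        review_result = {"pass": False, "issues": ["Reviewer did not return valid JSON"]}
    if not review_result.get("pass", False):
        print("❌ Code review FAILED:")
        for issue in review_result.get("issues", []):
            print(f"  - {issue}")
        sys.exit(1)

    print("✅ Code review PASSED")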
    sys.exit(0)

if __name__ == "__main__":
    main()
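
The gate above depends on the reviewer returning bare JSON, but a model reply may arrive wrapped in a Markdown code fence. A hedged sketch of a more tolerant parser follows; parse_verdict is a hypothetical helper, and the pass/issues field names are the ones requested in the review prompt.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker, built programmatically


def parse_verdict(text: str) -> dict:
    """Parse the reviewer's reply into a dict, failing closed on bad input."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag)
        # and everything from the closing fence onward.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    try:
        verdict = json.loads(cleaned)
    except json.JSONDecodeError:
        return {"pass": False, "issues": ["reviewer output was not valid JSON"]}
    if not isinstance(verdict, dict):
        return {"pass": False, "issues": ["reviewer output was not a JSON object"]}
    return verdict
```

Failing closed means an unparseable review blocks the pipeline rather than letting unreviewed code through.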

🎬 Video to Watch

Claude Code Advanced Patterns: Subagents, MCP, and Scaling to Real Codebases (Anthropic Webinar, March 2026)

This official Anthropic webinar covers the patterns that teams use to ship with Claude Code daily, including CI/CD integration, subagent orchestration, and scaling patterns. Pay special attention to the sections on headless mode deployment and multi-pass review with session isolation. This directly maps to today's exam material.

📖 Reading

🛠️ Hands-On Exercise (20-30 minutes)

Build a 2-stage CI pipeline script:

  1. Create a simple Python file with an intentional bug (e.g., off-by-one error in a loop)
  2. Write a shell script that: 
    • Stage 1: Uses claude -p to generate a fix (Session A)
    • Stage 2: Pipes the fix to a SEPARATE claude -p invocation for review (Session B)
    • Parses the JSON output to determine pass/fail
    • Exits with code 0 (pass) or 1 (fail)
  3. Bonus: Write a GitHub Actions YAML that wraps your script

The key learning: you'll see how session separation works in practice. Each claude -p invocation is a fresh session with no shared memory.

📝 Quick Quiz

Q1: A team wants to use Claude Code to review PRs automatically in their CI pipeline. Which approach is correct?

A) Run claude interactively and pipe "yes" to approval prompts
B) Use claude -p "Review this PR" --output-format json with --allowedTools Read
C) Use the Batch API to submit the PR for review
D) Run Claude Code with --dangerously-skip-permissions for faster execution

Q2: Why must the generator and reviewer run in separate sessions in a CI pipeline?

A) Because Claude Code doesn't support multiple tool calls per session
B) To reduce token costs by splitting the context window
C) To avoid reasoning context bias: the reviewer would see the generator's justifications
D) Because the Batch API requires separate custom_ids for each request

Q3: A company needs to review 5,000 code files for security vulnerabilities. The review is not time-sensitive and can run overnight. What's the BEST approach?

A) Use xargs -P 100 with claude -p for maximum parallelism
B) Submit all 5,000 as a Message Batch for 50% cost savings
C) Run them sequentially to avoid rate limits
D) Use interactive mode with a script that auto-approves


Answers:

Q1: B. The -p flag is the correct approach for CI/CD. Option A is hacky and fragile. Option C (Batch API) is for async bulk processing, not real-time PR reviews. Option D is a security nightmare in production CI.

Q2: C. Reasoning context bias is the key issue. When the reviewer sees the generator's "thinking" (why it chose certain approaches), it's biased toward agreement. A fresh session evaluates the code on its merits alone.

Q3: B. The Message Batches API is designed exactly for this: large volume, not time-sensitive, cost-optimized (50% off). Option A would hit rate limits at 100 concurrent. Option C is wasteful. Option D is an anti-pattern.

👀 Tomorrow's Preview

Day 13 dives into Permissions, Settings & Hooks in Claude Code: the settings.json configuration system, permission models (allowedTools/disallowedTools), and how hooks work specifically in the Claude Code context (pre/post command hooks for iterative refinement). This rounds out Domain 3 before we move to prompt engineering.