Capstone IT Engineering Series — Part 6

Beyond Security: Adapting Agentic Workflows for Any Domain

The security review pattern generalizes. Here's how to apply it to testing, API review, migrations, performance—and anything else.

1. Abstracting the Pattern

In Parts 2–5 of this series, we built a complete security review workflow: self-scaffolding checklists, orchestrated sub-agents, automated remediation, and a live demonstration. But the methodology's power doesn't come from security-specific knowledge—it comes from structural principles that are domain-independent.

The series Introduction cataloged sixteen failure modes that affect agentic workflows. The five structural principles extracted in this article are the ones identified there as directly addressing five of those modes: context decay, completeness gaps, and no audit trail, plus, through adversarial validation, aspects of behavioral drift and role confusion. Generalizing the pattern to new domains means generalizing these structural defenses: every adaptation in this article inherits the same failure mode coverage. The domain changes; the defenses don't.

This document extracts those principles and re-instantiates them for four additional domains: automated test generation, API code review, database migration review, and performance auditing. Each adaptation includes complete sub-agent definitions you can install directly into Claude Code.


The Problem the Pattern Solves

Agentic AI tools are powerful but suffer from predictable failure modes when given open-ended tasks:

- Context decay: workflow state evaporates as the context window fills
- Completeness gaps: obvious items get checked while the rest are silently skipped
- No audit trail: no persistent record of what was actually examined
- Behavioral drift: the agent's approach shifts over a long session
- Role confusion: one agent plans, executes, and judges its own work

These failures aren't security-specific. They appear in any domain where you ask an AI to do systematic, thorough work. They are five of the sixteen failure modes cataloged in the series Introduction—the remaining eleven require additional structural defenses covered in later parts of the series. The pattern solves these five with externalized state and decomposed, constrained sub-agents.

Key Insight

If your task has a checklist, a validator, and a "did I actually do everything?" audit step, this pattern fits. The domain only changes what the checklist contains and what the scanners look for—the structural mechanics are identical.

2. Five Structural Principles

Every adaptation in this document—and any new domain you build—rests on these five principles. They are the invariant core of the pattern.

Principle 1
Externalize knowledge into files, not conversation

The agent's context window is volatile. Files are persistent. Every piece of workflow state—checklists, plans, findings, progress—lives in a file that survives context window limits and can be audited after the fact.
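As a rough sketch of this principle, updating progress means editing a file on disk, not relying on conversational memory. The file name and checkbox convention here are assumptions, matching the plan-file layout shown later in this article:

```python
from pathlib import Path

def mark_done(plan_path: str, item_id: str) -> None:
    """Flip a checklist item from '- [ ]' to '- [x]' in the plan file."""
    plan = Path(plan_path)
    lines = plan.read_text().splitlines()
    updated = [
        line.replace("- [ ]", "- [x]", 1) if item_id in line else line
        for line in lines
    ]
    # State lives on disk: it survives context resets and is auditable later.
    plan.write_text("\n".join(updated) + "\n")
```

Because the plan file is the source of truth, a fresh agent session can resume exactly where the last one stopped.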

Principle 2
Generate domain-specific checklists with explicit completion criteria

Don't rely on the agent to "know what to check." Every item specifies what to do, what evidence to collect, and how to know when you're done. Verbosity scales to risk: minimal for low-stakes, comprehensive (full runbook) for critical work.
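For instance, a single checklist item under this principle might look like the following. This is a hypothetical item; the field names are illustrative, not taken from the actual templates:

```markdown
## AUTH-03: Verify session expiry is enforced
- What to do: Trace every session-creation path and confirm a TTL is set.
- Evidence to collect: File:line of each TTL assignment; config values used.
- Done when: All session paths have an explicit TTL, or the item is
  marked N/A with justification.
```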

Principle 3
Adversarially validate before executing

A single-pass checklist has blind spots. A second pass—explicitly prompted to find gaps, forbidden from approving—catches what the first missed. This is a structural control, not a suggestion.

Principle 4
Execute systematically with progress tracking

Convert the checklist into an ordered execution plan. Update status as you work. Document every finding with evidence. Mark items N/A with justification rather than silently skipping.

Principle 5
Self-audit for completeness

After execution, a separate pass compares what was planned against what was done. Flags incomplete items, unjustified skips, findings without evidence, and orphan results that don't trace to a checklist item.
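The audit itself is mechanical enough to sketch in a few lines. Assuming plan items and findings carry matching IDs (an assumption consistent with the ID conventions the validators below use), the comparison reduces to set logic:

```python
def audit(planned_ids: set[str], finding_ids: set[str],
          na_justified: set[str]) -> dict[str, list[str]]:
    """Compare what was planned against what was actually documented."""
    # Items that were neither completed nor explicitly marked N/A.
    incomplete = sorted(planned_ids - finding_ids - na_justified)
    # Results that don't trace back to any checklist item.
    orphans = sorted(finding_ids - planned_ids)
    return {"incomplete": incomplete, "orphans": orphans}
```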

3. The Seven Sub-Agent Archetypes

The security workflow uses five review sub-agents plus two remediation sub-agents. These abstract into seven archetypal roles that map across domains:

| Archetype | Security Instance | Purpose | Key Constraint |
|---|---|---|---|
| Scanner | security-scanner | Fast, thorough exploration of the target | Read-only; cannot modify anything |
| Validator | checklist-validator | Adversarial gap-finding in plans/checklists | Must find problems; forbidden from approving |
| Writer | findings-writer | Consistent documentation of results | Enforces template structure |
| Auditor | self-auditor | Completeness verification | Compares plan vs. actual work |
| Triage Specialist | sast-triage | Classifies external tool output | Read-only; TRUE/FALSE/INVESTIGATE |
| Implementer | fix-implementer | Makes targeted changes based on findings | Write access, scoped to one finding |
| Verifier | fix-verifier | Confirms changes actually resolve the issue | Read-only re-verification |

The Generalized Workflow Sequence

Phase 1: PLAN
  Main Agent → Generate checklist from template + domain context
  Main Agent → Validator sub-agent → Return gap analysis
  Main Agent → Merge gaps → Create execution plan

Phase 2: EXECUTE  
  Main Agent → Scanner sub-agent(s) → Return raw results (per category)
  Main Agent → Writer sub-agent → Format results into findings document

Phase 3: AUDIT
  Main Agent → Auditor sub-agent → Return completeness report
  Main Agent → Address gaps

Phase 4: IMPLEMENT (if applicable)
  Human checkpoint → Approve/defer/reject findings
  Main Agent → Implementer sub-agent → Make changes (one at a time)
  Main Agent → Verifier sub-agent → Confirm changes work

Optional parallel branch:
  External tool output → Triage Specialist → Findings document

Not Every Domain Needs All Seven

The scanner, validator, and auditor are essential for any review workflow. The writer adds value when findings need standardized documentation. The triage specialist is only needed when external tool output exists. The implementer and verifier are only needed when the workflow includes an action phase. Start with the core four and add as needed.

The Generalized File Structure

project/
├── templates/
│   ├── CHECKLIST_TEMPLATE.md    # Domain-specific checklist format
│   ├── PLAN_TEMPLATE.md         # Execution tracking structure  
│   └── FINDINGS_TEMPLATE.md     # How to document results
├── review-[date]/               # Per-review instance
│   ├── checklist.md             # Generated checklist
│   ├── plan.md                  # Execution plan + progress tracking
│   ├── findings.md              # Documented results
│   └── remediation-log.md       # Implementation tracking (if applicable)

4. Domain: Automated Test Generation

Adaptation 1 — Testing

The Goal

Systematically identify test gaps in a codebase and generate comprehensive test suites—unit tests, integration tests, edge cases—with coverage tracking and quality validation.

Why the Pattern Fits

Test generation suffers from the same failure modes as security review: without an explicit checklist, agents write tests for the obvious happy paths and miss edge cases, error handling, boundary conditions, and integration points. A single agent asked to "write tests for this codebase" will produce inconsistent coverage with no way to verify completeness.

Sub-Agents

test-scanner
Model: claude-haiku • Read-only
Tools: Read, Grep, Glob, Bash (read-only)

Explores the codebase to identify what needs testing. Maps functions, classes, API endpoints, error paths, and integration points. Reports testable units with dependencies and complexity.

Full definition:
---
name: test-scanner
description: "Explore codebase to identify testable units, untested paths, 
and coverage gaps. Read-only analysis."
tools: Read, Grep, Glob, Bash
model: haiku
---
You are a test coverage analyst. Your job is to systematically catalog 
what needs testing WITHOUT writing any tests.

When given a checklist category, you:
1. Identify all functions/methods/endpoints in scope
2. Classify each as: unit-testable, integration-testable, or both
3. Note dependencies that need mocking
4. Identify edge cases from type signatures, validation logic, and error handling
5. Check existing test files for what's already covered
6. Report untested paths with exact file locations

For each testable unit, report:
- File path and line number
- Function/method signature
- Current test coverage (tested/untested/partial)
- Recommended test types (unit/integration/e2e)
- Edge cases and boundary conditions to cover
- Dependencies that need mocking or stubbing

You NEVER write tests. You ONLY analyze and report.

test-validator
Model: claude-sonnet • Read-only • Adversarial
Tools: Read

Reviews the test checklist for gaps. Checks for missing edge cases, untested error paths, missing integration scenarios, and boundary conditions. Must find problems.

Full definition:
---
name: test-validator
description: "Adversarial reviewer that finds gaps in test checklists. 
Will not approve without finding issues."
tools: Read
model: sonnet
---
You are a hostile test coverage reviewer. Your job is to find gaps.

CRITICAL RULES:
1. You MUST identify at least 5 missing test scenarios or categories
2. You are FORBIDDEN from saying "good coverage" or "looks thorough"
3. Check for: error paths, boundary conditions, null/empty inputs, 
   concurrency, race conditions, resource cleanup, timeout handling
4. Consider: integration boundaries, external service failures, 
   data corruption scenarios, permission edge cases
5. Verify negative tests exist (things that SHOULD fail)

For each gap found, provide:
- Category (unit/integration/edge-case/error-path/boundary)
- Item ID (format [TEST-VAL-XX])
- Specific test scenario description
- Why this was likely missed (common blind spot? framework-specific?)
- Risk if untested (what bug would this catch?)

Rate the original checklist:
- Coverage score (1-10)
- Most critical missing category
- Recommendations for improvement

test-writer
Model: claude-sonnet
Tools: Read, Write, Bash

Generates actual test code following project conventions. Each test includes setup, execution, assertion, and cleanup. Runs tests to verify they pass.

Full definition:
---
name: test-writer
description: "Writes test code following project conventions. 
Generates tests from scanner findings."
tools: Read, Write, Bash
model: sonnet
---
You are a test implementation specialist. Your job is to write 
clear, maintainable tests.

For EVERY test you write:

1. EXAMINE the target code thoroughly
   - Read the function/method being tested
   - Understand all code paths and branches
   - Identify dependencies to mock

2. FOLLOW project conventions
   - Use the existing test framework (detect from package.json/requirements/etc.)
   - Match naming patterns from existing tests
   - Use existing test utilities and helpers

3. WRITE the test
   - Descriptive test name explaining the scenario
   - Arrange: Set up preconditions and mocks
   - Act: Execute the code under test
   - Assert: Verify expected outcomes
   - Cleanup: Reset state if needed

4. RUN the test
   - Execute the test to verify it passes
   - If it fails, determine if it's a real bug or test error
   - Fix test errors; report real bugs as findings

5. DOCUMENT in findings.md
   - Test file path
   - Scenarios covered
   - Any bugs discovered
   - Any items that couldn't be tested (with reason)

Write tests to the project's test directory following existing structure.

test-auditor
Model: claude-haiku • Read-only + Bash
Tools: Read, Bash

Compares the test checklist against tests actually written. Runs coverage tools if available. Reports gaps between planned and actual coverage.

Full definition:
---
name: test-auditor
description: "Audits test completeness by comparing checklist against 
tests written and coverage metrics."
tools: Read, Bash
model: haiku
---
You are a test completeness auditor. Your job is to verify that 
planned tests were actually written and that they pass.

Read these files:
1. checklist.md - The test coverage checklist
2. plan.md - The execution plan with status
3. findings.md - Documentation of tests written

Then:
1. Run the test suite and capture results
2. Run coverage tools if available (pytest-cov, istanbul, etc.)
3. Compare checklist items against actual test files

Produce an AUDIT REPORT with:
- Total checklist items vs tests written
- Tests passing vs failing
- Coverage metrics (if available)
- Checklist items with no corresponding test
- Tests that don't trace to a checklist item
- Skipped items without justification
- Recommendations for remaining gaps

Orchestration Prompt

Conduct a comprehensive test generation review of this codebase using 
the subagent workflow:

## External Tool Branch (run in parallel if coverage report exists)

If [COVERAGE_REPORT] exists:
1. Analyze coverage report to identify untested files and functions
2. Use test-writer to generate tests for uncovered critical paths
3. Document in findings.md with source "Coverage Gap Analysis"

## Test Generation Branch (main workflow)

1. Read templates/CHECKLIST_TEMPLATE.md and generate checklist.md
   for a [STACK DESCRIPTION] covering:
   - Unit tests for business logic
   - Integration tests for API endpoints / service boundaries
   - Edge cases and error handling
   - Input validation and boundary conditions
   - Async/concurrent behavior (if applicable)

2. Use the test-validator subagent to review checklist.md
   - Merge valid gaps with [VALIDATION] tag

3. Create plan.md ordering by:
   - Core business logic first
   - Integration points second
   - Edge cases and error paths third

4. For each category, use the test-scanner subagent to analyze 
   the codebase and identify specific testable units

5. For each set of scanner results, use the test-writer subagent
   to implement tests

6. Update plan.md status as each category completes

7. Use the test-auditor subagent to verify completeness and 
   run the full test suite

8. Address gaps identified by the auditor

Target: [PROJECT PATH]
Stack: [TECHNOLOGY STACK]
Test Framework: [FRAMEWORK - e.g., pytest, jest, junit]

"When to Stop" Conditions

| Condition | Example | Action |
|---|---|---|
| No mock available | External payment API | Defer with mock strategy recommendation |
| Non-deterministic behavior | Timing, randomness | Flag for manual test design |
| UI/visual behavior | CSS rendering, animations | Outside scope of unit/integration tests |
| Requires production data | Large dataset edge cases | Flag as needing fixture strategy |
| Flaky test | Passes intermittently | Report rather than commit |
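The flaky-test condition can be made mechanical: run each new test several times before committing it and classify the outcome. A minimal sketch of that classification (the run count and test runner are left to you):

```python
def classify_runs(results: list[bool]) -> str:
    """Classify repeated runs of a single test as stable or flaky."""
    if all(results):
        return "stable-pass"   # safe to commit
    if not any(results):
        return "stable-fail"   # real bug or broken test: investigate
    return "flaky"             # intermittent: report it, don't commit it
```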

5. Domain: API Code Review

Adaptation 2 — API Review

The Goal

Systematically review API design and implementation for consistency, correctness, standards compliance, and usability—covering REST conventions, error handling, authentication patterns, versioning, documentation accuracy, and contract compliance.

Why the Pattern Fits

API review involves cross-cutting concerns (naming, status codes, error formats, auth patterns) that must be checked consistently across every endpoint. Without a checklist, reviews focus on a few endpoints while missing systemic inconsistencies. The adversarial validator catches missing categories like pagination edge cases, rate limiting, and CORS configuration.
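Once the scanner has cataloged each endpoint's conventions, the consistency check itself becomes mechanical. A sketch; the convention labels here are hypothetical scanner output, and a real check would run this per convention type (naming, status codes, error format, auth):

```python
from collections import Counter

def flag_outliers(observed: dict[str, str]) -> list[str]:
    """Flag endpoints that deviate from the dominant convention.

    `observed` maps 'METHOD /path' to the convention seen there
    (an error format, a naming style, a status-code choice, ...).
    """
    # The most common value across endpoints is treated as the convention.
    dominant, _ = Counter(observed.values()).most_common(1)[0]
    return sorted(ep for ep, conv in observed.items() if conv != dominant)
```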

Sub-Agents

api-scanner
Model: claude-haiku • Read-only
Tools: Read, Grep, Glob, Bash (read-only)

Catalogs endpoints, extracts patterns, compares against specs, and identifies inconsistencies across the API surface. Cannot modify any files.

Full definition:
---
name: api-scanner
description: "Explore API codebase to catalog endpoints, patterns, 
and inconsistencies. Read-only analysis."
tools: Read, Grep, Glob, Bash
model: haiku
---
You are an API analysis specialist. Systematically catalog API 
characteristics WITHOUT making any changes.

When given a checklist category, you:
1. Find all route/endpoint definitions
2. Extract HTTP methods, paths, request/response schemas
3. Compare against OpenAPI/Swagger spec if available
4. Identify pattern inconsistencies across endpoints
5. Check authentication/authorization middleware usage
6. Note error response formats and status code usage

For each finding, report:
- Endpoint (method + path)
- File path and line number
- Pattern observed vs expected convention
- Inconsistency type (naming/status-code/error-format/auth/versioning)
- Severity (convention-violation/correctness-issue/breaking-change)

api-validator
Model: claude-sonnet • Read-only • Adversarial
Tools: Read

Adversarial reviewer for API checklists. Checks for missing categories like pagination, rate limiting, CORS, content negotiation, idempotency, and deprecation strategy.

Full definition:
---
name: api-validator
description: "Adversarial reviewer for API review checklists. 
Finds missing review categories."
tools: Read
model: sonnet
---
You are a hostile API design reviewer. Find gaps in the review checklist.

CRITICAL RULES:
1. You MUST identify at least 5 missing items
2. You are FORBIDDEN from saying "looks comprehensive"
3. Check for: pagination, rate limiting, CORS, content negotiation,
   idempotency, caching headers, HATEOAS (if relevant), webhook design
4. Consider: backward compatibility, deprecation strategy, 
   API versioning approach, bulk operations, partial responses
5. Check documentation accuracy against actual implementation

For each gap:
- Category and item ID ([API-VAL-XX])
- Specific review item
- Why commonly missed
- Impact if unreviewed

api-findings-writer
Model: claude-sonnet
Tools: Read, Write

Documents API review findings with categories (CONSISTENCY, CORRECTNESS, STANDARDS, SECURITY, USABILITY, DOCS) and RFC/spec references.

Full definition:
---
name: api-findings-writer
description: "Documents API review findings in standardized format."
tools: Read, Write
model: sonnet
---
You are an API review documentation specialist.

For EVERY finding, include:

## Finding: [ID] - [Title]

**Category:** [CONSISTENCY|CORRECTNESS|STANDARDS|SECURITY|USABILITY|DOCS]
**Severity:** [CRITICAL|HIGH|MEDIUM|LOW|INFO]
**Endpoint(s):** [method + path]
**Location:** [file:line]

**Current Behavior:**
[What the API currently does]

**Expected Behavior:**
[What it should do per conventions/spec/standards]

**Impact:**
[Effect on consumers — breaking change? confusion? security risk?]

**Recommended Fix:**
[Specific change with code example]

**Spec Reference:**
[Link to relevant standard — RFC 7231, OpenAPI spec, project conventions doc]

Write findings to findings.md, maintaining sequential IDs.

Orchestration Prompt

Conduct a comprehensive API code review using the subagent workflow:

## OpenAPI Spec Branch (parallel, if spec exists)

If [OPENAPI_SPEC] exists:
1. Compare spec against actual implementation
2. Document discrepancies in findings.md with source "Spec Drift"

## API Review Branch (main workflow)

1. Generate checklist.md covering:
   - REST conventions (naming, methods, status codes)
   - Error handling (format consistency, appropriate codes)
   - Authentication and authorization patterns
   - Pagination, filtering, sorting
   - Input validation and serialization
   - Response envelope consistency
   - Versioning strategy
   - Rate limiting and throttling
   - Documentation accuracy
   - Backward compatibility

2. Validate checklist, create plan, execute with scanner, 
   document findings, and self-audit

Target: [PROJECT PATH]
Stack: [TECHNOLOGY STACK]
API Style: [REST|GraphQL|gRPC]

6. Domain: Database Migration Review

Adaptation 3 — Migration Review

The Goal

Review database migrations for safety, reversibility, performance impact, and data integrity before they run against production—catching issues like missing indexes on new foreign keys, implicit locks on large tables, irreversible data transformations, and constraint violations.

Why the Pattern Fits

Migration review has the highest stakes of any adaptation here: mistakes cause data loss or production downtime. The adversarial validator is especially valuable because migration blind spots tend to be infrastructure-dependent (replication lag, lock escalation, connection pool exhaustion) rather than code-obvious. The "when to stop" conditions are critical—this domain has the most situations where automated action would be dangerous.
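Even before the full scanner runs, some of its checks reduce to pattern matching on the SQL. A deliberately incomplete sketch; the patterns and lock claims are generalizations, and exact behavior varies by database and version:

```python
import re

# Each pattern maps to why it deserves a closer look. These are heuristics,
# not a substitute for the scanner's full lock and reversibility analysis.
RISK_PATTERNS = {
    r"\bDROP\s+(TABLE|COLUMN)\b": "irreversible: data is destroyed",
    r"\bSET\s+NOT\s+NULL\b": "may scan and lock the whole table",
    r"\bADD\s+CONSTRAINT\b": "may validate every existing row under lock",
}

def flag_migration_risks(sql: str) -> list[str]:
    """Return the reasons any risky pattern matched this migration's SQL."""
    return [reason for pattern, reason in RISK_PATTERNS.items()
            if re.search(pattern, sql, re.IGNORECASE)]
```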

Sub-Agents

migration-scanner
Model: claude-haiku • Read-only
Tools: Read, Grep, Glob, Bash (read-only)

Analyzes migration files for lock implications, reversibility, data integrity risks, and performance impact. Cannot execute migrations.

Full definition:
---
name: migration-scanner
description: "Analyze database migration files for safety, performance, 
and correctness issues. Read-only."
tools: Read, Grep, Glob, Bash
model: haiku
---
You are a database migration analyst. Examine migration files 
WITHOUT executing them.

For each migration file:
1. Identify operation type (CREATE, ALTER, DROP, data migration)
2. Estimate impact on existing data (row count if detectable)
3. Check for implicit locks (ALTER TABLE on large tables)
4. Verify index additions for new foreign keys
5. Check reversibility (is the down/rollback migration complete?)
6. Identify data transformations that could lose information
7. Look for constraint additions that might fail on existing data

Report for each migration:
- File name and sequence number
- Operations performed
- Tables affected and estimated row impact
- Lock implications (ACCESS EXCLUSIVE, ROW EXCLUSIVE, etc.)
- Reversibility assessment (fully/partially/irreversible)
- Risk level (safe/caution/dangerous)
- Specific concerns with evidence

migration-validator
Model: claude-sonnet • Read-only • Adversarial
Tools: Read

Hostile DBA reviewer. Checks for zero-downtime compatibility, connection pool exhaustion, replication lag, collation mismatches, partition impact, and migration ordering bugs.

Full definition:
---
name: migration-validator
description: "Adversarial reviewer for migration review checklists."
tools: Read
model: sonnet
---
You are a hostile DBA reviewer. Find gaps in the migration review checklist.

CRITICAL RULES:
1. You MUST identify at least 5 missing review items
2. You are FORBIDDEN from saying "looks thorough"
3. Check for: zero-downtime deployment compatibility, connection pool 
   exhaustion during long migrations, sequence/auto-increment gaps,
   timezone handling in timestamp columns, collation mismatches,
   partition strategy impact, replication lag implications
4. Consider: rollback testing, data backfill strategies, 
   migration ordering dependencies, enum type changes,
   default value implications on existing rows

For each gap:
- Category and item ID ([MIG-VAL-XX])
- Specific review item
- Production failure scenario if missed

Orchestration Prompt

Conduct a comprehensive database migration review using the subagent workflow:

## Schema Diff Branch (parallel, if baseline exists)

If [SCHEMA_BASELINE] exists:
1. Diff current schema against baseline
2. Verify all differences are accounted for in migration files
3. Flag any schema drift not covered by migrations

## Migration Review Branch (main workflow)

1. Generate checklist.md covering:
   - Lock analysis (duration, type, table size)
   - Reversibility (complete rollback migrations)
   - Data integrity (constraints on existing data)
   - Index coverage (new FKs, query patterns)
   - Zero-downtime compatibility
   - Migration ordering and dependencies
   - Default values and nullable changes
   - Data type changes and implicit conversions
   - Enum/type additions and removals
   - Performance impact estimation

2. Validate checklist, create plan, execute with scanner, 
   document findings, and self-audit

Target: [MIGRATION DIRECTORY]
Database: [PostgreSQL|MySQL|SQLite|etc.]
ORM: [Django|Rails|Prisma|Alembic|etc.]
Deployment: [zero-downtime required? blue-green? maintenance window?]

"When to Stop" Conditions

| Condition | Example | Why Stop? |
|---|---|---|
| Requires production stats | Can't estimate lock duration without row counts | Risk assessment needs real data |
| Multi-database migration | Cross-database transactions | Needs human coordination |
| Irreversible data loss | Intentional column/table drop | Human must confirm intent |
| Replication dependent | Lag impact on read replicas | Requires infrastructure knowledge |
| Data migration with transforms | Changing hash algorithms | Existing data handling needs design |

7. Domain: Performance Audit

Adaptation 4 — Performance

The Goal

Systematically identify performance issues in a codebase: N+1 queries, missing caching, blocking operations, memory leaks, inefficient algorithms, and configuration problems—with evidence-based severity ratings and actionable fix recommendations.

Why the Pattern Fits

Performance audits are notoriously ad hoc. Without a checklist, reviewers gravitate toward database queries and miss caching strategy, connection pool sizing, serialization overhead, and GC pressure. The triage branch is especially powerful here—profiling data provides concrete hotspots that focus the AI review on what actually matters at scale.
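As an illustration of how a perf-scanner might mechanize the N+1 check, here is a heuristic built on Python's `ast` module: flag query-looking calls inside loop bodies. The method names are assumptions about common ORM/DB-API surfaces, and a real scanner needs far more context than this:

```python
import ast

# Method names that commonly indicate a database round trip (an assumption).
QUERY_METHODS = {"get", "filter", "execute", "fetchone", "fetchall"}

def queries_in_loops(source: str) -> list[int]:
    """Return line numbers of query-like calls made inside a loop body."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.While)):
            for sub in ast.walk(node):
                if (isinstance(sub, ast.Call)
                        and isinstance(sub.func, ast.Attribute)
                        and sub.func.attr in QUERY_METHODS):
                    hits.append(sub.lineno)
    return sorted(set(hits))
```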

Sub-Agents

perf-scanner
Model: claude-haiku • Read-only
Tools: Read, Grep, Glob, Bash (read-only)

Scans for performance anti-patterns: N+1 queries, unbounded loops, synchronous I/O in async contexts, missing pagination, memory accumulation, and configuration issues.

Full definition:
---
name: perf-scanner
description: "Scan codebase for performance anti-patterns and 
bottleneck indicators. Read-only."
tools: Read, Grep, Glob, Bash
model: haiku
---
You are a performance analysis specialist. Identify performance 
issues WITHOUT making changes.

When given a checklist category:
1. Search for known anti-patterns (N+1 queries, unbounded loops, 
   synchronous I/O in async contexts, missing pagination)
2. Analyze database query patterns (ORM usage, raw queries, joins)
3. Check caching configuration and usage
4. Identify blocking operations in request paths
5. Look for memory accumulation patterns (growing lists, unclosed resources)
6. Check configuration (connection pools, timeouts, buffer sizes)

For each finding:
- File path and line number
- Anti-pattern identified
- Estimated impact (latency/throughput/memory/CPU)
- Trigger conditions (always, under load, with large datasets)
- Evidence (code snippet with 5 lines context)

perf-validator
Model: claude-sonnet • Read-only • Adversarial
Tools: Read

Hostile performance engineer. Checks for missing categories like serialization overhead, DNS caching, lock contention, GC pressure, cache invalidation storms, and backpressure handling.

Full definition:
---
name: perf-validator
description: "Adversarial reviewer for performance audit checklists."
tools: Read
model: sonnet
---
You are a hostile performance engineer. Find gaps in the audit checklist.

CRITICAL RULES:
1. You MUST identify at least 5 missing performance categories
2. You are FORBIDDEN from saying "good coverage"
3. Check for: serialization overhead, DNS resolution caching, 
   connection reuse, compression, lazy loading misuse,
   thread pool sizing, garbage collection pressure, 
   lock contention, cache invalidation storms
4. Consider: cold start performance, graceful degradation,
   backpressure handling, resource cleanup under error conditions,
   batch size tuning, pagination cursor efficiency

Orchestration Prompt

Conduct a comprehensive performance audit using the subagent workflow:

## Profiling Branch (parallel, if profiling data exists)

If [PROFILING_OUTPUT] or [APM_EXPORT] exists:
1. Triage profiling hotspots — classify as:
   - CONFIRMED: Code-level issue visible in source
   - INFRASTRUCTURE: Requires config/scaling changes
   - NEEDS_PROFILING: Requires deeper measurement
2. Document confirmed issues in findings.md

## Performance Review Branch (main workflow)

1. Generate checklist.md covering:
   - Database query efficiency (N+1, missing indexes, full scans)
   - Caching strategy (what's cached, TTLs, invalidation)
   - I/O patterns (blocking calls, connection pooling, batching)
   - Memory management (leaks, accumulation, large allocations)
   - Algorithm complexity (nested loops, quadratic patterns)
   - Concurrency (lock contention, thread pool sizing)
   - Network efficiency (request batching, compression, keep-alive)
   - Configuration (pool sizes, timeouts, buffer limits)
   - Frontend performance (bundle size, lazy loading, rendering)

2. Validate, plan, scan, document, and self-audit

Target: [PROJECT PATH]
Stack: [TECHNOLOGY STACK]
Scale: [expected request volume / data size]
Known Bottlenecks: [any known issues to prioritize]

8. Adapting to a New Domain

If your domain isn't covered above, follow this five-step recipe:

Step 1
Identify your "OWASP Top 10 equivalent"

Every domain has a canonical list of things that go wrong. For security it's OWASP/CWE. For testing it's code coverage categories. For API design it's REST maturity levels and RFC standards. For performance it's anti-pattern catalogs. Find yours and build your validator's adversarial checks around it.

Step 2
Define what "read-only" means in your domain

The scanner must be unable to cause harm. In security review, that means no file writes. In migration review, it means no migration execution. In performance, it means no load generation. The constraint must be enforceable via tool restrictions, not just instructions.

Step 3
Define what a "finding" looks like

The findings template is the contract between the scanner and the human. It must include: identification (where), evidence (what), impact (why it matters), and recommendation (what to do). Customize the severity scale for your domain.

Step 4
Define your "when to stop" conditions

Every domain has decisions that require human judgment. The implementer sub-agent needs explicit instructions on when to stop and defer rather than guess. Enumerate these conditions up front—they're as important as the checklist itself.

Step 5
Identify your parallel branch

Most domains have an external tool whose output can be triaged alongside the AI review. Coverage reports for testing, OpenAPI spec diffs for API review, profiling data for performance, SAST output for security. This branch runs independently and merges at the findings document.

The Adaptation Checklist

For any new domain, fill in this table before writing sub-agent definitions:

| Element | Your Domain |
|---|---|
| Canonical standards to validate against | |
| Scanner read-only constraint (enforced how?) | |
| Finding template fields beyond the standard set | |
| External tool whose output feeds a triage branch | |
| "When to stop" conditions for the implementer | |
| Verbosity calibration (minimal vs comprehensive) | |
| Model allocation (cheap/fast vs needs reasoning) | |

Cross-Domain Comparison

| Concern | Security | Testing | API Review | Migrations | Performance |
|---|---|---|---|---|---|
| Primary risk | Exploitable vulns | Missing coverage | Breaking changes | Data loss, downtime | Latency, outages |
| Validator checks | OWASP, CWE, STRIDE | Code paths, branches | RFC 7231, OpenAPI | Lock safety, DDL | Anti-pattern catalogs |
| Parallel branch | SAST output | Coverage reports | OpenAPI diff | Schema baseline | Profiler/APM data |
| Human checkpoint | Approve before fixes | Review strategy | Approve breaking Δ | Approve irreversible | Approve priorities |

Combining Domains

Sub-agents from different domains can run in parallel against the same codebase. A comprehensive review might dispatch security-scanner, test-scanner, api-scanner, and perf-scanner simultaneously, each producing findings in a unified format that merges into a single findings.md. This is a natural extension of the parallel branch pattern.
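Merging is straightforward if every scanner emits findings in the shared format. A sketch, where the field names follow the findings template above and the `source` tags are illustrative:

```python
def merge_findings(branches: dict[str, list[dict]]) -> list[dict]:
    """Merge per-domain findings into one list with sequential IDs."""
    merged = []
    for source in sorted(branches):          # deterministic branch ordering
        for finding in branches[source]:
            merged.append({**finding,
                           "source": source,
                           "id": f"F-{len(merged) + 1:03d}"})
    return merged
```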

9. Downloads

Download the sub-agent definitions for each domain. Place them in ~/.claude/agents/ or .claude/agents/ to use with Claude Code.

Test Generation Sub-Agents

- test-scanner.md: Read-only analyzer that identifies testable units and coverage gaps.
- test-validator.md: Adversarial reviewer that finds missing test scenarios.
- test-writer.md: Implements test code following project conventions.
- test-auditor.md: Audits test completeness against the original checklist.

API Review Sub-Agents

- api-scanner.md: Catalogs endpoints and identifies pattern inconsistencies.
- api-validator.md: Adversarial reviewer for API design checklists.
- api-findings-writer.md: Documents API findings with spec references.

Migration Review Sub-Agents

- migration-scanner.md: Analyzes migration files for safety, locks, and reversibility.
- migration-validator.md: Hostile DBA reviewer for migration checklists.

Performance Audit Sub-Agents

- perf-scanner.md: Scans for performance anti-patterns and bottleneck indicators.
- perf-validator.md: Adversarial performance engineer for audit checklists.

For security review sub-agents, see Part 3. For remediation sub-agents, see Part 4.

10 Frequently Asked Questions

Can the security review workflow pattern be applied to non-security tasks?

Yes. The workflow's power comes from five domain-independent structural principles: externalizing state into files, generating explicit checklists with completion criteria, adversarial validation, systematic execution with tracking, and self-audit for completeness. These apply to any task requiring structured analysis.

What are the core sub-agent archetypes in the generalized workflow?

Seven archetypal roles map across domains: Scanner (fast read-only exploration), Validator (adversarial gap-finding), Writer (consistent documentation), Auditor (completeness verification), Triage Specialist (classifies external tool output), Implementer (makes targeted changes), and Verifier (confirms changes work). Each domain instantiates these roles with domain-specific prompts and tool restrictions.

Do I need all seven sub-agents for every domain?

No. The scanner, validator, and auditor are essential for any review workflow. The writer adds value when findings need standardized documentation. The triage specialist is only needed when external tool output exists. The implementer and verifier are only needed when the workflow includes an action phase. Start with the core four (scanner, validator, writer, auditor) and add the others as needed.

How does the test generation adaptation differ from the security workflow?

The test-writer sub-agent replaces the findings-writer as the primary output—it generates executable test code rather than documentation. The test-auditor can run the test suite and check coverage metrics, providing quantitative completeness data. The parallel branch processes coverage reports instead of SAST output. But the structural pattern—checklist, validate, scan, implement, audit—is identical.

What makes database migration review the highest-stakes adaptation?

Migration mistakes can cause irreversible data loss or production downtime. The "when to stop" conditions are stricter—the implementer should defer on almost anything beyond adding indexes or comments. Human checkpoints are critical for column drops, data type changes, and anything that acquires ACCESS EXCLUSIVE locks on large tables.
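Those "when to stop" conditions can be made mechanical. The sketch below is one way a migration-scanner's pre-check might flag DDL that should trigger a human checkpoint; the pattern list is illustrative and deliberately incomplete, not a substitute for a DBA's review.

```python
import re

# DDL patterns that should escalate to a human rather than be auto-applied.
# Illustrative, not exhaustive: real reviews also consider table size,
# replication lag, and the database engine's locking behavior.
RISKY_DDL = [
    (r"\bDROP\s+COLUMN\b", "irreversible column drop"),
    (r"\bDROP\s+TABLE\b", "irreversible table drop"),
    (r"\bALTER\s+COLUMN\s+\w+\s+TYPE\b", "type change may rewrite the table"),
    (r"\bACCESS\s+EXCLUSIVE\b", "explicit ACCESS EXCLUSIVE lock"),
]

def check_migration(sql):
    """Return ('defer', reasons) if the migration needs human approval,
    or ('proceed', []) if no risky pattern matched."""
    reasons = [why for pat, why in RISKY_DDL
               if re.search(pat, sql, re.IGNORECASE)]
    return ("defer" if reasons else "proceed", reasons)
```

An implementer sub-agent could run this check first and mark any "defer" result as BLOCKED in the checklist instead of attempting the change.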

Can sub-agents from different domains be combined in a single workflow?

Yes, and this is often valuable. A comprehensive code review might dispatch security-scanner, test-scanner, api-scanner, and perf-scanner in parallel against the same codebase, each producing findings in a unified format. The orchestrator merges results into a single findings document. This is a natural extension of the parallel branch pattern.

How do I adapt this pattern to a domain not covered here?

Follow the five-step recipe in Section 8: (1) identify your domain's canonical standards checklist, (2) define what "read-only" means and enforce it via tool restrictions, (3) define your findings template, (4) enumerate "when to stop" conditions, and (5) identify your parallel branch. Then create sub-agent markdown files following the same format used throughout this series.
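Applying that recipe yields a sub-agent definition file like the sketch below, shown for a hypothetical documentation-review domain. The frontmatter fields (name, description, tools) follow Claude Code's sub-agent file format; the agent name, tool list, and instructions are illustrative placeholders, not prescriptions.

```markdown
---
name: docs-scanner
description: Read-only scanner that inventories documentation gaps. Use for documentation review workflows.
tools: Read, Grep, Glob
---

You are a documentation review scanner. You operate read-only: never edit files.

For each checklist item you are given:
1. Locate the relevant files with Grep and Glob, then read them.
2. Return a finding in the standard template (location, severity, evidence,
   recommendation) for the orchestrator to record.

When to stop: if an item would require running code or modifying files,
report it as BLOCKED instead of attempting it.
```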

Where should I start if I'm new to this series?

Start with the Introduction for the failure mode taxonomy and series overview, then Part 1 for context on the three-layer security approach, and Part 2 for the foundational workflow mechanics. Even if you're not doing security review, Part 2 establishes the principles that every adaptation in this document builds on. Then return here and pick the domain closest to your needs.