Capstone IT Engineering Series — Part 1

The Modern Security Review Stack

How static analysis, AI-orchestrated review, and human oversight work together—and why you need all three.

1. The Security Review Challenge

Modern applications face a paradox: codebases are larger and more complex than ever, security threats are more sophisticated, and development cycles are faster. Manual security review doesn't scale. But automated tools miss critical vulnerability classes.

The solution isn't choosing between automation and expertise—it's layering them strategically. This guide examines three complementary approaches to security review and shows how they work together.

The series Introduction catalogs sixteen failure modes that affect agentic AI workflows for software development, and the three-layer security model presented here is a direct response to them. SAST catches pattern-based issues that an AI agent might hallucinate past (phantom grounding). AI-orchestrated review addresses the silent corruption modes—goal substitution, invisible assumptions, completeness gaps—that pattern matching fundamentally cannot detect. Human oversight provides the accountability layer that counters automation complacency: the tendency to rubber-stamp agent output after repeated positive experiences.

Key Insight

No single approach catches everything. SAST tools miss business logic flaws. AI review may miss edge cases in complex control flow. Human reviewers can't examine every line of code. The goal isn't finding a silver bullet—it's building a stack where each layer catches what the others miss.

2. The Three Layers

Static Application Security Testing (SAST)
Pattern matching at machine speed

SAST tools analyze source code without executing it, searching for patterns that match known vulnerability signatures. They're fast, deterministic, and integrate seamlessly into CI/CD pipelines.

Scans millions of lines in minutes
Consistent, reproducible results
CI/CD gate automation
~ High false positive rates (30-70%)
~ Cannot detect business logic flaws
AI-Orchestrated Review
Context-aware analysis with reasoning

Large language models analyze code with contextual understanding, tracing data flows across architectural boundaries and reasoning about authorization logic. They produce human-readable explanations of vulnerabilities.

Understands code context and intent
Detects authorization and logic flaws
Explains findings with attack scenarios
~ Slower than pattern matching
~ May miss complex edge cases
Human Oversight
Judgment, accountability, and approval

Security experts provide judgment on risk acceptance, validate automated findings, approve remediation approaches, and handle novel vulnerability classes that tools haven't encountered.

Final accountability for decisions
Business context and risk tradeoffs
Novel vulnerability recognition
~ Doesn't scale to full codebase review
~ Expensive and time-consuming

3. SAST: What It Does Well

Static Application Security Testing tools—Semgrep, SonarQube, CodeQL, Checkmarx, Fortify, Snyk Code—have been the backbone of automated security for over a decade. They excel in specific scenarios:

Pattern Detection at Scale

SAST tools maintain extensive rule libraries for known vulnerability patterns:

# Semgrep rule example: SQL injection detection
rules:
  - id: sql-injection
    languages: [python]
    patterns:
      - pattern: execute($QUERY)
      - pattern-not: execute($QUERY, $PARAMS)
    message: "Potential SQL injection: use parameterized queries"
    severity: ERROR

These rules catch obvious mistakes instantly. A developer writes execute(f"SELECT * FROM users WHERE id = {user_id}"), and the tool flags it before the code leaves their machine.
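The fix the rule's message recommends can be sketched in a few lines. This uses Python's built-in sqlite3 driver as a stand-in for whatever database layer the flagged code actually uses:

```python
# Minimal sketch of parameterized queries, with sqlite3 standing in
# for the real database driver.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

user_id = "1 OR 1=1"  # attacker-controlled input

# Unsafe (what SAST flags): the payload becomes part of the SQL text
# conn.execute(f"SELECT name FROM users WHERE id = {user_id}")

# Safe: the driver binds user_id as a value, never as SQL
rows = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchall()
print(rows)  # []: the payload matches no row

rows_ok = conn.execute("SELECT name FROM users WHERE id = ?", ("1",)).fetchall()
print(rows_ok)  # [('alice',)]
```

The key property: with a placeholder, the attacker's string can only ever be compared as data, so the injection attempt simply fails to match.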

CI/CD Integration

SAST tools produce deterministic results—same code, same findings, every time. This makes them ideal for automated gates:

# GitHub Actions example
security-scan:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: Run Semgrep
      id: semgrep   # id is required so the next step can read its outputs
      uses: returntocorp/semgrep-action@v1
      with:
        config: p/security-audit
    - name: Block on high severity
      if: steps.semgrep.outputs.findings != ''
      run: exit 1

Where SAST Falls Short

SAST tools fundamentally operate through pattern matching. They recognize "code that looks dangerous" but cannot assess "code that behaves dangerously in context."

Context Blindness

# SAST flags this as SQL injection
table_name = get_table_from_allowlist(request.table)  # Returns "users" or "orders"
query = f"SELECT * FROM {table_name} WHERE id = ?"
cursor.execute(query, [user_id])

# But it's actually SAFE because:
# 1. table_name comes from a hardcoded allowlist
# 2. The WHERE clause uses parameterized query
# SAST can't trace the allowlist constraint
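For concreteness, here is a minimal sketch of what the allowlist helper assumed above might look like; the names are illustrative, not from a real codebase:

```python
# Illustrative allowlist helper: table names are constrained to a
# hardcoded set before they ever reach SQL string construction.
ALLOWED_TABLES = {"users", "orders"}

def get_table_from_allowlist(requested: str) -> str:
    # Reject anything outside the hardcoded set
    if requested not in ALLOWED_TABLES:
        raise ValueError(f"table not allowed: {requested!r}")
    return requested

print(get_table_from_allowlist("users"))  # "users"
# get_table_from_allowlist("users; DROP TABLE users") raises ValueError
```

This is exactly the constraint SAST cannot see: the f-string interpolation looks dangerous at the call site, but the only values that can reach it are members of a fixed set.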

Authorization Logic

# SAST sees nothing wrong here
@app.get("/api/documents/{doc_id}")
def get_document(doc_id: int, user: User = Depends(get_current_user)):
    return db.query(Document).filter(Document.id == doc_id).first()

# But there's a critical flaw:
# No check that user is authorized to access this specific document
# This is an IDOR vulnerability—SAST cannot detect it

Cross-Boundary Data Flow

When user input travels through message queues, microservice calls, or database round-trips, SAST tools lose track. A malicious payload stored in a database and retrieved by another service won't be flagged as tainted.
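A compressed sketch of that loss of taint tracking, with sqlite3 standing in for the database shared between two services and html.escape as the output-encoding fix; the service split is illustrative:

```python
# Stored-XSS style flow across a storage boundary: service A persists
# untrusted input, service B later renders it. SAST taint tracking
# typically resets at the database write/read.
import sqlite3
import html

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (body TEXT)")

# "Service A": writes user input verbatim (the taint disappears here)
payload = "<script>steal()</script>"
db.execute("INSERT INTO comments VALUES (?)", (payload,))

# "Service B": reads the value back; SAST sees a "clean" DB read
body = db.execute("SELECT body FROM comments").fetchone()[0]

unsafe_html = f"<p>{body}</p>"             # stored XSS if rendered as-is
safe_html = f"<p>{html.escape(body)}</p>"  # escaping at output closes the gap
print(safe_html)
```

The parameterized INSERT is correct for SQL, but that does nothing for HTML output: the payload survives the round-trip intact and must be escaped where it is rendered.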

4. AI Review: Filling the Gaps

AI-orchestrated security review uses large language models to analyze code with contextual understanding. Rather than matching patterns, the AI reasons about what code does and whether that behavior is secure.

Contextual Understanding

Where SAST sees a dangerous pattern, AI review sees the full context:

# AI Review Analysis:
# 
# Examining: query = f"SELECT * FROM {table_name} WHERE id = ?"
#
# Tracing table_name backward:
# - Line 12: table_name = get_table_from_allowlist(request.table)
# - get_table_from_allowlist() returns only "users" or "orders" (line 45-48)
# - User-controlled input (request.table) is validated against allowlist
# - SQL injection not possible for table name
# 
# Checking parameterization:
# - WHERE clause uses ? placeholder with separate params
# - user_id passed as parameter, not interpolated
#
# Finding: FALSE POSITIVE - table name constrained by allowlist,
# value parameterized. No SQL injection vulnerability.

Business Logic Analysis

AI review can detect vulnerabilities that require understanding intent:

# AI Review Analysis:
#
# Examining: GET /api/documents/{doc_id}
#
# Authorization check present: ✓ (get_current_user dependency)
# User authenticated: ✓
# 
# Access control check: ✗ MISSING
# - Function retrieves document by ID only
# - No verification that requesting user owns/can access document
# - Any authenticated user can access any document by ID
#
# Finding: HIGH - Insecure Direct Object Reference (IDOR)
# Attack scenario: Authenticated user changes doc_id parameter
# to access other users' documents.
#
# Recommended fix: Add ownership check
#   document = db.query(Document).filter(
#       Document.id == doc_id,
#       Document.owner_id == user.id  # ADD THIS
#   ).first()
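The recommended fix can be shown as runnable logic. This framework-free sketch uses illustrative Document and user_id stand-ins rather than the FastAPI and ORM types from the snippet above:

```python
# Ownership check closing the IDOR: fetch by (id, owner), never id alone.
from dataclasses import dataclass

@dataclass
class Document:
    id: int
    owner_id: int
    body: str

# Illustrative in-memory store
DOCS = {
    1: Document(1, owner_id=10, body="alice's doc"),
    2: Document(2, owner_id=20, body="bob's doc"),
}

def get_document(doc_id: int, user_id: int) -> Document:
    doc = DOCS.get(doc_id)
    # The fix: verify the requesting user owns the document
    if doc is None or doc.owner_id != user_id:
        raise PermissionError("not found or not accessible")
    return doc

print(get_document(1, user_id=10).body)  # owner access succeeds
# get_document(1, user_id=20) raises PermissionError: IDOR blocked
```

Note the deliberate choice to raise the same error for "missing" and "not yours": distinct errors would let attackers enumerate which document IDs exist.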

Adaptive Methodology

AI review generates checklists tailored to the specific technology stack and threat model. A review of a multi-tenant SaaS application handling healthcare data will include different checks than a single-tenant internal tool:

# Generated checklist items for multi-tenant healthcare SaaS:

TENANT-01: Verify tenant ID in JWT claims validated on every request
TENANT-02: Check all database queries include tenant filter
TENANT-03: Review admin cross-tenant access controls
PHI-01: Verify PHI fields excluded from logs
PHI-02: Check encryption at rest for PHI columns
PHI-03: Review PHI access audit trail implementation
...
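As a concrete instance of TENANT-02, a minimal sketch (table and tenant names are illustrative) of a query that always carries the tenant filter:

```python
# TENANT-02 in miniature: every query is scoped to one tenant, so a
# request for tenant A can never read tenant B's rows.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (tenant_id TEXT, data TEXT)")
db.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [("acme", "acme-secret"), ("globex", "globex-secret")],
)

def list_records(tenant_id: str):
    # The tenant filter is mandatory, not optional
    return db.execute(
        "SELECT data FROM records WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()

print(list_records("acme"))  # [('acme-secret',)]
```

An AI reviewer checking TENANT-02 looks for queries where this filter is absent, which is a missing-code flaw no dangerous-pattern rule can flag.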

Where AI Review Has Limitations

AI review is slower and more expensive than pattern matching, which makes it a poor fit for every-commit gating. It can miss complex edge cases that SAST's exhaustive rule matching would catch, and it does not offer the deterministic, reproducible results that make SAST suitable as a hard CI/CD gate. These limitations are why several vulnerability classes below are still assigned to SAST first.

5. Comparison Matrix

This matrix shows which approach is strongest for each vulnerability class:

Vulnerability Class               | SAST               | AI Review                 | Recommended
SQL Injection (obvious)           | ✓ Strong           | ✓ Strong                  | SAST first (faster)
SQL Injection (subtle/contextual) | ~ Limited          | ✓ Strong                  | AI primary
XSS (reflected)                   | ✓ Strong           | ✓ Strong                  | SAST first
XSS (stored, cross-service)       | ✗ Loses track      | ✓ Traces flow             | AI primary
Broken Authorization / IDOR       | ✗ Cannot detect    | ✓ Core strength           | AI only
Business Logic Bypass             | ✗ Cannot detect    | ✓ Reasons about intent    | AI only
Tenant Isolation Flaws            | ✗ Cannot detect    | ✓ Checks patterns         | AI only
Hardcoded Secrets                 | ✓ Strong           | ✓ Strong                  | SAST first (faster)
Dependency CVEs                   | ✓ SCA tools        | ✗ Not designed for this   | SAST/SCA only
Cryptographic Weakness            | ✓ Pattern matching | ✓ Context evaluation      | Both
Race Conditions                   | ~ Limited          | ~ Can reason but may miss | Both + dynamic testing
Input Validation Gaps             | ~ Pattern-based    | ✓ Traces to usage         | AI primary
The Pattern

SAST excels when the vulnerable code looks dangerous (dangerous function calls, missing sanitization at the call site). AI excels when the vulnerability is about missing code (no authorization check, no tenant filter) or requires understanding what the code should do versus what it does.

6. The Integrated Workflow

The optimal approach layers these tools strategically, using each where it's strongest:

1. SAST • Automated Gate — Block known-bad patterns in CI/CD. Fast, cheap, catches obvious issues before code merges.
2. AI Review • Triage — Filter SAST false positives. AI confirms true positives and dismisses patterns that are safe in context.
3. AI Review • Deep Analysis — Find what SAST cannot. Authorization flaws, business logic, tenant isolation, cross-boundary flows.
4. Human • Approval — Review findings and approve remediation. Expert judgment on severity, risk acceptance, fix approach.
5. AI Review • Remediation — Implement and verify fixes. Targeted code changes with test verification.
6. SAST + AI • Verification — Confirm issues resolved. Re-run scans to verify vulnerabilities are fixed.
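The six steps can be sketched as control flow. Every function here (passed in via the tools mapping) is a hypothetical stand-in for a real scanner, model, or reviewer, not an actual API:

```python
# Sketch of the layered workflow: SAST gate, AI triage, AI deep review,
# human approval, AI remediation, then verification by both layers.
def review_pipeline(changeset, tools):
    findings = tools["sast"](changeset)           # 1. automated pattern gate
    confirmed = tools["triage"](findings)         # 2. AI drops false positives
    confirmed += tools["deep_review"](changeset)  # 3. AI finds logic/authz flaws
    approved = [f for f in confirmed if tools["approve"](f)]  # 4. human judgment
    for finding in approved:
        tools["fix"](changeset, finding)          # 5. AI remediation
    # 6. verification: both layers must re-run clean
    return not tools["sast"](changeset) and not tools["deep_review"](changeset)
```

The structure matters more than the stubs: cheap, deterministic checks run first and on everything; expensive, contextual checks run on what survives; and nothing is modified without the human gate at step 4.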

Cost-Benefit Optimization

Layer          | When to Run                       | Cost            | Value
SAST           | Every commit, every PR            | Low (automated) | Catches 60-70% of pattern-based vulns
AI Triage      | When SAST has findings            | Medium          | Reduces false positive noise by 50-80%
AI Deep Review | Major releases, sensitive changes | Higher          | Finds logic flaws SAST misses entirely
Human Review   | High-risk changes, final approval | Highest         | Accountability, novel vulns, judgment

7. This Series: AI-Orchestrated Security Review

The series Introduction established a taxonomy of sixteen failure modes that affect agentic AI workflows and five structural principles that address them. The remaining parts of this series build the AI layer—structured, repeatable workflows for AI-assisted security review using Claude Code. Each part builds on the previous.

Getting Started

If you're new to this series, start with the Introduction for the full failure mode taxonomy and structural principles. If you already use SAST tools, continue to Part 2 to add AI review for the vulnerability classes SAST misses. If you're building security review from scratch, this article provides context for why the AI workflow is designed the way it is.

8. Frequently Asked Questions

What is the difference between SAST and AI-based security review?

SAST uses pattern matching to find known vulnerability signatures—it's fast, deterministic, and excellent for CI/CD. AI review uses large language models to understand code context, trace data flows across boundaries, and reason about business logic. SAST finds "code that looks dangerous"; AI review finds "code that behaves dangerously in context."

Can AI security review replace SAST tools?

No, they're complementary. SAST excels at fast, consistent detection of known patterns in CI/CD pipelines. AI excels at contextual analysis and logic flaws SAST cannot detect. The optimal approach uses SAST as an automated first-pass, then AI for depth and context, with human oversight for final judgment.

What types of vulnerabilities can AI find that SAST tools miss?

AI review detects: broken authorization and access control (IDOR), business logic bypass, tenant isolation failures, subtle injection where SAST loses context, cross-service data flow issues, and missing security controls. These require understanding intent, not just pattern matching.

What is the false positive rate for SAST vs AI review?

SAST typically has 30-70% false positive rates because it lacks context. AI review has lower false positive rates because it understands when "dangerous" patterns are mitigated. However, AI may occasionally miss edge cases that SAST's exhaustive pattern matching would catch—hence using both.

Why is human oversight still necessary?

Humans provide: accountability for security decisions, judgment on risk acceptance and business tradeoffs, verification that findings are actionable, approval gates before code modification, and expertise for novel vulnerability classes. Both SAST and AI can be wrong—humans catch those errors.

How should SAST and AI review be integrated?

Layer them: (1) SAST runs in CI/CD as a fast gate, (2) AI triages SAST findings to filter false positives, (3) AI conducts deep analysis for logic flaws, (4) humans approve findings and approach, (5) AI implements fixes, (6) both verify fixes work. Each layer catches what others miss.