## 1. The Security Review Challenge
Modern applications face a paradox: codebases are larger and more complex than ever, security threats are more sophisticated, and development cycles are faster. Manual security review doesn't scale. But automated tools miss critical vulnerability classes.
The solution isn't choosing between automation and expertise—it's layering them strategically. This guide examines three complementary approaches to security review and shows how they work together.
The series Introduction catalogs sixteen failure modes that affect agentic AI workflows for software development. The three-layer security model presented here is a direct response to those failure modes. SAST catches pattern-based issues that an AI agent might hallucinate past (phantom grounding). AI-orchestrated review addresses the silent corruption modes (goal substitution, invisible assumptions, completeness gaps) that pattern matching fundamentally cannot detect. Human oversight provides the accountability layer that counters automation complacency: the tendency to rubber-stamp agent output after repeated positive experiences.
No single approach catches everything. SAST tools miss business logic flaws. AI review may miss edge cases in complex control flow. Human reviewers can't examine every line of code. The goal isn't finding a silver bullet—it's building a stack where each layer catches what the others miss.
## 2. The Three Layers
SAST tools analyze source code without executing it, searching for patterns that match known vulnerability signatures. They're fast, deterministic, and integrate seamlessly into CI/CD pipelines.
Large language models analyze code with contextual understanding, tracing data flows across architectural boundaries and reasoning about authorization logic. They produce human-readable explanations of vulnerabilities.
Security experts provide judgment on risk acceptance, validate automated findings, approve remediation approaches, and handle novel vulnerability classes that tools haven't encountered.
## 3. SAST: What It Does Well
Static Application Security Testing tools—Semgrep, SonarQube, CodeQL, Checkmarx, Fortify, Snyk Code—have been the backbone of automated security for over a decade. They excel in specific scenarios:
### Pattern Detection at Scale
SAST tools maintain extensive rule libraries for known vulnerability patterns:
```yaml
# Semgrep rule example: SQL injection detection
rules:
  - id: sql-injection
    languages: [python]
    patterns:
      - pattern: execute($QUERY)
      - pattern-not: execute($QUERY, $PARAMS)
    message: "Potential SQL injection: use parameterized queries"
    severity: ERROR
```
These rules catch obvious mistakes instantly. A developer writes `execute(f"SELECT * FROM users WHERE id = {user_id}")`, and the tool flags it before the code leaves their machine.
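To make the flagged pattern concrete, the difference between string interpolation and parameter binding can be demonstrated with the standard-library `sqlite3` module. This is an illustrative sketch; the table and data are invented for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

user_id = "1 OR 1=1"  # attacker-controlled input

# Vulnerable: interpolation makes the payload part of the SQL itself,
# so the OR clause matches every row in the table.
rows = conn.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchall()

# Safe: the ? placeholder binds user_id as a value, never as SQL,
# so the payload matches nothing.
safe = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()
```

Running this, `rows` contains both users while `safe` is empty: the bound parameter is compared as data, not executed as SQL.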
### CI/CD Integration
SAST tools produce deterministic results—same code, same findings, every time. This makes them ideal for automated gates:
```yaml
# GitHub Actions example
security-scan:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: Run Semgrep
      id: semgrep
      uses: returntocorp/semgrep-action@v1
      with:
        config: p/security-audit
    - name: Block on high severity
      if: steps.semgrep.outputs.findings != ''
      run: exit 1
```
### Where SAST Falls Short
SAST tools fundamentally operate through pattern matching. They recognize "code that looks dangerous" but cannot assess "code that behaves dangerously in context."
#### Context Blindness
```python
# SAST flags this as SQL injection
table_name = get_table_from_allowlist(request.table)  # Returns "users" or "orders"
query = f"SELECT * FROM {table_name} WHERE id = ?"
cursor.execute(query, [user_id])

# But it's actually SAFE because:
# 1. table_name comes from a hardcoded allowlist
# 2. The WHERE clause uses a parameterized query
# SAST can't trace the allowlist constraint
```
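For context, here is a sketch of what such an allowlist helper might look like. The function name matches the snippet above, but the body is a hypothetical reconstruction, not code from a real codebase:

```python
# Hypothetical reconstruction of the allowlist helper.
# Only values from this hardcoded set can ever reach the query string,
# which is why the f-string interpolation above is safe in practice.
ALLOWED_TABLES = {"users", "orders"}

def get_table_from_allowlist(requested: str) -> str:
    if requested not in ALLOWED_TABLES:
        raise ValueError(f"table not allowed: {requested!r}")
    return requested
```

Because every code path through this function either returns a hardcoded value or raises, the table name is constrained no matter what the request contains.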
#### Authorization Logic
```python
# SAST sees nothing wrong here
@app.get("/api/documents/{doc_id}")
def get_document(doc_id: int, user: User = Depends(get_current_user)):
    return db.query(Document).filter(Document.id == doc_id).first()

# But there's a critical flaw:
# No check that user is authorized to access this specific document
# This is an IDOR vulnerability—SAST cannot detect it
```
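The missing ownership check can be made explicit. A minimal sketch, using a dataclass and an in-memory dict in place of the ORM calls; the `Document` fields and `DOCS` store are illustrative, not the article's actual framework code:

```python
from dataclasses import dataclass

@dataclass
class Document:
    id: int
    owner_id: int
    body: str

# Hypothetical in-memory store standing in for db.query(Document).
DOCS = {
    1: Document(id=1, owner_id=10, body="alice's doc"),
    2: Document(id=2, owner_id=20, body="bob's doc"),
}

def get_document(doc_id: int, user_id: int) -> Document:
    doc = DOCS.get(doc_id)
    # The IDOR fix: reject documents the requesting user does not own.
    # Returning the same error for "missing" and "forbidden" also avoids
    # leaking which document IDs exist.
    if doc is None or doc.owner_id != user_id:
        raise PermissionError("not found or not authorized")
    return doc
```

An authenticated user who changes `doc_id` now gets an error instead of another user's document.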
#### Cross-Boundary Data Flow
When user input travels through message queues, microservice calls, or database round-trips, SAST tools lose track. A malicious payload stored in a database and retrieved by another service won't be flagged as tainted.
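A minimal sketch of this cross-boundary flow, with a dict standing in for the database and two functions standing in for separate services (all names are illustrative). SAST taint tracking typically stops at the database write; escaping at the point of output is what neutralizes the stored payload:

```python
import html

# Stand-in for a shared database between two services.
DB = {}

def comments_service_store(comment_id: int, text: str) -> None:
    # Service A writes raw user input; from SAST's perspective,
    # the taint is lost once the value crosses this boundary.
    DB[comment_id] = text

def render_service_page(comment_id: int) -> str:
    # Service B retrieves the value much later, in a different codebase.
    stored = DB[comment_id]
    # Escaping at output time neutralizes any stored payload.
    return f"<p>{html.escape(stored)}</p>"

comments_service_store(1, "<script>alert(1)</script>")
page = render_service_page(1)
```

Without the `html.escape` call, the stored `<script>` tag would be rendered verbatim, a stored XSS that no single-service scan would flag.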
## 4. AI Review: Filling the Gaps
AI-orchestrated security review uses large language models to analyze code with contextual understanding. Rather than matching patterns, the AI reasons about what code does and whether that behavior is secure.
### Contextual Understanding
Where SAST sees a dangerous pattern, AI review sees the full context:
```text
# AI Review Analysis:
#
# Examining: query = f"SELECT * FROM {table_name} WHERE id = ?"
#
# Tracing table_name backward:
# - Line 12: table_name = get_table_from_allowlist(request.table)
# - get_table_from_allowlist() returns only "users" or "orders" (lines 45-48)
# - User-controlled input (request.table) is validated against allowlist
# - SQL injection not possible for table name
#
# Checking parameterization:
# - WHERE clause uses ? placeholder with separate params
# - user_id passed as parameter, not interpolated
#
# Finding: FALSE POSITIVE - table name constrained by allowlist,
# value parameterized. No SQL injection vulnerability.
```
### Business Logic Analysis
AI review can detect vulnerabilities that require understanding intent:
```text
# AI Review Analysis:
#
# Examining: GET /api/documents/{doc_id}
#
# Authorization check present: ✓ (get_current_user dependency)
# User authenticated: ✓
#
# Access control check: ✗ MISSING
# - Function retrieves document by ID only
# - No verification that requesting user owns/can access document
# - Any authenticated user can access any document by ID
#
# Finding: HIGH - Insecure Direct Object Reference (IDOR)
# Attack scenario: Authenticated user changes doc_id parameter
# to access other users' documents.
#
# Recommended fix: Add ownership check
#   document = db.query(Document).filter(
#       Document.id == doc_id,
#       Document.owner_id == user.id  # ADD THIS
#   ).first()
```
### Adaptive Methodology
AI review generates checklists tailored to the specific technology stack and threat model. A review of a multi-tenant SaaS application handling healthcare data will include different checks than a single-tenant internal tool:
```text
# Generated checklist items for multi-tenant healthcare SaaS:
TENANT-01: Verify tenant ID in JWT claims validated on every request
TENANT-02: Check all database queries include tenant filter
TENANT-03: Review admin cross-tenant access controls
PHI-01: Verify PHI fields excluded from logs
PHI-02: Check encryption at rest for PHI columns
PHI-03: Review PHI access audit trail implementation
...
```
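A check like TENANT-02 can be enforced structurally rather than audited query by query. A minimal sketch, assuming a query helper that takes the tenant ID explicitly so a missing filter is impossible to write by accident (records and names are illustrative):

```python
# Illustrative in-memory records standing in for a shared table.
RECORDS = [
    {"id": 1, "tenant_id": "acme", "name": "report-a"},
    {"id": 2, "tenant_id": "globex", "name": "report-b"},
]

def query_for_tenant(tenant_id: str, records=RECORDS) -> list[dict]:
    # The tenant filter is applied unconditionally inside the helper,
    # rather than being left for each caller to remember.
    return [r for r in records if r["tenant_id"] == tenant_id]

acme_rows = query_for_tenant("acme")
```

Routing all data access through such a helper turns the checklist item into a code-review question ("does anything bypass `query_for_tenant`?") instead of a per-query audit.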
### Where AI Review Has Limitations
- Speed — AI analysis is slower than pattern matching; not suitable for every commit in high-velocity repos
- Cost — LLM inference costs more than running static rules
- Consistency — Results may vary slightly between runs (though structured workflows mitigate this)
- Complex Control Flow — Deeply nested conditionals or unusual patterns may confuse analysis
- Dependencies — Cannot scan vulnerability databases for known CVEs (that's SCA tooling)
## 5. Comparison Matrix
This matrix shows which approach is strongest for each vulnerability class:
| Vulnerability Class | SAST | AI Review | Recommended |
|---|---|---|---|
| SQL Injection (obvious) | ✓ Strong | ✓ Strong | SAST first (faster) |
| SQL Injection (subtle/contextual) | ~ Limited | ✓ Strong | AI primary |
| XSS (reflected) | ✓ Strong | ✓ Strong | SAST first |
| XSS (stored, cross-service) | ✗ Loses track | ✓ Traces flow | AI primary |
| Broken Authorization / IDOR | ✗ Cannot detect | ✓ Core strength | AI only |
| Business Logic Bypass | ✗ Cannot detect | ✓ Reasons about intent | AI only |
| Tenant Isolation Flaws | ✗ Cannot detect | ✓ Checks patterns | AI only |
| Hardcoded Secrets | ✓ Strong | ✓ Strong | SAST first (faster) |
| Dependency CVEs | ✓ SCA tools | ✗ Not designed for this | SAST/SCA only |
| Cryptographic Weakness | ✓ Pattern matching | ✓ Context evaluation | Both |
| Race Conditions | ~ Limited | ~ Can reason but may miss | Both + dynamic testing |
| Input Validation Gaps | ~ Pattern-based | ✓ Traces to usage | AI primary |
SAST excels when the vulnerable code looks dangerous (dangerous function calls, missing sanitization at the call site). AI excels when the vulnerability is about missing code (no authorization check, no tenant filter) or requires understanding what the code should do versus what it does.
## 6. The Integrated Workflow
The optimal approach layers these tools strategically, using each where it's strongest.
### Cost-Benefit Optimization
| Layer | When to Run | Cost | Value |
|---|---|---|---|
| SAST | Every commit, every PR | Low (automated) | Catches 60-70% of pattern-based vulns |
| AI Triage | When SAST has findings | Medium | Reduces false positive noise by 50-80% |
| AI Deep Review | Major releases, sensitive changes | Higher | Finds logic flaws SAST misses entirely |
| Human Review | High-risk changes, final approval | Highest | Accountability, novel vulns, judgment |
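The routing implied by this table can be sketched as a simple dispatcher. The severity thresholds and layer names here are illustrative assumptions, not a prescribed tool API:

```python
# Illustrative sketch: decide which review layers a SAST finding
# should pass through, per the cost-benefit table above.
def route_finding(finding: dict, is_major_release: bool = False) -> list[str]:
    layers = ["sast"]  # every finding originates from the SAST gate
    severity = finding["severity"]
    if severity in ("high", "critical"):
        layers.append("ai_triage")     # filter false positives before humans see them
        layers.append("human_review")  # accountability for high-risk changes
    elif severity == "medium":
        layers.append("ai_triage")     # triage only; no mandatory human gate
    if is_major_release:
        layers.append("ai_deep_review")  # logic-flaw pass before shipping
    return layers

plan = route_finding({"severity": "high"}, is_major_release=True)
```

The point of the sketch is the shape, not the thresholds: cheap deterministic layers run on everything, and the expensive layers are reserved for the findings and releases where their value is highest.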
## 7. This Series: AI-Orchestrated Security Review
The series Introduction established a taxonomy of sixteen failure modes that affect agentic AI workflows and five structural principles that address them. The remaining parts of this series build the AI layer: structured, repeatable workflows for AI-assisted security review using Claude Code. Each part builds on the previous.
If you're new to this series, start with the Introduction for the full failure mode taxonomy and structural principles. If you already use SAST tools, continue to Part 2 to add AI review for the vulnerability classes SAST misses. If you're building security review from scratch, this article provides context for why the AI workflow is designed the way it is.
## 8. Frequently Asked Questions
**What is the difference between SAST and AI security review?**

SAST uses pattern matching to find known vulnerability signatures—it's fast, deterministic, and excellent for CI/CD. AI review uses large language models to understand code context, trace data flows across boundaries, and reason about business logic. SAST finds "code that looks dangerous"; AI review finds "code that behaves dangerously in context."

**Does AI review replace SAST?**

No, they're complementary. SAST excels at fast, consistent detection of known patterns in CI/CD pipelines. AI excels at contextual analysis and logic flaws SAST cannot detect. The optimal approach uses SAST as an automated first-pass, then AI for depth and context, with human oversight for final judgment.

**What vulnerability classes can AI review detect that SAST cannot?**

AI review detects: broken authorization and access control (IDOR), business logic bypass, tenant isolation failures, subtle injection where SAST loses context, cross-service data flow issues, and missing security controls. These require understanding intent, not just pattern matching.

**How do false positive rates compare?**

SAST typically has 30-70% false positive rates because it lacks context. AI review has lower false positive rates because it understands when "dangerous" patterns are mitigated. However, AI may occasionally miss edge cases that SAST's exhaustive pattern matching would catch—hence using both.

**What role do humans play in this stack?**

Humans provide: accountability for security decisions, judgment on risk acceptance and business tradeoffs, verification that findings are actionable, approval gates before code modification, and expertise for novel vulnerability classes. Both SAST and AI can be wrong—humans catch those errors.

**How should the three layers be combined in practice?**

Layer them: (1) SAST runs in CI/CD as a fast gate, (2) AI triages SAST findings to filter false positives, (3) AI conducts deep analysis for logic flaws, (4) humans approve findings and approach, (5) AI implements fixes, (6) both verify fixes work. Each layer catches what others miss.