1. Demonstration Overview
This article puts the entire series into practice. We'll run the complete security review workflow on a real open-source application, showing every step from SAST scanning through AI triage, deep review, and remediation.
The series Introduction cataloged sixteen failure modes that affect agentic workflows—and this demonstration is where you can see the structural defenses against them in action. Verbatim evidence requirements catch phantom grounding before fabricated findings reach the report. The checklist-validator-auditor triple layer catches completeness gaps that a single-pass review would miss. Checkpoint verification between phases interrupts error compounding before a small upstream mistake corrupts everything downstream. Where the taxonomy describes the failure modes in the abstract, this walkthrough shows what they look like—and what catching them looks like—in practice.
The Workflow
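The workflow has two parallel branches that merge at findings.md: a SAST branch (Semgrep scan followed by AI triage) and a checklist branch (checklist generation, validation, and AI deep review).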
Key insight: The checklist is a reusable methodology template. SAST findings don't feed into the checklist—both branches produce findings that merge in findings.md.
Target Application
| Attribute | Details |
|---|---|
| Application | RealWorld "Conduit" — Medium.com clone |
| Repository | 123MwanjeMike/node-express-realworld-example-app (fork preserving original stack) |
| Stack | Node.js, Express 4.13.4, MongoDB/Mongoose 4.4.10, JWT Authentication |
| Size | ~3,000 lines of JavaScript across 13 files |
| Features | User auth, articles, comments, favorites, follows, tags |
The original gothinkster/node-express-realworld-example-app repository has been updated to TypeScript/Prisma. This fork preserves the original JavaScript/MongoDB stack documented in this demonstration.
Tools
| Tool | Purpose |
|---|---|
| Semgrep | Static Application Security Testing (SAST) |
| Claude Code | AI orchestration with subagents |
| Subagents | sast-triage, security-scanner, checklist-validator, findings-writer, self-auditor, fix-implementer, fix-verifier |
2. Environment Setup
Install Semgrep
# Install Semgrep
pip install semgrep
# Verify installation
semgrep --version
Actual result: Semgrep 1.149.0 installed successfully.
Clone Target Repository
The original gothinkster/node-express-realworld-example-app has been updated to TypeScript/Prisma. Use this fork that preserves the original JavaScript/MongoDB stack:
# Clone the RealWorld Express implementation (original stack fork)
git clone https://github.com/123MwanjeMike/node-express-realworld-example-app.git target-app
cd target-app
# Verify codebase structure
find . -name "*.js" -not -path "./node_modules/*" | xargs wc -l
Actual result: 13 JavaScript files cloned. Structure verified:
- app.js: Main application entry
- config/: Configuration including JWT secret
- models/: Mongoose models (User, Article, Comment)
- routes/api/: Express route handlers (users, articles, profiles, tags)
Verify Subagents
# Subagents should be installed from Part 3
ls ~/.claude/agents/
# Required subagents:
# - sast-triage.md
# - security-scanner.md
# - checklist-validator.md
# - findings-writer.md
# - self-auditor.md
# - fix-implementer.md (from Part 4)
# - fix-verifier.md (from Part 4)
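The listing above can also be checked programmatically. This is a minimal sketch, assuming each subagent is stored as a `<name>.md` file in the agents directory as the listing suggests; the helper name `missingSubagents` and the sample listing are hypothetical.

```javascript
// Subagent names taken from the Tools table in this article.
const REQUIRED_SUBAGENTS = [
  'sast-triage',
  'security-scanner',
  'checklist-validator',
  'findings-writer',
  'self-auditor',
  'fix-implementer',
  'fix-verifier',
];

// Given the file names found in the agents directory, return the
// required subagents that have no matching "<name>.md" file.
function missingSubagents(fileNames) {
  const present = new Set(fileNames);
  return REQUIRED_SUBAGENTS.filter((name) => !present.has(name + '.md'));
}

// Hypothetical listing in which the two remediation subagents are absent.
const listing = [
  'sast-triage.md',
  'security-scanner.md',
  'checklist-validator.md',
  'findings-writer.md',
  'self-auditor.md',
];
console.log(missingSubagents(listing)); // [ 'fix-implementer', 'fix-verifier' ]
```

In practice the listing could come from `fs.readdirSync(path.join(os.homedir(), '.claude', 'agents'))`.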
Create Working Directory
# Return to the parent of target-app and create a directory for review artifacts
# (later commands reference security-review/ and target-app/ as siblings)
cd ..
mkdir security-review
# Files we'll generate:
# - sast-results.json
# - sast-triage-report.md
# - checklist.md
# - plan.md
# - findings.md
# - remediation-log.md
This demonstration focuses on code-level vulnerabilities—patterns in source code that create security risks. Dependency vulnerabilities (outdated packages with known CVEs) are a separate concern typically handled by npm audit, Snyk, or Dependabot in CI/CD pipelines. Semgrep is a code pattern scanner, not a dependency scanner.
3. Seed Test Vulnerabilities
To ensure a meaningful demonstration with predictable results, we inject specific vulnerabilities into the codebase. This gives us guaranteed findings across different categories: SAST-detectable patterns, AI-only logic flaws, and false positive triggers.
Real codebases have unpredictable security postures. Seeding ensures: (1) guaranteed findings to demonstrate the workflow, (2) balanced comparison between SAST and AI capabilities, (3) reproducibility for anyone following along, and (4) educational examples of specific vulnerability patterns.
Category 1: SAST-Detectable (Pattern-Based)
These vulnerabilities follow patterns that Semgrep rules can match:
SEED-01: NoSQL Injection
File: routes/api/users.js
// SEED-01: NoSQL injection via $where operator
// Add this route after the existing user routes
router.get('/search', auth.optional, function(req, res, next) {
  var query = req.query.q;
  // VULNERABLE: User input directly in $where clause
  User.find({ $where: "this.username.includes('" + query + "')" })
    .then(function(users) {
      return res.json({ users: users.map(u => u.toProfileJSONFor()) });
    }).catch(next);
});
SEED-02: Hardcoded Secret
File: config/index.js
// SEED-02: Hardcoded JWT secret (modify existing secret line)
// Change from: secret: process.env.SECRET || 'secret'
// Change to:
module.exports = {
  secret: process.env.SECRET || 'super_secret_jwt_key_12345',
  // ... rest of config
};
SEED-03: Dangerous eval()
File: routes/api/articles.js
// SEED-03: eval() with user input
// Add this route for "advanced filtering"
router.get('/filter', auth.optional, function(req, res, next) {
  var filterExpr = req.query.expr;
  // VULNERABLE: eval with user-controlled input
  var filterFn = eval('(function(article) { return ' + filterExpr + '; })');
  Article.find({}).then(function(articles) {
    var filtered = articles.filter(filterFn);
    return res.json({ articles: filtered });
  }).catch(next);
});
SEED-04: Regex DoS
File: models/User.js
// SEED-04: Catastrophic backtracking regex
// Add this method to UserSchema
UserSchema.methods.validateBio = function() {
  // VULNERABLE: Regex with catastrophic backtracking
  var bioPattern = /^([a-zA-Z]+)*$/;
  return bioPattern.test(this.bio);
};
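To see why this pattern is dangerous, the sketch below times failed matches of the SEED-04 regex against growing inputs. It assumes a backtracking regex engine (the default in Node.js); the helper name `timeFailedMatch` is hypothetical, and the input sizes are kept small so the script finishes quickly.

```javascript
// Pattern copied from SEED-04: a quantified group whose body is itself quantified.
const bioPattern = /^([a-zA-Z]+)*$/;

// Time one match attempt against n letters followed by a character that
// makes the overall match fail, so the engine has to try the many ways of
// splitting the letters between the inner and outer quantifier.
function timeFailedMatch(n) {
  const input = 'a'.repeat(n) + '!';
  const start = process.hrtime.bigint();
  const matched = bioPattern.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { n, matched, elapsedMs };
}

// In a backtracking engine each extra letter roughly doubles the work,
// so the elapsed time should grow steeply from one size to the next.
for (const n of [12, 16, 20]) {
  const { matched, elapsedMs } = timeFailedMatch(n);
  console.log(`n=${n} matched=${matched} elapsed=${elapsedMs.toFixed(2)}ms`);
}
```

Exact timings depend on the machine and Node.js version, which is why none are shown here.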
Category 2: AI-Only (Requires Understanding Intent)
These vulnerabilities require understanding what the code should do, not just pattern matching:
SEED-05: IDOR on Article Update
File: routes/api/articles.js
// SEED-05: Missing ownership check on article update
// Replace the existing PUT /articles/:slug handler with:
router.put('/:article', auth.required, function(req, res, next) {
  // VULNERABLE: No check that req.auth.id === article.author
  // Any authenticated user can update any article
  if(typeof req.body.article.title !== 'undefined'){
    req.article.title = req.body.article.title;
  }
  if(typeof req.body.article.description !== 'undefined'){
    req.article.description = req.body.article.description;
  }
  if(typeof req.body.article.body !== 'undefined'){
    req.article.body = req.body.article.body;
  }
  req.article.save().then(function(article){
    return res.json({article: article.toJSONFor(req.auth)});
  }).catch(next);
});
SEED-06: IDOR on Comment Deletion
File: routes/api/articles.js
// SEED-06: Missing ownership check on comment deletion
// Replace the existing DELETE /articles/:slug/comments/:comment handler:
router.delete('/:article/comments/:comment', auth.required, function(req, res, next) {
  // VULNERABLE: No check that req.auth.id === comment.author
  // Any authenticated user can delete any comment
  req.article.comments.remove(req.comment._id);
  req.article.save()
    .then(function() {
      Comment.find({_id: req.comment._id}).remove().exec();
      res.sendStatus(204);
    }).catch(next);
});
SEED-07: Mass Assignment on User Profile
File: routes/api/users.js
// SEED-07: Mass assignment vulnerability
// Replace the existing PUT /user handler with:
router.put('/user', auth.required, function(req, res, next) {
  User.findById(req.auth.id).then(function(user) {
    if(!user){ return res.sendStatus(401); }
    // VULNERABLE: Copies all user-provided fields without filtering
    // Attacker can set admin: true, role: 'admin', etc.
    Object.assign(user, req.body.user);
    return user.save().then(function() {
      return res.json({user: user.toAuthJSON()});
    });
  }).catch(next);
});
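For contrast, a common defense against mass assignment is an explicit allowlist. This is a minimal sketch rather than the fix applied in this demonstration; the field list `EDITABLE_USER_FIELDS` and the helper `pickAllowed` are hypothetical and would need to match your own schema.

```javascript
// Hypothetical allowlist; adjust to the fields your User schema exposes.
const EDITABLE_USER_FIELDS = ['username', 'email', 'bio', 'image'];

// Copy only allowlisted keys that are actually present on the input.
// Anything else (admin, role, ...) is silently dropped.
function pickAllowed(input, allowedFields) {
  const safe = {};
  if (input === null || typeof input !== 'object') {
    return safe;
  }
  for (const field of allowedFields) {
    if (Object.prototype.hasOwnProperty.call(input, field)) {
      safe[field] = input[field];
    }
  }
  return safe;
}

const body = { bio: 'hello', admin: true, role: 'admin' };
console.log(pickAllowed(body, EDITABLE_USER_FIELDS)); // { bio: 'hello' }
```

With a helper like this, the handler would call `Object.assign(user, pickAllowed(req.body.user, EDITABLE_USER_FIELDS))` instead of copying the request body wholesale. Password changes are left out of the allowlist on purpose, on the assumption that they go through whatever hashing method the model provides.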
SEED-08: Favorite Count Manipulation
File: routes/api/articles.js
// SEED-08: Direct manipulation of computed field
// Add this route:
router.put('/:article/favoritesCount', auth.required, function(req, res, next) {
  // VULNERABLE: Allows direct setting of favoritesCount
  // This should be a computed value, not user-settable
  req.article.favoritesCount = req.body.count;
  req.article.save().then(function(article) {
    return res.json({article: article.toJSONFor(req.auth)});
  }).catch(next);
});
Category 3: False Positive Triggers
These patterns look dangerous to SAST but are safe in context—testing AI triage accuracy:
SEED-09: SQL-like Syntax in Logging (Safe)
File: routes/api/articles.js
// SEED-09: SQL-like syntax that's actually just a log message
// Add to the GET /articles handler:
router.get('/', auth.optional, function(req, res, next) {
  // This is just a log message, not executed as SQL
  console.log("Query: SELECT * FROM articles WHERE author = '" +
    (req.query.author || 'anonymous') + "' -- for debugging");
  // ... rest of existing handler
});
SEED-10: Validated Input Used in Dangerous Pattern (Safe)
File: routes/api/tags.js
// SEED-10: Looks dangerous but input is validated
// Add this route:
router.get('/search', function(req, res, next) {
  var tag = req.query.tag;
  // Input is strictly validated - alphanumeric only, max 20 chars
  if (!/^[a-zA-Z0-9]{1,20}$/.test(tag)) {
    return res.status(400).json({ error: 'Invalid tag format' });
  }
  // SAST may flag this, but it's safe due to validation above
  var searchPattern = new RegExp(tag, 'i');
  Article.find({ tagList: searchPattern }).then(function(articles) {
    return res.json({ articles: articles });
  }).catch(next);
});
Apply the Seeds
# Create a patch file with all seed vulnerabilities
cat > seed-vulnerabilities.patch << 'EOF'
[PLACEHOLDER: Generated patch file from the above changes]
EOF
# Apply the patch
cd target-app
git apply ../seed-vulnerabilities.patch
# Commit the seeded version for reproducibility
git add -A
git commit -m "DEMO: Seed vulnerabilities for security review demonstration"
| ID | Vulnerability | Expected SAST | Actual SAST | AI Found? |
|---|---|---|---|---|
| SEED-01 | NoSQL Injection ($where) | ✓ | ✗ Missed | ✓ Yes |
| SEED-02 | Hardcoded JWT Secret | ✓ | ✗ Missed | ✓ Yes |
| SEED-03 | eval() with User Input | ✓ | ✓ Found | ✓ Yes |
| SEED-04 | Regex DoS | ✓ | ✗ Missed | ✗ Missed* |
| SEED-05 | IDOR: Article Update | ✗ | — | ✓ Yes |
| SEED-06 | IDOR: Comment Delete | ✗ | — | ✓ Yes |
| SEED-07 | Mass Assignment | ✗ | — | ✓ Yes |
| SEED-08 | Favorite Count Manipulation | ✗ | — | ✓ Yes |
| SEED-09 | SQL-like Logging (Safe) | ✓ FP | ✗ Not flagged | N/A |
| SEED-10 | Validated Regex (Safe) | ✓ FP | ✗ Not flagged | N/A |
Actual outcome: SAST found only 1 of 8 seeded vulnerabilities (12.5%). Combined workflow found 7 of 8 (87.5%). *ReDoS requires specialized tooling neither SAST nor general AI provides.
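The percentages follow directly from the table. As a small worked example, the sketch below recomputes them from the eight seeded vulnerabilities (SEED-09 and SEED-10 are safe patterns and are excluded); the helper name `detectionRate` is hypothetical.

```javascript
// Detection flags transcribed from the table above.
const seeds = [
  { id: 'SEED-01', sast: false, ai: true },
  { id: 'SEED-02', sast: false, ai: true },
  { id: 'SEED-03', sast: true, ai: true },
  { id: 'SEED-04', sast: false, ai: false },
  { id: 'SEED-05', sast: false, ai: true },
  { id: 'SEED-06', sast: false, ai: true },
  { id: 'SEED-07', sast: false, ai: true },
  { id: 'SEED-08', sast: false, ai: true },
];

// Count the items a detector caught and express it as a percentage.
function detectionRate(items, isDetected) {
  const hits = items.filter(isDetected).length;
  return { hits, total: items.length, percent: (hits / items.length) * 100 };
}

const sastOnly = detectionRate(seeds, (s) => s.sast);
const combined = detectionRate(seeds, (s) => s.sast || s.ai);
console.log(sastOnly); // { hits: 1, total: 8, percent: 12.5 }
console.log(combined); // { hits: 7, total: 8, percent: 87.5 }
```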
4. SAST Baseline Scan
With vulnerabilities seeded, we establish a baseline of pattern-detected issues using Semgrep.
Running the Scan
# Run Semgrep with security-focused rulesets
# Run from the directory that contains both target-app/ and security-review/
# Note: p/owasp ruleset was unavailable at time of testing
semgrep scan \
  --config p/security-audit \
  --config p/nodejs \
  --json \
  --output security-review/sast-results.json \
  target-app
# Human-readable summary
semgrep scan \
--config p/security-audit \
--config p/nodejs \
target-app
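Once the JSON report exists, it can be reduced to one row per finding for the triage step. This is a minimal sketch: the field names (`results`, `check_id`, `path`, `start.line`, `extra.severity`) reflect Semgrep's JSON output as I understand it and should be verified against your Semgrep version, and the sample report, including its rule ID prefixes and file paths, is hypothetical.

```javascript
// Reduce a parsed Semgrep JSON report to one summary row per finding.
function summarizeSemgrep(report) {
  return (report.results || []).map((r, i) => ({
    number: i + 1,
    rule: r.check_id.split('.').pop(), // drop the ruleset prefix
    severity: r.extra.severity,
    location: `${r.path}:${r.start.line}`,
  }));
}

// Hypothetical two-finding report shaped like Semgrep output.
const sample = {
  results: [
    {
      check_id: 'example.ruleset.code-string-concat',
      path: 'target-app/routes/api/articles.js',
      start: { line: 35, col: 18 },
      extra: { severity: 'ERROR', message: 'eval() with user input' },
    },
    {
      check_id: 'example.ruleset.express-jwt-not-revoked',
      path: 'target-app/routes/auth.js',
      start: { line: 14, col: 3 },
      extra: { severity: 'WARNING', message: 'JWT token revoking not configured' },
    },
  ],
};

const rows = summarizeSemgrep(sample);
console.table(rows);
```

For a real run, the report would be loaded with `JSON.parse(fs.readFileSync('security-review/sast-results.json', 'utf8'))`.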
SAST Results
SAST Findings Detail
| # | Rule ID | Severity | File | Description |
|---|---|---|---|---|
| 1 | express-session-hardcoded-secret | WARNING | app.js:27 | Hard-coded session secret 'conduit' |
| 2 | express-cookie-session-default-name | WARNING | app.js:27 | Default session cookie name |
| 3 | express-cookie-session-no-domain | WARNING | app.js:27 | Cookie domain not set |
| 4 | express-cookie-session-no-expires | WARNING | app.js:27 | Cookie expires not set |
| 5 | express-cookie-session-no-httponly | WARNING | app.js:27 | Cookie httpOnly not set |
| 6 | express-cookie-session-no-path | WARNING | app.js:27 | Cookie path not set |
| 7 | express-cookie-session-no-secure | WARNING | app.js:27 | Cookie secure not set |
| 8 | code-string-concat | ERROR | articles.js:35 | eval() with user input (SEED-03) ✓ |
| 9 | express-jwt-not-revoked | WARNING | auth.js:14 | JWT token revoking not configured |
| 10 | express-jwt-not-revoked | WARNING | auth.js:19 | JWT token revoking not configured |
Analysis: Seeded vs Detected
| Seeded Vulnerability | Expected | Actual Result |
|---|---|---|
| SEED-01: NoSQL $where injection | Should detect | NOT DETECTED |
| SEED-02: Hardcoded JWT secret (config) | Should detect | NOT DETECTED (different secret found) |
| SEED-03: eval() with user input | Should detect | DETECTED ✓ |
| SEED-04: Regex DoS | Should detect | NOT DETECTED |
| SEED-09: SQL-like logging (safe) | Should NOT detect | Not detected ✓ |
| SEED-10: Validated regex (safe) | Should NOT detect | Not detected ✓ |
Semgrep detected only 1 of the 4 pattern-based vulnerabilities we seeded. It missed the NoSQL injection ($where), the hardcoded secret in config, and the ReDoS pattern, even though all three are "pattern-based" issues. With these rulesets, detection depended on whether a matching rule existed, not on whether the bug was pattern-shaped. The other 9 findings are pre-existing issues in the original codebase (session configuration, JWT settings).
5. AI Triage of SAST Findings
Now we use the sast-triage subagent to classify each finding by analyzing context that Semgrep couldn't see.
Triage Prompt
Use the sast-triage subagent to analyze the Semgrep findings in
sast-results.json.
For each finding:
1. Locate the flagged code and read 30+ lines of context
2. Trace data flow backward to the source
3. Trace data flow forward to the sink
4. Evaluate mitigations (framework protections, validation, architecture)
5. Classify as:
- TRUE_POSITIVE: Exploitable vulnerability
- FALSE_POSITIVE: Safe in context (document why)
- NEEDS_INVESTIGATION: Cannot determine statically
Write detailed triage results to sast-triage-report.md
Triage Results
Full report: security-review/sast-triage-report.md
Triage Summary by Priority
| Priority | Finding | Classification | Action |
|---|---|---|---|
| CRITICAL | #8 code-string-concat (eval RCE) | TRUE_POSITIVE | Remove endpoint immediately |
| CRITICAL | #1 hardcoded-secret ('conduit') | TRUE_POSITIVE | Use environment variable |
| HIGH | #7 no-secure flag | TRUE_POSITIVE | Add secure: true for production |
| HIGH | #9, #10 jwt-not-revoked | TRUE_POSITIVE | Implement revocation mechanism |
| MEDIUM | #2 default-cookie-name | TRUE_POSITIVE | Use custom session name |
| MEDIUM | #4 session-timeout (60s) | TRUE_POSITIVE | Increase to reasonable value |
| LOW | #5 no-httponly explicit | TRUE_POSITIVE | Add explicit config |
| CONTEXTUAL | #3 no-domain | NEEDS_INVESTIGATION | Depends on deployment |
| CONTEXTUAL | #6 no-path | NEEDS_INVESTIGATION | Default `/` usually acceptable |
Example: True Positive (Critical)
// routes/api/articles.js:35
var filterExpr = req.query.expr; // User-controlled input
var filterFn = eval('(function(article) { return ' + filterExpr + '; })');
Source Analysis: User input from req.query.expr flows directly into the code.
Sink Analysis: eval() executes arbitrary JavaScript on the server.
Attack Scenario:
GET /api/articles/filter?expr=process.exit(1)
This terminates the server process as soon as the filter function runs against the first article. More dangerous payloads could execute system commands, read files, or establish reverse shells.
Mitigations Found: NONE. Code explicitly comments "VULNERABLE".
Conclusion: Critical RCE vulnerability allowing complete server compromise.
Example: Needs Investigation (Contextual)
Source Analysis: Session config doesn't set explicit domain property.
Sink Analysis: Without explicit domain, browser defaults to host-only (more restrictive).
Context Required: For single-domain apps, the default is actually MORE secure. For multi-subdomain apps sharing sessions, explicit configuration is needed.
Conclusion: Classification depends on deployment architecture—not a universal vulnerability.
AI triage classified 8 of 10 findings as confirmed TRUE_POSITIVEs requiring action, with 2 marked as contextual (NEEDS_INVESTIGATION). Zero false positives in this scan. The triage added critical context: prioritizing the RCE vulnerability, identifying the session secret as weak (not just "hardcoded"), and noting that some cookie settings depend on deployment architecture.
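The counts in this summary can be reproduced by tallying the triage table. A minimal sketch, with the finding numbers and classifications transcribed from the table above; the helper name `tally` is hypothetical.

```javascript
const CLASSIFICATIONS = ['TRUE_POSITIVE', 'FALSE_POSITIVE', 'NEEDS_INVESTIGATION'];

// One record per Semgrep finding, keyed by its number in the findings table.
const triage = [
  { finding: 1, classification: 'TRUE_POSITIVE' },
  { finding: 2, classification: 'TRUE_POSITIVE' },
  { finding: 3, classification: 'NEEDS_INVESTIGATION' },
  { finding: 4, classification: 'TRUE_POSITIVE' },
  { finding: 5, classification: 'TRUE_POSITIVE' },
  { finding: 6, classification: 'NEEDS_INVESTIGATION' },
  { finding: 7, classification: 'TRUE_POSITIVE' },
  { finding: 8, classification: 'TRUE_POSITIVE' },
  { finding: 9, classification: 'TRUE_POSITIVE' },
  { finding: 10, classification: 'TRUE_POSITIVE' },
];

// Count records per classification and reject unknown labels, so a typo
// in a triage report cannot silently drop a finding from the totals.
function tally(records) {
  const counts = Object.fromEntries(CLASSIFICATIONS.map((c) => [c, 0]));
  for (const record of records) {
    if (!CLASSIFICATIONS.includes(record.classification)) {
      throw new Error(`Unknown classification for finding ${record.finding}`);
    }
    counts[record.classification] += 1;
  }
  return counts;
}

const counts = tally(triage);
console.log(counts);
// { TRUE_POSITIVE: 8, FALSE_POSITIVE: 0, NEEDS_INVESTIGATION: 2 }
console.log((counts.NEEDS_INVESTIGATION / triage.length) * 100); // 20
```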
Additional Findings Noted During Triage
The AI triage also identified vulnerabilities NOT in the SAST report:
- IDOR on article update (missing ownership check)
- IDOR on comment delete (missing ownership check)
- Business logic flaw (direct favoritesCount manipulation)
- Hardcoded JWT secret fallback in config/index.js
These will be formally documented during the AI Deep Review (Section 7).
Document TRUE_POSITIVEs
8 TRUE_POSITIVEs confirmed from SAST triage. These will be merged with AI-discovered findings in findings.md after the deep review phase.
6. Generate and Validate Checklist
Now we generate a security checklist—a reusable methodology template that doesn't depend on SAST output. The checklist defines what to check; findings (from both SAST triage and AI review) document what we found.
Generate Checklist
Read templates/CHECKLIST_TEMPLATE.md and generate a security review
checklist for this Node.js/Express/MongoDB application.
Stack details:
- Express REST API with JWT authentication
- MongoDB with Mongoose ODM
- Features: users, articles, comments, favorites, follows
Verbosity: STANDARD
Write to checklist.md
Note: The checklist is a reusable methodology template. It doesn't include SAST findings—those go directly to findings.md via the parallel SAST branch.
Validate Checklist
Use the checklist-validator subagent to review checklist.md.
Find gaps by checking against:
- OWASP Top 10
- CWE Top 25
- Node.js/Express-specific security concerns
- MongoDB injection patterns
Merge valid additions with [VALIDATION] tags.
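The merge step in the prompt above can be pictured as follows. This is only an illustrative sketch: in the workflow the checklist is a markdown file edited by the subagent, so the `{ category, title }` item shape and the helper `mergeValidationItems` are hypothetical.

```javascript
// Merge validator suggestions into a checklist. Each accepted addition is
// tagged with [VALIDATION]; suggestions whose title already exists
// (case-insensitive) are skipped so the checklist gains no duplicates.
function mergeValidationItems(checklist, additions) {
  const seen = new Set(checklist.map((item) => item.title.toLowerCase()));
  const merged = checklist.slice();
  for (const addition of additions) {
    const key = addition.title.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      merged.push({ ...addition, title: `[VALIDATION] ${addition.title}` });
    }
  }
  return merged;
}

const original = [
  { category: 'Input Validation', title: 'NoSQL injection' },
  { category: 'Authentication', title: 'JWT secret management' },
];
const suggested = [
  { category: 'Input Validation', title: 'Prototype Pollution' },
  { category: 'Input Validation', title: 'nosql injection' }, // duplicate, skipped
];

const mergedChecklist = mergeValidationItems(original, suggested);
console.log(mergedChecklist.map((item) => item.title));
// [ 'NoSQL injection', 'JWT secret management', '[VALIDATION] Prototype Pollution' ]
```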
Checklist Summary
| Category | Original | Validation Adds | Total |
|---|---|---|---|
| Input Validation | 4 | +3 (Prototype Pollution, HPP, SSRF) | 7 |
| Authentication | 4 | +2 (Account Lockout, JWT None Attack) | 6 |
| Authorization | 3 | +1 (Mass Assignment Detail) | 4 |
| Data Protection | 3 | +1 (MongoDB Connection Security) | 4 |
| API Security | 3 | +1 (ReDoS Prevention) | 4 |
| Error Handling | 2 | +1 (Unhandled Promise Rejection) | 3 |
| Configuration | 4 | — | 4 |
| Business Logic | 3 | — | 3 |
| Logging & Monitoring | 0 | +1 (NEW CATEGORY) | 1 |
| TOTAL | 26 | +10 | 36 |
Coverage Score: 6/10 (before validation) → improved with 10 additional items
Key Validation Gaps Identified
| ID | Gap | Why Missed |
|---|---|---|
| VALIDATION-01 | Security Logging & Monitoring | OWASP A09:2021 — entire category missing |
| VALIDATION-02 | Prototype Pollution | Node.js-specific (CWE-1321) |
| VALIDATION-03 | Account Lockout/Brute Force | Rate limiting covers IP, not per-account |
| VALIDATION-04 | ReDoS Prevention | Node.js single-threaded vulnerability |
| VALIDATION-05 | HTTP Parameter Pollution | Express array coercion behavior |
| VALIDATION-06 | MongoDB Connection Security | MongoDB-specific auth/TLS |
| VALIDATION-07 | SSRF Prevention | OWASP A10:2021 — assumed N/A |
| VALIDATION-08 | Mass Assignment Detail | CWE-915 — needed detailed procedures |
| VALIDATION-09 | JWT None Algorithm Attack | Known attack not explicit in AUTH-001 |
| VALIDATION-10 | Unhandled Promise Rejection | Node.js async error handling |
The checklist-validator found that the generated checklist missed an entire OWASP Top 10 category (A09 - Logging & Monitoring) and several Node.js/MongoDB-specific issues. After validation, coverage improved from 26 to 36 items, addressing gaps in prototype pollution, ReDoS, and other technology-specific vulnerabilities.
7. AI Deep Review
Execute the security-scanner subagent for each checklist category, focusing on issues that pattern matching cannot detect.
Execution Prompt
Execute the security review using the subagent workflow:
1. Create plan.md from checklist.md, ordering by severity
2. For each checklist category:
a. Use security-scanner subagent to examine the codebase
b. Focus on authorization checks, business logic, data flow
c. Use findings-writer subagent to document issues
d. Update plan.md with completion status
Focus particularly on:
- IDOR potential in article/comment endpoints
- Authorization checks (or lack thereof)
- Business logic in favorites/follows
3. Use self-auditor subagent to verify completeness
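Step 1 of the prompt orders the plan by severity. A minimal sketch of such an ordering follows, assuming the four severity labels used in the findings tables; the category severities in the sample and the helper `orderBySeverity` are hypothetical.

```javascript
const SEVERITY_ORDER = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

// Sort plan items from most to least severe. Items with an unknown
// severity go last, and items of equal severity keep their original order.
function orderBySeverity(items) {
  const rank = (severity) => {
    const index = SEVERITY_ORDER.indexOf(severity);
    return index === -1 ? SEVERITY_ORDER.length : index;
  };
  return items
    .map((item, position) => ({ item, position }))
    .sort((a, b) => rank(a.item.severity) - rank(b.item.severity) || a.position - b.position)
    .map(({ item }) => item);
}

// Hypothetical severities assigned to checklist categories.
const plan = orderBySeverity([
  { category: 'Configuration', severity: 'MEDIUM' },
  { category: 'Authorization', severity: 'HIGH' },
  { category: 'Input Validation', severity: 'CRITICAL' },
  { category: 'Authentication', severity: 'HIGH' },
]);
console.log(plan.map((item) => item.category));
// [ 'Input Validation', 'Authorization', 'Authentication', 'Configuration' ]
```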
AI-Discovered Vulnerabilities
Complete Findings Table
| ID | Finding | Severity | Source | Seeded? |
|---|---|---|---|---|
| F-001 | Remote Code Execution via eval() | CRITICAL | SAST + AI | SEED-03 |
| F-002 | NoSQL Injection via $where | CRITICAL | AI Only | SEED-01 |
| F-003 | Hardcoded JWT Secret (config) | HIGH | AI Only | SEED-02 |
| F-004 | Hardcoded Session Secret (app.js) | HIGH | SAST + AI | Pre-existing |
| F-005 | IDOR on Article Update | HIGH | AI Only | SEED-05 |
| F-006 | IDOR on Comment Delete | HIGH | AI Only | SEED-06 |
| F-007 | Mass Assignment | HIGH | AI Only | SEED-07 |
| F-008 | favoritesCount Manipulation | MEDIUM | AI Only | SEED-08 |
| F-009 | Missing Secure Cookie Flag | MEDIUM | SAST + AI | Pre-existing |
| F-010 | No JWT Revocation | MEDIUM | SAST + AI | Pre-existing |
| F-011 | Default Session Cookie Name | LOW | SAST + AI | Pre-existing |
| F-012 | Short Session Timeout | LOW | SAST + AI | Pre-existing |
| F-013 | Missing httpOnly Config | LOW | SAST + AI | Pre-existing |
Seeded Vulnerabilities - Detection Results
| Seed | Description | SAST | AI |
|---|---|---|---|
| SEED-01 | NoSQL Injection ($where) | ❌ No | ✅ Yes |
| SEED-02 | Hardcoded JWT Secret (config) | ❌ No | ✅ Yes |
| SEED-03 | eval() with User Input | ✅ Yes | ✅ Yes |
| SEED-04 | Regex DoS | ❌ No | ❌ No* |
| SEED-05 | IDOR: Article Update | ❌ No | ✅ Yes |
| SEED-06 | IDOR: Comment Delete | ❌ No | ✅ Yes |
| SEED-07 | Mass Assignment | ❌ No | ✅ Yes |
| SEED-08 | favoritesCount Manipulation | ❌ No | ✅ Yes |
| SEED-09 | SQL-like Logging (Safe) | ❌ No (correct) | N/A |
| SEED-10 | Validated Regex (Safe) | ❌ No (correct) | N/A |
*SEED-04 (ReDoS) requires specialized regex analysis; not in high-priority review categories.
The AI deep review discovered 6 vulnerabilities that the SAST scan missed (the "AI Only" rows in the findings table):
- 3 authorization issues (two IDORs and mass assignment) that require understanding of intended access control
- 1 NoSQL injection via $where that the scanned rulesets did not match
- 1 hardcoded JWT secret hidden behind the || fallback pattern
- 1 business logic flaw (favoritesCount manipulation) that requires understanding of data integrity rules
The authorization and business logic issues require understanding what the code should do, which is intent that pattern matching cannot infer.
Sample AI-Only Finding: IDOR on Article Update
// VULNERABLE: No check that req.payload.id === article.author
router.put('/:article', auth.required, function(req, res, next) {
  // Any authenticated user can update ANY article!
  if(typeof req.body.article.title !== 'undefined'){
    req.article.title = req.body.article.title;
  }
  // ... updates proceed without ownership verification
});
Attack Scenario: User A creates article. User B (any authenticated user) sends PUT request to modify User A's article. No 403 is returned—the article is updated.
Impact: Complete loss of data integrity. Any user can modify any article's title, description, and body.
Why SAST Missed It: There's no "dangerous pattern" here—the code is syntactically valid. Detecting this requires understanding that updates to owned resources SHOULD verify ownership.
Recommended Fix:
if (req.article.author._id.toString() !== req.payload.id.toString()) {
  return res.sendStatus(403);
}
// Then proceed with updates
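Because the same check applies to both IDOR findings, it could be factored into reusable middleware. This is a sketch under assumptions, not the fix applied in this demonstration: `requireOwner` and the stub runner `run` are hypothetical names, and the accessors assume the request properties used in the recommended fix above.

```javascript
// Express-style middleware factory. getOwnerId and getUserId are accessor
// functions, so the check does not hard-code request property names.
function requireOwner(getOwnerId, getUserId) {
  return function (req, res, next) {
    const ownerId = getOwnerId(req);
    const userId = getUserId(req);
    // Deny when either id is missing or the string forms differ.
    if (ownerId == null || userId == null || String(ownerId) !== String(userId)) {
      return res.sendStatus(403);
    }
    return next();
  };
}

// Exercise the middleware with stub objects instead of a running server.
function run(middleware, req) {
  const outcome = { status: null, nextCalled: false };
  const res = {
    sendStatus(code) {
      outcome.status = code;
      return res;
    },
  };
  middleware(req, res, () => {
    outcome.nextCalled = true;
  });
  return outcome;
}

const ownsArticle = requireOwner(
  (req) => req.article.author._id,
  (req) => req.payload.id
);

const denied = run(ownsArticle, { article: { author: { _id: 'u1' } }, payload: { id: 'u2' } });
const allowed = run(ownsArticle, { article: { author: { _id: 'u1' } }, payload: { id: 'u1' } });
console.log(denied);  // { status: 403, nextCalled: false }
console.log(allowed); // { status: null, nextCalled: true }
```

Mounted as `router.put('/:article', auth.required, ownsArticle, handler)`, the check runs before the handler; a second instance reading `req.comment.author` would cover comment deletion in the same way.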
Self-Audit Results
The self-auditor verified completeness against the 36-item checklist. All high-priority categories were reviewed. SEED-04 (ReDoS) remained undetected as it requires specialized regex analysis not covered by standard security scanner patterns.
8. SAST vs AI: Side-by-Side Comparison
With both analyses complete, we can compare what each approach found.
Detection Rate Comparison
| Category | SAST | AI | Winner |
|---|---|---|---|
| Pattern-based (eval, hardcoded strings) | 1/4 (25%) | 3/4 (75%) | AI |
| Authorization (IDOR, access control) | 0/2 (0%) | 2/2 (100%) | AI |
| Business Logic | 0/2 (0%) | 2/2 (100%) | AI |
| Regex Security (ReDoS) | 0/1 (0%) | 0/1 (0%) | Neither |
| False Positive Triggers | 0/2 flagged (correct) | N/A | Both correct |
What Each Approach Found
SAST Found (10 findings)
- ✓ eval() with string concatenation (SEED-03)
- ✓ Hardcoded session secret in app.js
- ✓ Session cookie config issues (6)
- ✓ JWT revocation not configured (2)
- ✗ Missed: NoSQL $where, config secret, IDOR, mass assignment, ReDoS
AI Found (6 additional)
- ✓ NoSQL injection via $where (SEED-01)
- ✓ Hardcoded JWT secret with || fallback (SEED-02)
- ✓ IDOR on article update (SEED-05)
- ✓ IDOR on comment delete (SEED-06)
- ✓ Mass assignment (SEED-07)
- ✓ Business logic flaw (SEED-08)
- ✗ Missed: ReDoS (requires specialized analysis)
Summary Statistics
| Metric | SAST Only | SAST + AI Triage | Full Workflow |
|---|---|---|---|
| Raw Findings | 10 | 10 | 13 |
| True Positives (Seeded) | 1 of 8 | 1 of 8 | 7 of 8 |
| True Positives (Total) | Unknown | 8 confirmed | 13 confirmed |
| False Positives | Unknown | 0 | 0 |
| SAST Missed (AI Found) | n/a | n/a | 6 |
SEED-01 (NoSQL Injection via $where) is a pattern-based vulnerability that SAST should have detected, but didn't. The AI deep review found it, we fixed it, and AI validation confirmed the fix. This single finding demonstrates why SAST alone is insufficient—even "pattern-based" vulnerabilities can slip through SAST's rules.
SAST detected 12.5% of seeded vulnerabilities (1/8). The combined workflow detected 87.5% (7/8). The only miss (ReDoS) requires specialized tooling neither standard SAST nor general AI review provides. Neither approach alone provides complete coverage—they are genuinely complementary.
9. Remediation Demonstration
We demonstrate the remediation workflow on one finding from each detection category: SAST-detected and AI-detected.
SAST-Detected Fix: eval() RCE (F-001 / SEED-03)
Before (Vulnerable)
router.get('/filter', auth.optional, function(req, res, next) {
  var filterExpr = req.query.expr;
  // VULNERABLE: eval with user-controlled input
  var filterFn = eval('(function(article) { return ' + filterExpr + '; })');
  // ...
});
After (Fixed)
router.get('/filter', auth.optional, function(req, res, next) {
  var filterType = req.query.type;
  var filterValue = req.query.value || '';
  // Predefined safe filters - no user code execution
  var safeFilters = {
    'hasTag': function(article) { /* safe implementation */ },
    'minFavorites': function(article) { /* safe implementation */ },
    'recent': function(article) { /* safe implementation */ }
  };
  var filterFn = safeFilters[filterType];
  if (!filterFn) {
    return res.status(400).json({ error: 'Invalid filter type' });
  }
  // ...
});
Fix Reasoning: Removed eval() entirely. Replaced with predefined safe filter functions. User input is treated as data, not code.
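The filter bodies are elided in the fix above. One way they might look is sketched below for two of the three filters; the implementations, the helper `buildFilter`, and the sample articles are hypothetical and not taken from the remediated branch. The article fields `tagList` and `favoritesCount` are the ones that appear elsewhere in this article.

```javascript
// Each entry builds a predicate from the string filter value.
// The value is only ever compared or parsed, never executed.
const safeFilters = {
  hasTag: (value) => (article) =>
    Array.isArray(article.tagList) && article.tagList.includes(value),
  minFavorites: (value) => {
    const min = Number.parseInt(value, 10);
    return (article) => Number.isInteger(min) && article.favoritesCount >= min;
  },
};

// Look up a filter by its own-property name only, so inherited keys such as
// "constructor" or "toString" are rejected instead of resolving to a function.
function buildFilter(type, value) {
  if (!Object.prototype.hasOwnProperty.call(safeFilters, type)) {
    return null;
  }
  return safeFilters[type](String(value));
}

const articles = [
  { slug: 'a', tagList: ['node'], favoritesCount: 3 },
  { slug: 'b', tagList: ['mongo'], favoritesCount: 10 },
];

console.log(articles.filter(buildFilter('hasTag', 'node')).map((a) => a.slug));     // [ 'a' ]
console.log(articles.filter(buildFilter('minFavorites', '5')).map((a) => a.slug));  // [ 'b' ]
console.log(buildFilter('constructor', 'x'));                                       // null
```

The own-property lookup is a deliberate choice: a plain `safeFilters[filterType]` lookup returns inherited functions for names like `toString`, which would pass a truthiness check.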
AI-Detected Fix: NoSQL Injection (F-002 / SEED-01)
Before (Vulnerable)
router.get('/search', auth.optional, function(req, res, next) {
  var query = req.query.q;
  // VULNERABLE: User input directly in $where clause
  User.find({ $where: "this.username.includes('" + query + "')" })
    .then(function(users) {
      return res.json({ users: users.map(u => u.toProfileJSONFor()) });
    }).catch(next);
});
After (Fixed)
router.get('/search', auth.optional, function(req, res, next) {
  var query = req.query.q || '';
  // SECURITY FIX: Replace $where with safe $regex operator
  var sanitizedQuery = query.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  User.find({ username: { $regex: sanitizedQuery, $options: 'i' } })
    .then(function(users) {
      return res.json({ users: users.map(u => u.toProfileJSONFor()) });
    }).catch(next);
});
Fix Reasoning: Replaced dangerous $where (executes JavaScript) with safe $regex (pattern matching only). Added regex character escaping to prevent regex injection.
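The escaping step can be checked in isolation. The sketch below reuses the escape expression from the fix and adds a type guard, on the assumption that a query parameter may arrive as an array or object when it is repeated or written with brackets; the helper names `escapeRegExp` and `toSearchTerm` are hypothetical.

```javascript
// Escape every character that has special meaning in a regular expression,
// using the same character class as the fix above.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Only strings are searchable. Anything else (array, object, undefined)
// becomes an empty term instead of throwing on a missing .replace method.
function toSearchTerm(raw) {
  return typeof raw === 'string' ? escapeRegExp(raw) : '';
}

console.log(toSearchTerm('a.*b'));                              // a\.\*b
console.log(toSearchTerm(['x', 'y']) === '');                   // true
console.log(new RegExp(toSearchTerm('(a+)+$')).test('(a+)+$')); // true
console.log(new RegExp(toSearchTerm('a.*b')).test('axxb'));     // false
```

The last two lines show the intended effect: after escaping, metacharacters in the search term match only themselves.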
AI-Detected Fix: IDOR (F-005 / SEED-05)
Before (Vulnerable)
router.put('/:article', auth.required, function(req, res, next) {
  // VULNERABLE: No ownership check
  if(typeof req.body.article.title !== 'undefined'){
    req.article.title = req.body.article.title;
  }
  // ...
});
After (Fixed)
router.put('/:article', auth.required, function(req, res, next) {
  // SECURITY FIX: Added ownership verification
  if (req.article.author._id.toString() !== req.payload.id.toString()) {
    return res.sendStatus(403);
  }
  if(typeof req.body.article.title !== 'undefined'){
    req.article.title = req.body.article.title;
  }
  // ...
});
Fix Reasoning: Added ownership verification before allowing updates. Non-owners receive 403 Forbidden.
Verification Results
| Finding | Detection | Validation Method | Status |
|---|---|---|---|
| F-001 / SEED-03 (eval RCE) | SAST | SAST Re-scan (10→9 findings) | ✅ Fixed |
| F-002 / SEED-01 (NoSQL Injection) | AI | AI Re-scan | ✅ Fixed |
| F-005 / SEED-05 (IDOR) | AI | AI Re-scan | ✅ Fixed |
Key insight: Validation methods must match detection methods. SAST re-scans cannot verify fixes for vulnerabilities SAST never detected.
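For SAST-detected issues, the re-scan comparison can be automated by diffing the two reports. This is a minimal sketch assuming Semgrep-style JSON with `results`, `check_id`, and `path` fields; the helper names and both sample reports are hypothetical.

```javascript
// Identify a finding by rule and file rather than by line number, since a
// fix usually shifts line numbers elsewhere in the same file.
function findingKey(result) {
  return `${result.check_id}@${result.path}`;
}

// Return the keys of findings present before the fix but absent after it.
// Counts are tracked per key because one rule can fire several times per file.
function resolvedFindings(before, after) {
  const remaining = new Map();
  for (const result of after.results) {
    const key = findingKey(result);
    remaining.set(key, (remaining.get(key) || 0) + 1);
  }
  const resolved = [];
  for (const result of before.results) {
    const key = findingKey(result);
    const left = remaining.get(key) || 0;
    if (left > 0) {
      remaining.set(key, left - 1);
    } else {
      resolved.push(key);
    }
  }
  return resolved;
}

// Hypothetical reports: the eval finding disappears, the JWT findings stay.
const beforeFix = {
  results: [
    { check_id: 'code-string-concat', path: 'routes/api/articles.js' },
    { check_id: 'express-jwt-not-revoked', path: 'routes/auth.js' },
    { check_id: 'express-jwt-not-revoked', path: 'routes/auth.js' },
  ],
};
const afterFix = {
  results: [
    { check_id: 'express-jwt-not-revoked', path: 'routes/auth.js' },
    { check_id: 'express-jwt-not-revoked', path: 'routes/auth.js' },
  ],
};

console.log(resolvedFindings(beforeFix, afterFix));
// [ 'code-string-concat@routes/api/articles.js' ]
```

A diff like this says nothing about AI-detected issues, which never appeared in either report.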
This demonstrates the full detect → fix → validate cycle for both detection methods:
- SAST-detected (SEED-03): SAST found it → We fixed it → SAST confirms fix (10→9 findings)
- AI-detected (SEED-01, SEED-05): AI found them → We fixed them → AI confirms fixes
Git branch: remediated (based on seeded-vulnerabilities)
Full log: security-review/remediation-log.md
10. Lessons Learned
What Worked Well
- Parallel workflow: SAST and AI review ran independently, and findings merged cleanly in findings.md
- Seeded vulnerabilities: Predictable results enabled clear comparison; 7 of 8 detected by combined workflow
- Triage value: 20% of findings required contextual analysis (NEEDS_INVESTIGATION); zero false positives confirmed
- Subagent separation: Each agent had clear responsibility: triage classifies, scanner finds, writer documents
- Checklist validation: Validator caught 10 gaps including an entire OWASP Top 10 category (A09: Logging)
What Could Improve
- SAST ruleset coverage: Semgrep's default rulesets missed NoSQL $where injection and ReDoS despite both being "pattern-based" issues. Custom rules or additional rulesets (MongoDB-specific, regex security) would improve detection.
- Specialized analysis phases: ReDoS was the only seeded vulnerability neither SAST nor AI detected. Regex security requires specialized algorithmic analysis (NFA complexity) that pattern matching and general AI review cannot provide.
- Repository drift: The original documented repository had migrated to TypeScript/Prisma. Using forks requires verifying that they match the documented stack assumptions.
- Validation item prioritization: Items added by checklist-validator (like ReDoS) received less execution priority than original items.
- Unused code analysis: SEED-04's validateBio() was defined but never called. A "dead code with security implications" check could flag latent vulnerabilities.
Unexpected Findings
- SAST detected only 12.5% of seeded vulnerabilities: We expected roughly 50% based on the "pattern-based" classification. The || fallback pattern for hardcoded secrets is a significant SAST blind spot.
- Pre-existing issues dominated the SAST output: 9 of the 10 raw findings came from the base application. After triage they correspond to 6 genuine security issues (session configuration, JWT revocation) that SAST correctly identified.
- Zero false positives: Both false positive triggers (SEED-09, SEED-10) were correctly NOT flagged by SAST, contrary to the expectation that they would trigger and require triage dismissal.
Key Takeaways
SAST found 1/8 seeded vulnerabilities; combined workflow found 7/8. Neither alone is sufficient.
AI triage explains WHY findings matter, not just THAT they match patterns. Context transforms warnings into actionable items.
Keep methodology (checklist) separate from results (findings.md). The checklist is reusable; findings are project-specific.
ReDoS requires regex complexity analysis. Some vulnerability classes need purpose-built detection beyond general review.
Every classification decision is documented with reasoning. Essential for compliance and knowledge transfer.
SAST re-scans cannot verify fixes for vulnerabilities SAST never detected. AI-detected issues require AI validation.
The combined SAST + AI workflow found seven times as many seeded vulnerabilities as SAST alone (87.5% vs 12.5%), while AI triage confirmed zero false positives and provided clear prioritization. The workflow is reproducible on any codebase by adjusting the stack parameters in checklist generation.
11. Frequently Asked Questions
What application is reviewed in this demonstration?
The RealWorld "Conduit" application, a Medium.com clone implemented in Node.js/Express. It includes JWT authentication, articles, comments, favorites, and follows, providing a realistic attack surface while being small enough (~3,000 lines) to review completely.
Why use Semgrep for the SAST scan?
Semgrep is free, open-source, runs locally without uploading code, has excellent security rule libraries, and produces clear output. It represents modern SAST capabilities while being accessible to anyone following along.
Why seed vulnerabilities instead of reviewing the code as-is?
Real codebases have unpredictable security postures: some may have few issues, others many. Seeding ensures guaranteed findings to demonstrate the workflow, a balanced comparison between SAST and AI capabilities (pattern-based vs logic-based), reproducibility for anyone following along, and clear educational examples of specific vulnerability patterns.
Does this demonstration cover dependency vulnerabilities?
No. This demonstration focuses on code-level vulnerabilities. Dependency vulnerabilities (outdated packages with known CVEs) are a separate concern handled by tools like npm audit, Snyk, or Dependabot. Semgrep is a code pattern scanner, not a dependency scanner. Run npm audit separately in your CI/CD pipeline.
How does AI triage of SAST findings work?
The sast-triage subagent examines each finding in context by tracing data flows backward to sources and forward to sinks, evaluating mitigations the pattern-matching tool couldn't see. It documents reasoning for each classification, creating an audit trail. TRUE_POSITIVEs go directly to findings.md.
Why don't SAST findings feed into the checklist?
The checklist is a reusable methodology template: it defines what to check, not what was found. SAST findings are specific vulnerabilities, so they go directly to findings.md alongside AI-discovered issues. This keeps the checklist clean for reuse across projects while ensuring all vulnerabilities end up in one place for remediation.
What did the AI review find that SAST could not?
AI review found authorization issues (IDOR vulnerabilities), business logic flaws, and missing access control checks. These require understanding what the code should do versus what it does, which is analysis that pattern matching fundamentally cannot perform.
How long does the full workflow take?
For a ~3,000 line application: SAST takes under 1 minute, AI triage takes 10-15 minutes, deep review takes 30-45 minutes, and remediation takes 15-20 minutes. Total is approximately 1-1.5 hours for complete coverage with documentation.
Can I reproduce this workflow on my own codebase?
Yes. All commands, prompts, and configurations are provided. Install Semgrep, configure the subagents from Parts 2 and 3, and adjust the checklist generation prompt for your technology stack. The methodology is stack-agnostic.