TL;DR — Regex blocklists are bypassed by comment wrapping and Unicode tricks; a single validation layer never covers all failure classes. Chain five deterministic layers: statement-type allowlist → AST structural check → semantic schema binding → policy injection → read-only DB connection. Add a SQL Compile Gate (PARSEONLY / EXPLAIN) before execution and cryptographically-signed audit logs for SOC 2 / HIPAA. All validation must run after LLM generation, in code — never delegated to a second LLM call [1] [3].
Why Single-Layer Validation Fails
Regex keyword blocklists are the most common first approach — and the most commonly bypassed. Classical SQL injection detection techniques drop from ~98% accuracy on standard benchmarks to 60% against LLM-generated queries [1], because comment-wrapped commands (/* DROP */ TABLE users), Unicode homoglyphs, and semicolon-delimited multi-statement sequences all evade pattern matching.
A 50,000-query production evaluation found a more fundamental problem: most broken LLM queries execute successfully and return data [2]. The failure distribution:
| Failure class | Description | Caught by |
|---|---|---|
| Schema hallucination | Column/table exists in training data, not in this DB | Semantic binding |
| Wrong join path | Fabricated FK relationships between unrelated tables | Semantic binding |
| Missing required filter | Omits status='active', tenant_id, date range |
Policy injection |
| Wrong table selection | payments instead of revenue_recognition |
Semantic binding |
| Unsafe statement type | DELETE, DROP, ALTER generated from ambiguous intent |
Statement allowlist |
| Syntax errors | Unbalanced parens, invalid keywords | Parser / Compile Gate |
Syntax errors — the only class reliably caught by a basic parser — are the least common. Intent errors masquerading as working queries are the dominant risk [2]. Each layer below targets a distinct failure class.
The Five-Layer Validation Pipeline
Layer 0 — Intent Classification (Pre-SQL)
Before generating SQL at all, classify whether the user’s message warrants a database query. Off-topic, ambiguous, or malformed inputs should be refused early [3].
The LatentRefusal mechanism [4] pushes this further: a lightweight probe on the LLM’s intermediate hidden activations intercepts unanswerable or schema-mismatched queries before any SQL tokens are emitted. Its Tri-Residual Gated Encoder suppresses schema noise and amplifies question-schema mismatch signals, adding ~2ms overhead while achieving 88.5% refusal F1 across four benchmarks [4]. The complementary “Query Carefully” [18] approach detects ambiguous queries at parse time before execution. Both gates are single-pass, pre-generation, and execution-free.
Layer 1 — Statement-Type Allowlist
Parse the generated SQL and enforce a default-deny policy on statement types [5]:
- Block:
DROP,TRUNCATE,ALTER,GRANT,REVOKE,CREATE,INSERT,UPDATE,DELETE - Block: multi-statement queries (semicolon-delimited sequences)
- Block: stored procedure calls,
EXEC, dynamic SQL constructors - Allow:
SELECTonly (for read-only agents)
Regex is insufficient here because of the bypass patterns above. Use a real parser: sqlglot ⭐ 9.3k (Jun 2026) traverses the AST to identify top-level statement type programmatically [6].
Layer 2 — AST Structural Validation
Traverse the parse tree to validate structure against what this user is authorized to see [7]:
- Table allowlist: every
FROM/JOINreference must match the user’s role-scoped table set - Column allowlist: every column in
SELECT,WHERE,JOIN ON, subqueries, and CTEs must be in the user’s per-role column list — catches hallucinated fields immediately - Sensitive column blocklist: columns tagged PII / financial are rejected at projection level, even when buried in CTEs or derived expressions [8]
# Minimal example: block forbidden types, check allowed tables via sqlglot AST
import sqlglot
ALLOWED_TABLES = {"orders", "products", "customers"}
FORBIDDEN_TYPES = {
sqlglot.exp.Drop, sqlglot.exp.Delete,
sqlglot.exp.Update, sqlglot.exp.Insert,
}
def validate(sql: str) -> tuple[bool, str]:
parsed = sqlglot.parse_one(sql)
if type(parsed) in FORBIDDEN_TYPES:
return False, f"Forbidden statement type: {type(parsed).__name__}"
tables = {t.name.lower() for t in parsed.find_all(sqlglot.exp.Table)}
if not tables.issubset(ALLOWED_TABLES):
blocked = tables - ALLOWED_TABLES
return False, f"Unauthorized tables: {blocked}"
return True, "ok"
sqlglot ⭐ 9.3k handles 31 SQL dialects and supports AST mutation via transform() [6]. For simpler tokenization, sqlparse ⭐ 3.9k is lighter but is explicitly a non-validating parser that will miss edge-case constructs [9]. For PostgreSQL-native pipelines, pg_query wraps libpg_query for exact PG parse trees [19].
Layer 3 — Semantic / Schema-Binding Validation
The AST check validates shape; semantic validation validates meaning against the real schema [10]. A catalog-aware validator resolves every reference against live database metadata:
| Semantic check | Error code | What it catches |
|---|---|---|
| Column resolution | UNKNOWN_COLUMN |
Hallucinated fields not in current schema |
| Ambiguity detection | AMBIGUOUS_COLUMN |
Unqualified names present in multiple joined tables |
| Type compatibility | TYPE_MISMATCH |
Function args with wrong types, dialect mismatches |
| Required filter enforcement | MISSING_REQUIRED_FILTER |
Missing tenant_id, status, date range |
| Join predicate check | CARTESIAN_JOIN |
Many-to-many cross joins without predicates |
| Sensitive field exposure | RESTRICTED_COLUMN |
PII in projections, filters, or expressions |
The validator returns structured codes — allow, deny, warn, or repair — with repair hints the LLM can safely act on [10].
SQL Compile Gate — engine-native dry-run before execution [7] [11]:
- SQL Server:
SET PARSEONLY ON(parse phase only) orSET SHOWPLAN_XML ON(compile + plan, no execute) — validates column existence and user permissions without running the query - PostgreSQL / BigQuery / Snowflake:
EXPLAIN— rejects insane cost estimates and full scans on large tables - Engine-native validation catches edge cases (dialect-specific syntax, permission grants) that custom parsers miss
Layer 4 — Policy Injection
Do not trust the LLM to consistently include required business-logic filters. Mechanically rewrite the AST to inject non-negotiable constraints before execution [7]:
- Tenant isolation: append
WHERE tenant_id = :ctx.tenantto every top-level query - Time-window filters: inject
AND created_at >= :ctx.start_date - Status filters: add
AND status = 'active'where schema requires it
Policies are evaluated against the parsed AST, never against the raw prompt text — a regex-matched tenant_id in the user’s message is not the same as an enforced predicate in the generated SQL [5]. sqlglot’s transform() API enables programmatic AST mutation that survives alias and CTE resolution [6].
Layer 5 — Database-Level Enforcement
Application layers 1–4 are the primary controls. Layer 5 is the backstop if any prior layer has a bug [12]:
- Read-only DB user:
GRANT SELECTonly — cannot executeINSERT,UPDATE,DELETEregardless of LLM output - Row-Level Security: database-enforced tenant isolation regardless of query structure (PostgreSQL RLS, SQL Server RLS) [17]
- Read replica: route LLM queries to an analytics replica; production writes are physically isolated [3]
- Result row cap:
LIMIT 1000prevents resource exhaustion from missingLIMITclauses [3]
RLS limitations — understand the ceiling [5] [13]: RLS cannot enforce column-level redaction on SELECT *, cannot block DDL statements when the user has valid grants, and has exploitable CVEs — CVE-2024-10976 (RLS bypass below subqueries in PostgreSQL) and CVE-2025-8713 (optimizer statistics leaking RLS-hidden rows) [13]. RLS is defense-in-depth, not the primary control.
Evidence Logging
Every validation decision — accepted or rejected — must produce a cryptographically-signed, timestamped record [5]:
- User identity and role context
- Natural language input
- Generated SQL (as-generated, before policy injection)
- Policy decisions, violation codes, and repair attempts
- Final executed SQL and result row count
- Which layers accepted / rejected the query
Required for SOC 2, HIPAA, and forensic analysis. Log the LLM prompt alongside the SQL so backdoor attacks can be traced to their trigger inputs [16].
Emerging Threats
Backdoor Attacks (ToxicSQL)
The ToxicSQL framework [14] (SIGMOD 2026 [15]) shows that poisoning 0.44% of training data achieves a 79.41% attack success rate for embedding backdoors that generate predefined malicious SQL on trigger inputs while maintaining normal behavior on clean queries [14]. Stealthy semantic and character-level triggers make poisoned samples hard to detect in training pipelines.
⚠ Implication: fine-tuned models are not inherently safer than base models. Application-layer validation remains mandatory regardless of model provenance or fine-tuning source.
Repair Loop Escapes
When generated SQL fails validation, returning the error to the LLM for repair can produce queries that remove safety filters to avoid the error. Mitigations [8]:
- Re-validate every repaired query with identical strictness — no exceptions for repair attempts
- Treat policy violations and syntax errors as separate error classes: return policy violations to the application logic, not to the LLM
- Cap repair iterations (2–3 max); escalate to human review if retries fail
Schema Curation as Proactive Reduction
Validation is easier if the LLM cannot reference unauthorized objects in the first place [7]:
- Strip sensitive columns from DDL in the system prompt —
email,hashed_password,ssnnever appear in schema context [12] - Use a vector index of table definitions tagged by access role; the model only receives schema for tables the user’s role can see [7]
- Never pass production row data to the LLM — only format specifications and low-cardinality vocabularies
Schema curation reduces obvious hallucinations but does not replace post-generation validation: LLMs still produce probabilistic output, and that output still requires all five layers.
Tools Reference
| Library | AST support | Dialects | Stars | Best for |
|---|---|---|---|---|
| sqlglot | Full | 31 dialects | ⭐ 9.3k | Cross-dialect traversal, mutation, type inference |
| sqlparse | Partial | DB-agnostic | ⭐ 3.9k | Lightweight tokenization; non-validating |
| pg_query | Full (PG) | PostgreSQL only | — | Exact PG parse tree via libpg_query |
| sqlfluff | No | Linting only | — | SQL style / dialect linting, not safety gates |
Implementation: Crawl → Walk → Run
Start constrained and expand based on observed patterns — the use cases you don’t anticipate are the ones that will expose you [3] [20]:
- Crawl — expose 2–3 specific business queries, hard-code allowed tables/columns, manually review generated SQL in staging for 2 weeks
- Walk — expand to a role-scoped schema subset, add automated AST validation and semantic binding, enable audit logging
- Run — roll out across roles with monitoring; add the SQL Compile Gate and anomaly alerts on the audit log stream
Do not skip the Crawl phase. A narrow, manually-reviewed allow-set is far safer than a broad schema with “should-be-fine” LLM constraints.