Flipmap · Semantic Layer

NL2SQL Semantic Resolution Pipeline

Natural language to verified, executable SQL — end-to-end intelligence in milliseconds

Production Architecture · LLM + RAG + Compiler · PE-2361 · Semantic Layer MVP

Why this is hard to build

NL2SQL is a solved problem for simple schemas — but enterprise blockchain analytics requires multi-signal RAG retrieval, schema-aware generation, compiler-backed validation, and a self-improving feedback loop. Every stage below is a separate engineering challenge. SQL compilation is only one small piece of the puzzle.

5 Pipeline Phases · 4 RAG Strategies · 40 Agent Steps Max · 0.75 Confidence Gate
User Input — Natural Language Query
"What is the market cap of USDC on Ethereum over the last 30 days?"
01 · Analysis (LLM Structured Parse)
1. Structural Parse: LLM decomposes NL into structured signals
2. Intent Extraction: primary + secondary intent with confidence
3. Component Mapping: metric signal, entity signal, domain hint
4. Filter Normalisation: time range, value filters, limit/order
5. Canonical Form: normalised representation for few-shot matching
6. Address Scan: regex extraction of blockchain addresses (0x...)
7. Confidence Score: base + ambiguity bonus → clamp [0, 1]
Output: NormalizedQuery — intent, atomic components, normalised filters, canonical form, entities, and processing trace
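The address scan and confidence clamp above are small, deterministic pieces and can be sketched directly. Everything here — the field layout of `NormalizedQuery`, the function names — is illustrative, not the production schema:

```python
import re
from dataclasses import dataclass, field

# EVM-style address: "0x" followed by 40 hex characters (illustrative pattern)
ADDRESS_RE = re.compile(r"0x[a-fA-F0-9]{40}")

@dataclass
class NormalizedQuery:
    """Assumed shape of the Analysis phase output (field names are guesses)."""
    intent: str
    metric_signal: str
    entity_signal: str
    domain_hint: str
    filters: dict = field(default_factory=dict)
    canonical_form: str = ""
    entities: list = field(default_factory=list)
    confidence: float = 0.0

def scan_addresses(text: str) -> list:
    """Step 6: regex extraction of blockchain addresses."""
    return ADDRESS_RE.findall(text)

def score_confidence(base: float, ambiguity_bonus: float) -> float:
    """Step 7: base + ambiguity bonus, clamped to [0, 1]."""
    return max(0.0, min(1.0, base + ambiguity_bonus))
```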
02 · Retrieval (RAG · Hybrid Search · Embedding)
Model Retrieval Per Component
1. Probe Building: construct semantic search query from metric signal + entity + domain hint
2. Vector ANN Search: embed probe → approximate nearest neighbours on model namespace (1024-dim)
3. BM25 Full-Text Search: keyword matching on model metadata — names, descriptions, domains
4. Reciprocal Rank Fusion: fuse vector + BM25 rankings → deduplicated candidate list
5. Multi-Signal Scoring: 4-signal weighted blend → composite score per candidate
6. Model Selection + Fallback: top-1 candidate, or LLM-suggested model (conf ×0.9) when no results
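Step 4's fusion can be sketched as standard Reciprocal Rank Fusion. The smoothing constant k=60 is the common default from the RRF literature and an assumption here, not a value taken from the pipeline:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked candidate lists (e.g. vector ANN + BM25) into one
    deduplicated list. Each candidate's score is the sum of 1/(k + rank)
    over every list it appears in; higher total wins."""
    scores = {}
    for ranking in rankings:
        for rank, candidate in enumerate(ranking, start=1):
            scores[candidate] = scores.get(candidate, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A candidate ranked well by both signals beats one ranked top by a single signal, which is exactly the deduplicating behaviour the step describes.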
Scoring Weights: base score ×0.82 · domain match ×0.08 · historical prior ×0.07 · anti-pattern −0.03
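A minimal sketch of the 4-signal blend using the weights above; the assumption that each input signal lies in [0, 1] is ours:

```python
def composite_score(base, domain_match, hist_prior, anti_pattern):
    """Weighted blend per candidate model, per the Scoring Weights above.
    The anti-pattern signal is a penalty, hence the subtraction."""
    return (base * 0.82
            + domain_match * 0.08
            + hist_prior * 0.07
            - anti_pattern * 0.03)
```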
Few-Shot Retrieval (Dual Strategy)
1. Canonical Embedding: embed the normalised canonical form from the Analysis phase
2. Vector Search on Gold Records: cosine similarity against known question→query pairs
3. Jaccard Pattern Matching: tokenised query pattern similarity as secondary signal
4. Composite Scoring: vector ×0.7 + pattern ×0.3 → ranked matches
5. Top-5 Selection: best-matching examples with similarity scores for prompt injection
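The dual scoring in steps 3 and 4 reduces to two small functions. The whitespace tokenisation and the assumption that both similarities are already in [0, 1] are ours:

```python
def jaccard(a_tokens, b_tokens):
    """Step 3: token-set Jaccard similarity as the secondary signal."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def few_shot_score(vector_sim, pattern_sim):
    """Step 4: composite score, vector x0.7 + pattern x0.3."""
    return vector_sim * 0.7 + pattern_sim * 0.3
```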
Strategy Selection
· Ground Truth Service — two-pass retrieval against curated gold records
· Vector Store — direct embedding search on question namespace
Post-merge: Join paths mapped across multi-model queries · Warnings flagged for missing joins · Output: RetrievalContext — per-component results, multi-model context, few-shot matches, interpretation context
03 · Generation (LLM Structured Output + Validation)
1. Prompt Assembly: schema + few-shots + intent + context
2. LLM Generation: structured output → SemanticQuery candidate
3. Schema Validation: FQN member names, valid SQL operators
4. Semantic Validation: measures belong to resolved models
5. Completeness Check: all query components addressed
6. Filter Injection: deterministic override of conflicting LLM clauses
7. Confidence Score: composite weighted score → [0, 1]
Output Confidence Score
Overall = LLM self-confidence ×0.8 + few-shot similarity ×0.2 → output range [0, 1]
Output: SemanticQuery (measures, dimensions, filters, time dimensions, segments, order, limit) + resolved models + full confidence breakdown + reasoning trace
◇ Decision Gate: Overall Confidence ≥ 0.75? (AGENTIC_FALLBACK_THRESHOLD)
✓ YES (high confidence) → SQL Compilation
⚡ NO / SQL error → Agentic Fallback
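Taken together, the confidence formula and the gate amount to a few lines. The routing labels and the `sql_error` flag are illustrative:

```python
AGENTIC_FALLBACK_THRESHOLD = 0.75

def overall_confidence(llm_self_conf, few_shot_sim):
    """Overall = LLM self-confidence x0.8 + few-shot similarity x0.2."""
    return llm_self_conf * 0.8 + few_shot_sim * 0.2

def route(confidence, sql_error=False):
    """Decision gate: high confidence and no SQL error -> compile directly;
    anything else -> agentic fallback."""
    if confidence >= AGENTIC_FALLBACK_THRESHOLD and not sql_error:
        return "sql_compilation"
    return "agentic_fallback"
```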
04a · SQL Compilation & Execution (Schema Compiler · Snowflake)
1. Schema Translation: SemanticQuery → compiler YAML schema with validated joins
2. Query Compilation: measures, dimensions, filters → Snowflake SQL via dialect adapter
3. Time Resolution: relative expressions → absolute ISO date pairs
4. Parameter Inlining: parameterised query → runnable SQL
5. Execute Against Database: run SQL → rows + metadata + execution time
⚠ SQL execution error → triggers Agentic Fallback →
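Step 3's time resolution, sketched for the "last N days" case only. The expression grammar the compiler actually supports is not specified here, so this is an illustrative subset:

```python
from datetime import date, timedelta

def resolve_time_range(expr, today=None):
    """Resolve a relative expression like 'last 30 days' into an absolute
    ISO date pair (start, end). Handles only this one pattern."""
    today = today or date.today()
    parts = expr.lower().split()
    if len(parts) == 3 and parts[0] == "last" and parts[2] == "days":
        start = today - timedelta(days=int(parts[1]))
        return start.isoformat(), today.isoformat()
    raise ValueError(f"unsupported time expression: {expr!r}")
```

For the sample query above, "last 30 days" anchored at 2024-01-31 would resolve to the pair ("2024-01-01", "2024-01-31").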
04b · Agentic Fallback (Max 40 Steps)
Tool-Calling Loop: list_models · search_models_rag · inspect_model · run_test_query
1. Context Injection: agent receives the original query, normalised intent, RAG context, and original SQL + error (if any)
2. Iterative Resolution: searches models, inspects schemas, tests queries — up to 40 tool calls
3. Exit Decision: finalize (working SQL found) or give_up (return original pipeline result)
All tool-call traces + reasoning captured as learning candidates
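The loop's control flow can be sketched as below. The `agent` callable and its `(action, payload)` protocol are assumptions made for illustration; only the tool names and the step cap come from the section above:

```python
MAX_STEPS = 40
TOOLS = {"list_models", "search_models_rag", "inspect_model", "run_test_query"}

def agentic_fallback(agent, context):
    """Tool-calling loop: up to MAX_STEPS tool calls, exiting on
    'finalize' (working SQL) or 'give_up' (original pipeline result).
    Every tool call is recorded in the trace as a learning candidate."""
    trace = []
    for _ in range(MAX_STEPS):
        action, payload = agent(context, trace)
        if action == "finalize":
            return {"sql": payload, "trace": trace}
        if action == "give_up":
            return {"sql": context.get("original_sql"), "trace": trace}
        assert action in TOOLS, f"unknown tool: {action}"
        trace.append((action, payload))
    # Step budget exhausted: fall back to the original pipeline result.
    return {"sql": context.get("original_sql"), "trace": trace}
```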

Resolved SQL Result

SemanticQuery · Compiled SQL · Query rows & metadata · Full confidence breakdown · Timing trace

05 · Learning Loop (↺ Async · Closed Loop)
1. Capture Failures: SQL errors, agent traces, tool-call sequences
2. Store Candidates: enrichment suggestions with source + confidence
3. Human Review: pending → approved gate before any enrichment
4. Model Enrichment: new measures, dimensions, joins, aliases
5. Vector Store Update: re-embed enriched models into the search namespace
6. Improve Retrieval: future queries benefit from richer models + more examples
↑ Approved enrichments feed back into Phase 02 Retrieval — directly improving model resolution and few-shot matching for all future queries
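The pending → approved gate in steps 2 and 3 is the key invariant: nothing enriches a model without human sign-off. A minimal sketch with assumed field names:

```python
from dataclasses import dataclass

@dataclass
class EnrichmentCandidate:
    """A learning candidate captured from a failure or agent trace
    (field names are illustrative, not the production schema)."""
    suggestion: str
    source: str          # e.g. "agent_trace", "sql_error"
    confidence: float
    status: str = "pending"

def review(candidate, approved):
    """Human review gate: only approved candidates may proceed to
    model enrichment and re-embedding."""
    candidate.status = "approved" if approved else "rejected"
    return candidate.status == "approved"
```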