Research Prompt · Deep Dive Report · 2025

Chat with your
Database.

Vanna AI is an open-source, Python-based Text-to-SQL framework that uses Retrieval-Augmented Generation (RAG) + LLMs to let anyone query a database in plain English — no SQL expertise required.

Text-to-SQL · RAG Architecture · Open Source · Python SDK · Self-Improving · Production-Ready · PostgreSQL · Snowflake · BigQuery · MySQL

How Vanna AI Works

01 · Train on Schema
Feed Vanna your DDL statements, documentation, business terminology, and example SQL queries. This builds a vector store of "known patterns."
02 · Ask in English
The user submits a natural-language question. Vanna embeds the question and searches the vector store for the 10 most semantically similar training examples.
03 · Generate SQL
The retrieved context and the question are passed to an LLM (GPT-4, Claude, Gemini, Ollama, etc.), which drafts SQL tailored to your schema.
04 · Execute & Learn
The SQL runs against your database. Results are shown as tables or charts, and successful question/SQL pairs are saved back to Tool Memory, so the model keeps improving.
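The retrieval in step 02 boils down to ranking stored examples by vector similarity and keeping the top k. A minimal, self-contained sketch (toy 2-D "embeddings" stand in for a real embedding model; Vanna itself delegates this to a vector store like ChromaDB):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(question_vec, store, k=10):
    # Rank stored (vector, payload) pairs by similarity to the question
    # and return the top-k payloads -- the "retrieved context" of step 02.
    ranked = sorted(store, key=lambda item: cosine(question_vec, item[0]), reverse=True)
    return [payload for _, payload in ranked[:k]]

# Toy embeddings: in practice these come from an embedding model.
store = [
    ([1.0, 0.0], "SELECT SUM(amount) FROM orders"),
    ([0.9, 0.1], "SELECT user_id, SUM(amount) FROM orders GROUP BY user_id"),
    ([0.0, 1.0], "CREATE TABLE users (id INT)"),
]
print(retrieve([1.0, 0.05], store, k=2))
```

The same shape scales up directly: swap the list for a vector database and the toy vectors for real embeddings, and you have the retrieval half of the pipeline.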
RAG Pipeline — Retrieval-Augmented Generation Flow
💬 User Question (natural language) → 🔢 Embedding (vectorize query) → 🗄️ Vector Store (ChromaDB / Qdrant) → 📋 Retrieved Context (top-10 similar SQL) → 🤖 LLM (GPT-4 / Claude / Ollama) → SQL Query (executed on DB)
from vanna.openai import OpenAI_Chat
from vanna.chromadb import ChromaDB_VectorStore

# 1. Initialize with your LLM + vector store
class MyVanna(ChromaDB_VectorStore, OpenAI_Chat):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        OpenAI_Chat.__init__(self, config=config)

vn = MyVanna(config={'api_key': 'sk-...', 'model': 'gpt-4'})

# 2. Connect to your database
vn.connect_to_postgres(host='localhost', dbname='mydb', ...)

# 3. Train on your schema (run once)
vn.train(ddl="CREATE TABLE orders (id INT, amount DECIMAL, user_id INT, ...)")
vn.train(documentation="OTIF = % of orders delivered on time and in full")
vn.train(sql="SELECT user_id, SUM(amount) FROM orders GROUP BY user_id")

# 4. Ask questions in plain English
question = "What were our top 10 products by revenue last month?"
sql = vn.generate_sql(question)
df  = vn.run_sql(sql)
plotly_code = vn.generate_plotly_code(question=question, sql=sql, df=df)
fig = vn.get_plotly_figure(plotly_code=plotly_code, df=df)
Category | Supported Options | Notes
LLMs | GPT-4o, Claude, Gemini, Ollama, LLaMA, Mistral | Any OpenAI-compatible API or local model
Vector Stores | ChromaDB, Qdrant, Pinecone, Weaviate, PgVector | Bring your own or use hosted Vanna metadata
Databases | PostgreSQL, MySQL, Snowflake, BigQuery, SQLite, DuckDB, Redshift, MSSQL | Any SQL-speaking DB via connectors
Frontends | Jupyter, Streamlit, Flask, FastAPI, Slack Bot, React | <vanna-chat> web component available
Auth / Security | JWT, OAuth, Row-Level Security, SOC 2 | Vanna 2.0: user-scoped execution built-in
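The "bring your own backend" design in the table above rests on Python multiple inheritance, as in the MyVanna class earlier: one base supplies the vector store, the other the LLM, and either can be swapped independently. A sketch with hypothetical stand-in classes (the real bases are e.g. ChromaDB_VectorStore and OpenAI_Chat):

```python
# Stand-in base classes mirroring Vanna's mixin pattern.
class FakeVectorStore:
    def __init__(self, config=None):
        self.store = []          # holds training entries
    def add(self, text):
        self.store.append(text)

class FakeChat:
    def __init__(self, config=None):
        self.model = (config or {}).get("model", "default")
    def submit_prompt(self, prompt):
        return f"[{self.model}] would answer: {prompt}"

# Swap either base for another backend without touching the other.
class MyAssistant(FakeVectorStore, FakeChat):
    def __init__(self, config=None):
        FakeVectorStore.__init__(self, config=config)
        FakeChat.__init__(self, config=config)

bot = MyAssistant(config={"model": "gpt-4o"})
bot.add("CREATE TABLE orders (id INT)")
print(bot.submit_prompt("total orders?"))
```

Replacing the LLM or the store is then a one-line change to the class definition, which is what makes the framework LLM- and store-agnostic.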

Real-World Use Cases

📊
Business Intelligence
Replace waiting for analyst tickets. Non-technical stakeholders query data directly from BI dashboards using chat.
"What's our MoM revenue growth this quarter?"
"Compare conversion rates by traffic source"
🛒
E-Commerce Analytics
Ops and marketing teams self-serve product, inventory, and campaign performance data without writing SQL.
"Top 10 products by sales last month?"
"Which SKUs have low stock and high demand?"
🏥
Healthcare & Compliance
Query patient or operational data with full audit trails. Row-level security ensures users only see authorized data.
"Admissions by department this week?"
"Show overdue follow-ups in region X"
🚀
Startup / Founder Use
Solo founders and small teams get fast, self-serve access to core business metrics without hiring a data team.
"Active users last 30 days?"
"Churn rate this month vs last month"
⚙️
Developer Productivity
Accelerate SQL development cycles. Quickly prototype complex queries, then refine — ideal for data engineers and backend devs.
"Generate a query for cohort retention analysis"
"Find all orders with NULL shipping address"
🏢
Enterprise Internal Tools
Embed Vanna as an internal data assistant — deployed on your own infrastructure, with your own auth and security policies.
"OTIF score by supplier last quarter?"
"Headcount breakdown by cost center"

Pros & Cons

✓ Strengths
Open-Source & Self-Hostable
Full code is on GitHub. Run it entirely on your own infra — no vendor lock-in, full data sovereignty. Ideal for regulated industries.
LLM-Agnostic
Plug in GPT-4, Claude, Gemini, or a locally hosted Ollama model. You're not tied to any one AI provider or cost structure.
Self-Improving via Tool Memory
Every successful query pair gets saved. The model learns your schema idioms over time — accuracy compounds with usage.
Data Privacy by Design
Actual DB contents are never sent to the LLM. Only schema, docs, and query patterns leave your environment — a critical enterprise differentiator.
Rich Production UI
Vanna 2.0 ships a <vanna-chat> web component with streaming tables, charts, CSV export, dark/light themes, and mobile support out of the box.
Enterprise Security Built-In
User-scoped execution, row-level security, audit logs, rate limiting, and quota management are first-class features in v2 — not afterthoughts.
Wide Database Compatibility
Native connectors for PostgreSQL, MySQL, Snowflake, BigQuery, SQLite, DuckDB, Redshift, and more. Custom connectors possible for anything else.
✗ Weaknesses
Quality Depends on Training Data
Accuracy is only as good as the DDL, docs, and SQL pairs you've provided. Sparse training = poor or hallucinated queries on edge cases.
Steep Initial Setup for Self-Hosting
Configuring vector stores, choosing LLMs, writing training data, and securing the stack requires meaningful engineering time — not plug-and-play.
Struggles with Very Complex SQL
Multi-step CTEs, recursive queries, and deeply nested joins may not be generated accurately, especially without closely matched training examples.
Free Tier Rate Limits
The hosted free plan is capped. Teams with meaningful query volumes will need to self-host or pay — the $0.53/query hosted pricing can add up.
No GUI for Training Data Management
Adding, reviewing, or removing training data is code-first. Non-technical admins can't easily curate the knowledge base without developer help.
SQL Transparency Required
Generated SQL is shown to users — which is good for trust, but means non-technical users may be confused or intimidated by query output.
Hallucination Risk on Unfamiliar Schemas
Without sufficient schema context, the LLM may invent table or column names. Always validate SQL before exposing results to production users.
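One cheap guardrail against hallucinated identifiers is to check every table a generated query references against the schema before executing it. A naive sketch using a regex (a production version should use a real SQL parser such as sqlglot; the table allowlist here is hypothetical):

```python
import re

KNOWN_TABLES = {"orders", "users", "products"}  # derived from your actual schema

def referenced_tables(sql):
    # Naive extraction: identifiers following FROM or JOIN.
    # A real implementation should parse the SQL instead.
    return {m.group(2).lower()
            for m in re.finditer(r"\b(FROM|JOIN)\s+([A-Za-z_]\w*)", sql, re.IGNORECASE)}

def validate(sql):
    # Refuse to run SQL that mentions tables the schema doesn't have.
    unknown = referenced_tables(sql) - KNOWN_TABLES
    if unknown:
        raise ValueError(f"Hallucinated table(s): {sorted(unknown)}")
    return sql

validate("SELECT * FROM orders JOIN users ON users.id = orders.user_id")  # passes
```

Running this check between generate_sql and run_sql turns a silent wrong answer into a loud, catchable error.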

What You'll Notice in Practice

⚡ Behavior
The "Cold Start" Problem is Real
Out of the box, Vanna with just a DDL schema produces mediocre results. The first 50–100 question/SQL training pairs make the biggest accuracy jump. Think of it like onboarding a new analyst: it's an investment, not instant magic.
⚠️ Watch Out
Ambiguous Business Terms = Wrong SQL
"Revenue" means different things across teams. If you don't define it in vn.train(documentation=...), the model will guess — and guess differently each time. Invest in a business glossary as training data.
💡 Insight
RAG Beats Fine-Tuning for SQL
The retrieval-first approach means Vanna can adapt to schema changes without retraining — just add new DDL. This is a key advantage over fine-tuned models where schema drift requires expensive re-runs.
✅ Best Practice
Train with "Negative" Examples Too
Beyond correct SQL, document common misinterpretations. Adding documentation like "do NOT include test users (role='test') in user counts" dramatically reduces the most frustrating recurring mistakes.
⚠️ Watch Out
LLM Choice Has a Major Impact
GPT-4o and Claude consistently outperform GPT-3.5 for complex schema understanding. For local deployment, Mistral 7B handles simple queries well, but struggles with JOINs across 5+ tables. Budget accordingly.
⚡ Behavior
Every Query Exports a CSV
One underrated UX feature: every query result automatically generates a downloadable CSV. Business users who don't understand the SQL still get tangible, exportable output they can use in Excel — reducing friction significantly.
💡 Insight
Vanna 2.0 Is a Framework Shift
The jump from v1 to Vanna 2.0 wasn't just features — it's an architectural rewrite. V2 is now an agent framework with Tool Memory, lifecycle hooks, and multi-user security. It's production-grade in ways v1 never was.
✅ Best Practice
Start in Jupyter, Ship with FastAPI
The recommended workflow: prototype in a Jupyter notebook (tight feedback loop for training and testing), then graduate to a FastAPI/Streamlit app with the <vanna-chat> web component for business users.
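The "graduation" step mostly amounts to wrapping the generate/run calls in a single handler. A framework-free sketch (the lambdas are hypothetical stand-ins for vn.generate_sql and vn.run_sql; a FastAPI or Flask route would simply call the returned handler):

```python
def make_ask_handler(generate_sql, run_sql):
    # Inject the generate/run callables so the handler is testable
    # without a live LLM or database.
    def handle(question):
        sql = generate_sql(question)
        rows = run_sql(sql)
        return {"question": question, "sql": sql, "rows": rows}
    return handle

# Stand-ins while prototyping; swap in vn.generate_sql / vn.run_sql later.
handle = make_ask_handler(
    generate_sql=lambda q: "SELECT COUNT(*) FROM users",
    run_sql=lambda sql: [[42]],
)
print(handle("How many users do we have?"))
```

Because the handler takes its dependencies as arguments, the same code runs unchanged in a notebook, a unit test, and a web endpoint.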

Vanna AI is one of the most compelling Text-to-SQL solutions in production today — not because it's magic, but because it's honest about being a learnable system. The RAG architecture means it improves with investment, security is taken seriously from day one, and the open-source model means you're never locked in. The real tradeoff: it rewards teams willing to invest in training data, and punishes those who don't.

Overall Rating: ★ 4.5
GitHub Stars: 12k+
Model Support: Any LLM
Current Version: v2.0
Entry Point: Free Tier