
Building a Text-to-SQL Agent with GPT-4: Schema-Aware Query Generation

Build a complete text-to-SQL agent using GPT-4 that extracts database schemas, generates SQL queries from natural language, executes them safely, and formats results for end users.

Why Build an Agent, Not Just a Prompt?

A simple text-to-SQL prompt generates a query and stops. An agent goes further: it extracts the schema automatically, generates the query, executes it, handles errors, retries with corrections, and formats the final answer. This agentic loop is what makes the difference between a demo and a production tool.

In this tutorial, you will build a complete text-to-SQL agent using GPT-4 that works against any PostgreSQL database.

Step 1: Schema Extraction

The first component extracts a structured schema representation from your database. This gives the LLM the context it needs to write accurate queries.

import psycopg2
from dataclasses import dataclass

@dataclass
class ColumnInfo:
    name: str
    data_type: str
    is_nullable: bool
    is_primary_key: bool

@dataclass
class TableInfo:
    name: str
    columns: list[ColumnInfo]
    foreign_keys: list[str]

def extract_schema(conn_string: str) -> list[TableInfo]:
    """Extract full schema metadata from a PostgreSQL database."""
    conn = psycopg2.connect(conn_string)
    cur = conn.cursor()

    # Get all user tables
    cur.execute("""
        SELECT table_name FROM information_schema.tables
        WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
    """)
    tables = []

    for (table_name,) in cur.fetchall():
        # Get columns
        cur.execute("""
            SELECT column_name, data_type, is_nullable
            FROM information_schema.columns
            WHERE table_schema = 'public' AND table_name = %s
            ORDER BY ordinal_position
        """, (table_name,))
        columns = [
            ColumnInfo(name=row[0], data_type=row[1],
                       is_nullable=row[2] == "YES", is_primary_key=False)
            for row in cur.fetchall()
        ]

        # Get primary keys
        cur.execute("""
            SELECT kcu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
            WHERE tc.table_name = %s AND tc.constraint_type = 'PRIMARY KEY'
        """, (table_name,))
        pk_columns = {row[0] for row in cur.fetchall()}
        for col in columns:
            col.is_primary_key = col.name in pk_columns

        # Get foreign keys
        cur.execute("""
            SELECT kcu.column_name, ccu.table_name, ccu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
            JOIN information_schema.constraint_column_usage ccu
              ON tc.constraint_name = ccu.constraint_name
            WHERE tc.table_name = %s AND tc.constraint_type = 'FOREIGN KEY'
        """, (table_name,))
        fks = [f"{r[0]} -> {r[1]}.{r[2]}" for r in cur.fetchall()]

        tables.append(TableInfo(name=table_name, columns=columns, foreign_keys=fks))

    cur.close()
    conn.close()
    return tables

Step 2: Schema Formatting for the Prompt

The schema needs to be formatted in a way the LLM can parse efficiently. CREATE TABLE syntax works best because LLMs have seen millions of examples during training.

def format_schema(tables: list[TableInfo]) -> str:
    """Format schema as CREATE TABLE statements."""
    output = []
    for table in tables:
        cols = []
        for c in table.columns:
            parts = [f"  {c.name} {c.data_type}"]
            if c.is_primary_key:
                parts.append("PRIMARY KEY")
            if not c.is_nullable:
                parts.append("NOT NULL")
            cols.append(" ".join(parts))
        create = f"CREATE TABLE {table.name} (\n"
        create += ",\n".join(cols)
        if table.foreign_keys:
            for fk in table.foreign_keys:
                col, ref = fk.split(" -> ")
                create += f",\n  FOREIGN KEY ({col}) REFERENCES {ref}"
        create += "\n);"
        output.append(create)
    return "\n\n".join(output)
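To see what the model actually receives, here is a self-contained sketch that redefines the two dataclasses and a condensed single-table formatter, then renders the DDL for a hypothetical two-column `users` table (the table and column names are illustrative, not from a real database):

```python
from dataclasses import dataclass

# Minimal re-declarations so this snippet runs standalone.
@dataclass
class ColumnInfo:
    name: str
    data_type: str
    is_nullable: bool
    is_primary_key: bool

@dataclass
class TableInfo:
    name: str
    columns: list
    foreign_keys: list

def format_table(table: TableInfo) -> str:
    """Render one table as a CREATE TABLE statement."""
    lines = []
    for c in table.columns:
        parts = [f"  {c.name} {c.data_type}"]
        if c.is_primary_key:
            parts.append("PRIMARY KEY")
        if not c.is_nullable:
            parts.append("NOT NULL")
        lines.append(" ".join(parts))
    for fk in table.foreign_keys:
        col, ref = fk.split(" -> ")
        lines.append(f"  FOREIGN KEY ({col}) REFERENCES {ref}")
    return f"CREATE TABLE {table.name} (\n" + ",\n".join(lines) + "\n);"

users = TableInfo(
    name="users",
    columns=[
        ColumnInfo("id", "integer", False, True),
        ColumnInfo("email", "text", False, False),
    ],
    foreign_keys=[],
)
print(format_table(users))
```

The output is plain DDL, exactly the kind of text the model has seen countless times in pretraining, which is the point of choosing this format over JSON.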

Step 3: The Query Generation Agent

Now combine schema extraction with GPT-4 to create a full agent that generates, executes, and retries queries.


import openai
import json

class TextToSQLAgent:
    def __init__(self, conn_string: str):
        self.conn_string = conn_string
        self.client = openai.OpenAI()
        self.schema = format_schema(extract_schema(conn_string))
        self.max_retries = 3

    def generate_sql(self, question: str, error_context: str = "") -> str:
        messages = [
            {
                "role": "system",
                "content": f"""You are a PostgreSQL expert. Convert the user's
question into a valid SQL query using this schema:

{self.schema}

Rules:
- Return ONLY the SQL query
- Use PostgreSQL syntax
- Always use table aliases for JOINs
- Limit results to 100 rows unless specified otherwise
{f"Previous error: {error_context}" if error_context else ""}""",
            },
            {"role": "user", "content": question},
        ]
        response = self.client.chat.completions.create(
            model="gpt-4o", messages=messages, temperature=0
        )
        sql = response.choices[0].message.content.strip()
        # Strip markdown code fences (```sql ... ```) if present
        if sql.startswith("```"):
            sql = sql.split("\n", 1)[1].rsplit("```", 1)[0].strip()
        return sql

    def execute(self, sql: str) -> list[dict]:
        conn = psycopg2.connect(self.conn_string)
        try:
            cur = conn.cursor()
            cur.execute(sql)
            columns = [desc[0] for desc in cur.description]
            return [dict(zip(columns, row)) for row in cur.fetchall()]
        finally:
            # Close even when the query fails, so retries don't leak connections
            conn.close()

    def ask(self, question: str) -> dict:
        error_context = ""
        for attempt in range(self.max_retries):
            sql = self.generate_sql(question, error_context)
            try:
                results = self.execute(sql)
                return {"sql": sql, "results": results, "attempts": attempt + 1}
            except Exception as e:
                error_context = f"Query: {sql}\nError: {str(e)}"
        return {"sql": sql, "error": error_context, "attempts": self.max_retries}
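As written, `execute` runs whatever SQL the model returns. Before using this in production you would typically add a guard that rejects anything other than a single read-only statement. Below is a minimal sketch of such a guard; it is a coarse string-level heuristic (a word like `update` used as an alias would be rejected too), not a substitute for connecting with a read-only database role:

```python
# Keywords that indicate a statement writes to or alters the database.
FORBIDDEN = {"insert", "update", "delete", "drop", "alter",
             "truncate", "create", "grant", "revoke"}

def is_safe_select(sql: str) -> bool:
    """Heuristic check that a query is a single read-only SELECT.

    A coarse filter only; pair it with a read-only DB role in production.
    """
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # more than one statement
        return False
    first_word = stripped.split(None, 1)[0].lower() if stripped else ""
    if first_word not in ("select", "with"):
        return False
    tokens = {t.lower() for t in stripped.replace("(", " ").split()}
    return not (tokens & FORBIDDEN)
```

Wiring it in is one line at the top of `execute`: raise an exception when `is_safe_select(sql)` is false, and the existing retry loop will surface it like any other error.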

Step 4: Using the Agent

agent = TextToSQLAgent("postgresql://user:pass@localhost/sales_db")

answer = agent.ask("Which salesperson had the highest revenue last quarter?")
print(f"SQL: {answer['sql']}")
if "error" in answer:
    print(f"Failed after {answer['attempts']} attempts:\n{answer['error']}")
else:
    # default=str handles Decimal and date values returned by psycopg2
    print(f"Results: {json.dumps(answer['results'], indent=2, default=str)}")
    print(f"Resolved in {answer['attempts']} attempt(s)")

The retry mechanism is critical. In practice, roughly 10-15% of first-attempt queries contain minor errors that the LLM can self-correct when given the error message as context.
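The generate-execute-retry shape is easy to exercise in isolation. The sketch below simulates it with stub functions in place of GPT-4 and the database: the fake "generator" produces a typo'd query first and a corrected one once it sees an error, which is roughly what happens when the model self-corrects. All names here are illustrative:

```python
def run_with_retries(generate, execute, question, max_retries=3):
    """Generic generate -> execute -> feed-error-back loop."""
    error_context = ""
    sql = ""
    for attempt in range(1, max_retries + 1):
        sql = generate(question, error_context)
        try:
            return {"sql": sql, "results": execute(sql), "attempts": attempt}
        except Exception as e:
            error_context = f"Query: {sql}\nError: {e}"
    return {"sql": sql, "error": error_context, "attempts": max_retries}

# Stub "LLM": emits a misspelled column until it sees the error message.
def fake_generate(question, error_context):
    if not error_context:
        return "SELECT revnue FROM sales"
    return "SELECT revenue FROM sales"

# Stub "database": rejects the misspelled column, answers the fixed query.
def fake_execute(sql):
    if "revnue" in sql:
        raise RuntimeError('column "revnue" does not exist')
    return [{"revenue": 42}]

result = run_with_retries(fake_generate, fake_execute, "total revenue?")
```

Running this resolves on the second attempt, mirroring the single-retry recovery that dominates in practice.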

FAQ

Why use CREATE TABLE format instead of JSON for schema context?

LLMs have been trained on vastly more SQL DDL than structured JSON schema descriptions. Using CREATE TABLE statements consistently produces higher accuracy because the model can pattern-match directly against its training data; published text-to-SQL benchmarks report accuracy improvements of roughly 3-5% from the DDL format.

How do I handle very large schemas with hundreds of tables?

For large schemas, use a two-stage approach: first ask the LLM to identify which tables are relevant to the question, then provide only those tables in the generation prompt. This keeps the context window manageable and improves accuracy by reducing noise.
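The first stage can be kept simple: one helper builds the table-selection prompt, another parses the model's comma-separated reply and filters out anything that isn't a real table. The sketch below stays offline; in production the prompt would go through the same chat-completions call used above, and both `all_tables` and the reply string here are hypothetical:

```python
def build_selection_prompt(question: str, table_names: list) -> str:
    """Stage 1 prompt: ask the model to pick only the relevant tables."""
    return (
        "Given these tables:\n"
        + "\n".join(f"- {t}" for t in table_names)
        + f"\n\nWhich tables are needed to answer: {question!r}?\n"
        "Reply with a comma-separated list of table names only."
    )

def parse_selection(reply: str, table_names: list) -> list:
    """Keep only names that actually exist in the schema."""
    picked = [t.strip() for t in reply.split(",")]
    valid = set(table_names)
    return [t for t in picked if t in valid]

all_tables = ["users", "orders", "products", "invoices", "audit_log"]
prompt = build_selection_prompt("How many orders did each user place?", all_tables)
# A hypothetical model reply, including one hallucinated table to filter out:
relevant = parse_selection("users, orders, customers", all_tables)
```

Stage 2 then formats DDL only for `relevant` and proceeds exactly as in Step 3, with a far smaller prompt.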

Should I use GPT-4 or GPT-4o for text-to-SQL?

GPT-4o is recommended for most use cases. It offers comparable SQL generation accuracy to GPT-4 at significantly lower cost and latency. For extremely complex queries involving multiple CTEs or window functions, GPT-4 may produce slightly better results, but the difference is usually within 2-3%.


#TextToSQL #GPT4 #SQLAgent #SchemaExtraction #OpenAI #AgenticAI #DatabaseAutomation #PythonSQL

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
