
Building a Text-to-SQL Agent with GPT-4: Schema-Aware Query Generation

Build a complete text-to-SQL agent using GPT-4 that extracts database schemas, generates SQL queries from natural language, executes them safely, and formats results for end users.

Why Build an Agent, Not Just a Prompt?

A simple text-to-SQL prompt generates a query and stops. An agent goes further: it extracts the schema automatically, generates the query, executes it, handles errors, retries with corrections, and formats the final answer. This agentic loop is what makes the difference between a demo and a production tool.

In this tutorial, you will build a complete text-to-SQL agent using GPT-4 that works against any PostgreSQL database.

Step 1: Schema Extraction

The first component extracts a structured schema representation from your database. This gives the LLM the context it needs to write accurate queries.

import psycopg2
from dataclasses import dataclass

@dataclass
class ColumnInfo:
    name: str
    data_type: str
    is_nullable: bool
    is_primary_key: bool

@dataclass
class TableInfo:
    name: str
    columns: list[ColumnInfo]
    foreign_keys: list[str]

def extract_schema(conn_string: str) -> list[TableInfo]:
    """Extract full schema metadata from a PostgreSQL database."""
    conn = psycopg2.connect(conn_string)
    cur = conn.cursor()

    # Get all user tables
    cur.execute("""
        SELECT table_name FROM information_schema.tables
        WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
    """)
    tables = []

    for (table_name,) in cur.fetchall():
        # Get columns
        cur.execute("""
            SELECT column_name, data_type, is_nullable
            FROM information_schema.columns
            WHERE table_schema = 'public' AND table_name = %s
            ORDER BY ordinal_position
        """, (table_name,))
        columns = [
            ColumnInfo(name=row[0], data_type=row[1],
                       is_nullable=row[2] == "YES", is_primary_key=False)
            for row in cur.fetchall()
        ]

        # Get primary keys
        cur.execute("""
            SELECT kcu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
            WHERE tc.table_name = %s AND tc.constraint_type = 'PRIMARY KEY'
        """, (table_name,))
        pk_columns = {row[0] for row in cur.fetchall()}
        for col in columns:
            col.is_primary_key = col.name in pk_columns

        # Get foreign keys
        cur.execute("""
            SELECT kcu.column_name, ccu.table_name, ccu.column_name
            FROM information_schema.table_constraints tc
            JOIN information_schema.key_column_usage kcu
              ON tc.constraint_name = kcu.constraint_name
            JOIN information_schema.constraint_column_usage ccu
              ON tc.constraint_name = ccu.constraint_name
            WHERE tc.table_name = %s AND tc.constraint_type = 'FOREIGN KEY'
        """, (table_name,))
        fks = [f"{r[0]} -> {r[1]}.{r[2]}" for r in cur.fetchall()]

        tables.append(TableInfo(name=table_name, columns=columns, foreign_keys=fks))

    cur.close()
    conn.close()
    return tables

Step 2: Schema Formatting for the Prompt

The schema needs to be formatted in a way the LLM can parse efficiently. CREATE TABLE syntax works best because LLMs have seen millions of examples during training.

def format_schema(tables: list[TableInfo]) -> str:
    """Format schema as CREATE TABLE statements."""
    output = []
    for table in tables:
        cols = []
        for c in table.columns:
            parts = [f"  {c.name} {c.data_type}"]
            if c.is_primary_key:
                parts.append("PRIMARY KEY")
            if not c.is_nullable:
                parts.append("NOT NULL")
            cols.append(" ".join(parts))
        create = f"CREATE TABLE {table.name} (\n"
        create += ",\n".join(cols)
        if table.foreign_keys:
            for fk in table.foreign_keys:
                col, ref = fk.split(" -> ")
                create += f",\n  FOREIGN KEY ({col}) REFERENCES {ref}"
        create += "\n);"
        output.append(create)
    return "\n\n".join(output)
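To see what the model actually receives, here is a self-contained sketch that redefines the two dataclasses and a condensed single-table formatter, then renders the DDL for a hypothetical two-column `users` table (the table and column names are illustrative, not from a real database):

```python
from dataclasses import dataclass

# Minimal re-declarations so this snippet runs standalone.
@dataclass
class ColumnInfo:
    name: str
    data_type: str
    is_nullable: bool
    is_primary_key: bool

@dataclass
class TableInfo:
    name: str
    columns: list
    foreign_keys: list

def format_table(table: TableInfo) -> str:
    """Render one table as a CREATE TABLE statement."""
    lines = []
    for c in table.columns:
        parts = [f"  {c.name} {c.data_type}"]
        if c.is_primary_key:
            parts.append("PRIMARY KEY")
        if not c.is_nullable:
            parts.append("NOT NULL")
        lines.append(" ".join(parts))
    for fk in table.foreign_keys:
        col, ref = fk.split(" -> ")
        lines.append(f"  FOREIGN KEY ({col}) REFERENCES {ref}")
    return f"CREATE TABLE {table.name} (\n" + ",\n".join(lines) + "\n);"

users = TableInfo(
    name="users",
    columns=[
        ColumnInfo("id", "integer", False, True),
        ColumnInfo("email", "text", False, False),
    ],
    foreign_keys=[],
)
print(format_table(users))
```

The output is plain DDL, exactly the kind of text the model has seen countless times in pretraining, which is the point of choosing this format over JSON.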

Step 3: The Query Generation Agent

Now combine schema extraction with GPT-4 to create a full agent that generates, executes, and retries queries.


import openai
import json

class TextToSQLAgent:
    def __init__(self, conn_string: str):
        self.conn_string = conn_string
        self.client = openai.OpenAI()
        self.schema = format_schema(extract_schema(conn_string))
        self.max_retries = 3

    def generate_sql(self, question: str, error_context: str = "") -> str:
        messages = [
            {
                "role": "system",
                "content": f"""You are a PostgreSQL expert. Convert the user's
question into a valid SQL query using this schema:

{self.schema}

Rules:
- Return ONLY the SQL query
- Use PostgreSQL syntax
- Always use table aliases for JOINs
- Limit results to 100 rows unless specified otherwise
{f"Previous error: {error_context}" if error_context else ""}""",
            },
            {"role": "user", "content": question},
        ]
        response = self.client.chat.completions.create(
            model="gpt-4o", messages=messages, temperature=0
        )
        sql = response.choices[0].message.content.strip()
        # Strip markdown code fences (```sql ... ```) if present
        if sql.startswith("```"):
            sql = sql.split("\n", 1)[1].rsplit("```", 1)[0].strip()
        return sql

    def execute(self, sql: str) -> list[dict]:
        conn = psycopg2.connect(self.conn_string)
        try:
            cur = conn.cursor()
            cur.execute(sql)
            columns = [desc[0] for desc in cur.description]
            return [dict(zip(columns, row)) for row in cur.fetchall()]
        finally:
            # Close even when the query fails, so retries don't leak connections
            conn.close()

    def ask(self, question: str) -> dict:
        error_context = ""
        for attempt in range(self.max_retries):
            sql = self.generate_sql(question, error_context)
            try:
                results = self.execute(sql)
                return {"sql": sql, "results": results, "attempts": attempt + 1}
            except Exception as e:
                error_context = f"Query: {sql}\nError: {str(e)}"
        return {"sql": sql, "error": error_context, "attempts": self.max_retries}
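As written, `execute` runs whatever SQL the model returns. Before using this in production you would typically add a guard that rejects anything other than a single read-only statement. Below is a minimal sketch of such a guard; it is a coarse string-level heuristic (a word like `update` used as an alias would be rejected too), not a substitute for connecting with a read-only database role:

```python
# Keywords that indicate a statement writes to or alters the database.
FORBIDDEN = {"insert", "update", "delete", "drop", "alter",
             "truncate", "create", "grant", "revoke"}

def is_safe_select(sql: str) -> bool:
    """Heuristic check that a query is a single read-only SELECT.

    A coarse filter only; pair it with a read-only DB role in production.
    """
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # more than one statement
        return False
    first_word = stripped.split(None, 1)[0].lower() if stripped else ""
    if first_word not in ("select", "with"):
        return False
    tokens = {t.lower() for t in stripped.replace("(", " ").split()}
    return not (tokens & FORBIDDEN)
```

Wiring it in is one line at the top of `execute`: raise an exception when `is_safe_select(sql)` is false, and the existing retry loop will surface it like any other error.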

Step 4: Using the Agent

agent = TextToSQLAgent("postgresql://user:pass@localhost/sales_db")

answer = agent.ask("Which salesperson had the highest revenue last quarter?")
print(f"SQL: {answer['sql']}")
if "error" in answer:
    print(f"Failed after {answer['attempts']} attempts:\n{answer['error']}")
else:
    # default=str handles Decimal and date values returned by psycopg2
    print(f"Results: {json.dumps(answer['results'], indent=2, default=str)}")
    print(f"Resolved in {answer['attempts']} attempt(s)")

The retry mechanism is critical. In practice, roughly 10-15% of first-attempt queries contain minor errors that the LLM can self-correct when given the error message as context.
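The generate-execute-retry shape is easy to exercise in isolation. The sketch below simulates it with stub functions in place of GPT-4 and the database: the fake "generator" produces a typo'd query first and a corrected one once it sees an error, which is roughly what happens when the model self-corrects. All names here are illustrative:

```python
def run_with_retries(generate, execute, question, max_retries=3):
    """Generic generate -> execute -> feed-error-back loop."""
    error_context = ""
    sql = ""
    for attempt in range(1, max_retries + 1):
        sql = generate(question, error_context)
        try:
            return {"sql": sql, "results": execute(sql), "attempts": attempt}
        except Exception as e:
            error_context = f"Query: {sql}\nError: {e}"
    return {"sql": sql, "error": error_context, "attempts": max_retries}

# Stub "LLM": emits a misspelled column until it sees the error message.
def fake_generate(question, error_context):
    if not error_context:
        return "SELECT revnue FROM sales"
    return "SELECT revenue FROM sales"

# Stub "database": rejects the misspelled column, answers the fixed query.
def fake_execute(sql):
    if "revnue" in sql:
        raise RuntimeError('column "revnue" does not exist')
    return [{"revenue": 42}]

result = run_with_retries(fake_generate, fake_execute, "total revenue?")
```

Running this resolves on the second attempt, mirroring the single-retry recovery that dominates in practice.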

FAQ

Why use CREATE TABLE format instead of JSON for schema context?

LLMs have been trained on vastly more SQL DDL than structured JSON schema descriptions. Using CREATE TABLE statements consistently produces higher accuracy because the model can pattern-match directly against its training data; published text-to-SQL benchmarks report accuracy improvements of roughly 3-5% from the DDL format.

How do I handle very large schemas with hundreds of tables?

For large schemas, use a two-stage approach: first ask the LLM to identify which tables are relevant to the question, then provide only those tables in the generation prompt. This keeps the context window manageable and improves accuracy by reducing noise.
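The first stage can be kept simple: one helper builds the table-selection prompt, another parses the model's comma-separated reply and filters out anything that isn't a real table. The sketch below stays offline; in production the prompt would go through the same chat-completions call used above, and both `all_tables` and the reply string here are hypothetical:

```python
def build_selection_prompt(question: str, table_names: list) -> str:
    """Stage 1 prompt: ask the model to pick only the relevant tables."""
    return (
        "Given these tables:\n"
        + "\n".join(f"- {t}" for t in table_names)
        + f"\n\nWhich tables are needed to answer: {question!r}?\n"
        "Reply with a comma-separated list of table names only."
    )

def parse_selection(reply: str, table_names: list) -> list:
    """Keep only names that actually exist in the schema."""
    picked = [t.strip() for t in reply.split(",")]
    valid = set(table_names)
    return [t for t in picked if t in valid]

all_tables = ["users", "orders", "products", "invoices", "audit_log"]
prompt = build_selection_prompt("How many orders did each user place?", all_tables)
# A hypothetical model reply, including one hallucinated table to filter out:
relevant = parse_selection("users, orders, customers", all_tables)
```

Stage 2 then formats DDL only for `relevant` and proceeds exactly as in Step 3, with a far smaller prompt.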

Should I use GPT-4 or GPT-4o for text-to-SQL?

GPT-4o is recommended for most use cases. It offers comparable SQL generation accuracy to GPT-4 at significantly lower cost and latency. For extremely complex queries involving multiple CTEs or window functions, GPT-4 may produce slightly better results, but the difference is usually within 2-3%.


#TextToSQL #GPT4 #SQLAgent #SchemaExtraction #OpenAI #AgenticAI #DatabaseAutomation #PythonSQL

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
