---
title: "Text-to-SQL Error Correction: Self-Healing Queries That Fix Their Own Mistakes"
description: "Build a text-to-SQL system with automatic error detection, self-correction loops, and intelligent retry strategies that fix query mistakes without user intervention."
canonical: https://callsphere.ai/blog/text-to-sql-error-correction-self-healing-queries
category: "Learn Agentic AI"
tags: ["Error Correction", "Self-Healing", "Text-to-SQL", "Query Retry", "LLM Agents"]
author: "CallSphere Team"
published: 2026-03-18T00:00:00.000Z
updated: 2026-05-08T12:41:07.590Z
---

# Text-to-SQL Error Correction: Self-Healing Queries That Fix Their Own Mistakes

> Build a text-to-SQL system with automatic error detection, self-correction loops, and intelligent retry strategies that fix query mistakes without user intervention.

## Why Error Correction Matters

Even the best text-to-SQL systems produce invalid queries 10-20% of the time. Column name mismatches, syntax errors, incorrect data type comparisons, and ambiguous GROUP BY clauses are common failure modes. Without error correction, these failures become dead ends for the user.

A self-healing query system catches execution errors, feeds them back to the LLM with the original context, and asks for a corrected version. This simple loop recovers from 60-80% of first-attempt failures, pushing overall system accuracy from 85% to 95%+.

## Error Categories in AI-Generated SQL

Understanding error types helps you build targeted correction strategies:

```mermaid
flowchart TD
    Q(["User question"])
    GEN["LLM generates SQL"]
    EXEC{"Query
executes?"}
    SYN["Syntax error
clear DB message"]
    SCHEMA["Schema error
bad table/column"]
    TYPE["Type error
error or silent"]
    LOGIC["Logic error
runs, wrong answer"]
    FIX["Feed error back
to the LLM"]
    OK(["Results returned"])
    Q --> GEN
    GEN --> EXEC
    EXEC -->|No| SYN
    EXEC -->|No| SCHEMA
    EXEC -->|Sometimes| TYPE
    EXEC -->|Yes, but wrong| LOGIC
    SYN --> FIX
    SCHEMA --> FIX
    TYPE --> FIX
    LOGIC --> FIX
    FIX --> GEN
    EXEC -->|Yes| OK
    style EXEC fill:#f59e0b,stroke:#d97706,color:#1f2937
    style FIX fill:#4f46e5,stroke:#4338ca,color:#fff
    style LOGIC fill:#dc2626,stroke:#b91c1c,color:#fff
    style OK fill:#059669,stroke:#047857,color:#fff
```

**Syntax errors** — Missing parentheses, incorrect keyword order, invalid function names. These produce immediate database errors with clear messages.

**Schema errors** — Referencing non-existent tables or columns. The error message includes the problematic identifier, making correction straightforward.

**Type errors** — Comparing incompatible types, like `WHERE date_column = 42`. These may produce errors or silently return wrong results.

**Logic errors** — Valid SQL that returns incorrect results. The query runs successfully but answers the wrong question. These are the hardest to detect.
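The first three categories can usually be recognized mechanically from the database's error message, which lets the correction prompt name the failure mode explicitly. A minimal classifier sketch — the regex patterns and category labels are illustrative, not a standard taxonomy:

```python
import re

# Illustrative mapping from PostgreSQL-style error messages to categories.
ERROR_PATTERNS = [
    (r"syntax error", "syntax"),
    (r'(relation|column) "[^"]+" does not exist', "schema"),
    (r"(invalid input syntax|operator does not exist)", "type"),
]

def classify_sql_error(message: str) -> str:
    """Map a database error message to a rough error category."""
    for pattern, category in ERROR_PATTERNS:
        if re.search(pattern, message, re.IGNORECASE):
            return category
    # Logic errors never raise, so anything unmatched is simply "unknown".
    return "unknown"
```

Tagging each failed attempt with a category also makes your correction logs easier to aggregate later.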

## Building the Self-Correction Loop

```python
import openai
import psycopg2
from dataclasses import dataclass
from typing import Optional

@dataclass
class QueryAttempt:
    sql: str
    error: Optional[str]
    results: Optional[list[dict]]

class SelfHealingSQLAgent:
    def __init__(self, conn_string: str, schema: str):
        self.conn_string = conn_string
        self.schema = schema
        self.client = openai.OpenAI()
        self.max_retries = 3

    def _generate(self, question: str,
                  previous_attempts: list[QueryAttempt]) -> str:
        """Generate SQL, incorporating error context from previous attempts."""
        system_msg = f"""You are a PostgreSQL expert. Generate a SQL query for
the user's question using this schema:

{self.schema}

Return ONLY the SQL query."""

        # Build error context from previous attempts
        if previous_attempts:
            error_context = "\n\nPrevious attempts that failed:\n"
            for i, attempt in enumerate(previous_attempts, 1):
                error_context += f"\nAttempt {i}:\n"
                error_context += f"SQL: {attempt.sql}\n"
                error_context += f"Error: {attempt.error}\n"
            error_context += "\nFix the issues and generate a corrected query."
            system_msg += error_context

        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_msg},
                {"role": "user", "content": question},
            ],
            temperature=0,
        )
        return response.choices[0].message.content.strip()

    def _execute(self, sql: str) -> QueryAttempt:
        """Execute SQL and return the attempt result."""
        conn = None
        try:
            conn = psycopg2.connect(self.conn_string)
            with conn.cursor() as cur:
                cur.execute(sql)
                columns = [desc[0] for desc in cur.description]
                rows = [dict(zip(columns, row)) for row in cur.fetchall()]
            return QueryAttempt(sql=sql, error=None, results=rows)
        except Exception as e:
            return QueryAttempt(sql=sql, error=str(e), results=None)
        finally:
            # Close the connection even when execution raises,
            # so failed attempts don't leak connections.
            if conn is not None:
                conn.close()

    def ask(self, question: str) -> dict:
        """Ask a question with automatic error correction."""
        attempts = []

        for retry in range(self.max_retries):
            sql = self._generate(question, attempts)
            attempt = self._execute(sql)
            attempts.append(attempt)

            if attempt.error is None:
                return {
                    "sql": sql,
                    "results": attempt.results,
                    "total_attempts": retry + 1,
                    "corrections": [a.error for a in attempts[:-1]],
                }

        # All retries exhausted
        return {
            "error": "Failed after all retry attempts",
            "attempts": [
                {"sql": a.sql, "error": a.error} for a in attempts
            ],
        }
```
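One practical wrinkle: despite the "Return ONLY the SQL query" instruction, models frequently wrap their answer in a markdown fence, which the database then rejects as a syntax error. A small normalizer (illustrative) applied to `_generate`'s output keeps `_execute` from choking on the backticks:

```python
import re

def strip_sql_fences(text: str) -> str:
    """Remove a surrounding ```sql ... ``` fence, if the model added one."""
    text = text.strip()
    match = re.match(r"^```(?:sql)?\s*\n(.*?)\n?```$", text,
                     re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else text
```

Normalizing here is cheaper than burning a retry on a failure the loop could have avoided entirely.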

## Detecting Logic Errors with Result Validation

Syntax and schema errors produce exceptions. Logic errors are harder because the query succeeds but returns wrong data. Use heuristic checks to catch common logic errors:

```python
class ResultValidator:
    """Detect potential logic errors in query results."""

    def validate(self, question: str, sql: str,
                 results: list[dict]) -> tuple[bool, str]:
        checks = [
            self._check_empty_results,
            self._check_unreasonable_counts,
            self._check_null_heavy_results,
            self._check_aggregation_mismatch,
        ]
        for check in checks:
            is_valid, reason = check(question, sql, results)
            if not is_valid:
                return False, reason
        return True, "Results look reasonable"

    def _check_empty_results(self, question: str, sql: str,
                             results: list[dict]) -> tuple[bool, str]:
        if not results:
            # Empty results might be correct, but flag for review
            return False, ("Query returned 0 rows. This may indicate an "
                          "overly restrictive WHERE clause or a wrong table.")
        return True, "OK"

    def _check_unreasonable_counts(self, question: str, sql: str,
                                    results: list[dict]) -> tuple[bool, str]:
        upper_q = question.upper()
        if any(word in upper_q for word in ["HOW MANY", "COUNT", "TOTAL"]):
            # For count questions, check if result is a single row
            if len(results) > 1 and not any(
                "count" in k.lower() for k in results[0]
            ):
                return False, ("Question asks for a count but query returned "
                              "multiple rows without aggregation.")
        return True, "OK"

    def _check_null_heavy_results(self, question: str, sql: str,
                                   results: list[dict]) -> tuple[bool, str]:
        if results:
            null_count = sum(
                1 for row in results
                for v in row.values() if v is None
            )
            total_values = len(results) * len(results[0])
            if total_values > 0 and null_count / total_values > 0.5:
                return False, ("Over 50% of result values are NULL, "
                              "suggesting a wrong JOIN or missing data.")
        return True, "OK"

    def _check_aggregation_mismatch(self, question: str, sql: str,
                                     results: list[dict]) -> tuple[bool, str]:
        upper_q = question.upper()
        has_group_word = any(w in upper_q for w in ["EACH", "PER", "BY", "EVERY"])
        if has_group_word and len(results) == 1:
            return False, ("Question implies per-group results but the query "
                          "returned a single row. A GROUP BY may be missing.")
        return True, "OK"


class ValidatingSQLAgent(SelfHealingSQLAgent):
    """Self-healing agent that also retries on suspected logic errors."""

    def __init__(self, conn_string: str, schema: str):
        super().__init__(conn_string, schema)
        self.result_validator = ResultValidator()

    def ask(self, question: str) -> dict:
        attempts = []

        for retry in range(self.max_retries):
            sql = self._generate(question, attempts)
            attempt = self._execute(sql)

            if attempt.error:
                attempts.append(attempt)
                continue

            # Validate results for logic errors
            is_valid, reason = self.result_validator.validate(
                question, sql, attempt.results
            )
            if not is_valid:
                attempt.error = f"Result validation: {reason}"
                attempts.append(attempt)
                continue

            attempts.append(attempt)
            return {
                "sql": sql,
                "results": attempt.results,
                "total_attempts": retry + 1,
            }

        return {"error": "All attempts failed", "attempts": len(attempts)}
```
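Result validation catches logic errors after the fact; syntax and schema errors can also be caught before a full execution with a dry run. A sketch using stdlib `sqlite3` purely for illustration — the agent above targets PostgreSQL, where prefixing the query with `EXPLAIN` plays the same role:

```python
import sqlite3
from typing import Optional

def dry_run_check(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return an error message if the query fails to compile, else None."""
    try:
        # EXPLAIN forces the database to parse and plan the statement
        # without running it, surfacing syntax and schema errors cheaply.
        conn.execute(f"EXPLAIN {sql}")
        return None
    except sqlite3.Error as e:
        return str(e)
```

A dry run costs milliseconds, so running it before the real query shortens the retry loop for the two most common error categories.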

## FAQ

### How many retries should I allow?

Three retries is the sweet spot. Most recoverable errors are fixed on the second attempt when the LLM sees the error message. A third attempt handles edge cases where the second fix introduces a new issue. Beyond three retries, the success rate drops sharply — if the model cannot fix it in three tries, it likely needs human intervention or a clearer question.

### Does error correction increase latency significantly?

On average, 85% of queries succeed on the first attempt, so the typical user experiences no added latency. For the 15% that need correction, each retry adds 1-2 seconds (one LLM call plus one database query). The total worst-case for three retries is about 6 seconds, which is acceptable for analytical queries where users expect some processing time.

### Can I use the error correction history to improve the model over time?

Yes. Log all correction events — the original question, the failed SQL, the error message, and the corrected SQL. This creates a training dataset of common mistakes. You can use it for few-shot examples in the prompt or for fine-tuning to reduce first-attempt errors over time.
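The logging described above can be sketched as a minimal JSONL appender; the file path and field names are illustrative:

```python
import json
from datetime import datetime, timezone

def log_correction(path: str, question: str, failed_sql: str,
                   error: str, corrected_sql: str) -> None:
    """Append one correction event as a JSON line for later few-shot mining."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "failed_sql": failed_sql,
        "error": error,
        "corrected_sql": corrected_sql,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
```

JSONL keeps appends atomic per line and loads directly into most fine-tuning and analysis pipelines.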

---

#ErrorCorrection #SelfHealing #TextToSQL #QueryRetry #AgenticAI #LLMDebugging #SQLFixing #AIReliability

---

Source: https://callsphere.ai/blog/text-to-sql-error-correction-self-healing-queries
