
Text-to-SQL with Claude: Using Anthropic's API for Database Question Answering

Implement a text-to-SQL system using Anthropic's Claude API with tool use for SQL execution, multi-turn conversations, and structured output parsing for reliable database question answering.

Why Claude for Text-to-SQL?

Claude excels at text-to-SQL for several reasons: its large context window handles massive schemas without truncation, its tool use feature enables agentic SQL execution loops, and its instruction-following precision reduces the need for complex prompt engineering. Claude also tends to be conservative — it asks for clarification rather than guessing, which is exactly what you want when generating database queries.

Setting Up the Anthropic Client

import anthropic
import sqlite3
from typing import Any

client = anthropic.Anthropic()  # Uses ANTHROPIC_API_KEY env var

def get_schema(db_path: str) -> str:
    conn = sqlite3.connect(db_path)
    cursor = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND sql IS NOT NULL"
    )
    schema = "\n\n".join(row[0] for row in cursor.fetchall())
    conn.close()
    return schema

def execute_query(db_path: str, sql: str) -> list[dict]:
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    cursor = conn.execute(sql)
    results = [dict(row) for row in cursor.fetchall()]
    conn.close()
    return results
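The two helpers above can be sanity-checked against a throwaway in-memory database before wiring them to Claude. This is a sketch; the orders table and its rows are made up for illustration, and the queries mirror the ones get_schema and execute_query run:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, month TEXT);
    INSERT INTO orders (amount, month) VALUES (120.0, 'Jan'), (80.0, 'Feb');
""")

# Same query get_schema() runs: pull CREATE statements from sqlite_master
schema = "\n\n".join(
    row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type='table' AND sql IS NOT NULL"
    )
)
print(schema)  # CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, month TEXT)

# Same pattern execute_query() uses: sqlite3.Row makes rows dict-convertible
conn.row_factory = sqlite3.Row
rows = [dict(r) for r in conn.execute("SELECT month, amount FROM orders")]
print(rows)  # [{'month': 'Jan', 'amount': 120.0}, {'month': 'Feb', 'amount': 80.0}]
```

The schema string is exactly what gets interpolated into the system prompt later, so checking it here confirms Claude will see complete CREATE TABLE statements.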

Using Claude's Tool Use for SQL Execution

Claude's tool use feature lets the model decide when to execute a SQL query and inspect the results. This is more powerful than a single-shot prompt because Claude can run a query, check the output, and refine it.

tools = [
    {
        "name": "execute_sql",
        "description": "Execute a SQL query against the database and return results. "
                       "Use this to answer questions about the data.",
        "input_schema": {
            "type": "object",
            "properties": {
                "sql": {
                    "type": "string",
                    "description": "The SQL query to execute. Must be a SELECT statement.",
                },
                "reasoning": {
                    "type": "string",
                    "description": "Brief explanation of why this query answers the question.",
                },
            },
            "required": ["sql", "reasoning"],
        },
    }
]

SYSTEM_PROMPT = """You are a database analyst. You answer questions about data
by writing and executing SQL queries.

Database schema:
{schema}

Rules:
- Only use SELECT statements
- Always include LIMIT 100 unless the user asks for all results
- Use the execute_sql tool to run queries
- After seeing results, provide a natural language summary
- If a query fails, analyze the error and try a corrected version"""

The Multi-Turn Conversation Loop

The key pattern with Claude's tool use is a message loop: send the question, check if Claude wants to use a tool, execute the tool, feed the result back, and repeat until Claude provides a final text response.

def ask_database(question: str, db_path: str) -> str:
    schema = get_schema(db_path)
    messages = [{"role": "user", "content": question}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=SYSTEM_PROMPT.format(schema=schema),
            tools=tools,
            messages=messages,
        )

        # Check if Claude wants to use a tool
        if response.stop_reason == "tool_use":
            # Claude may request several tool calls in one turn; every
            # tool_use block needs a matching tool_result in the reply.
            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue
                sql = block.input["sql"]

                # Execute the query safely
                try:
                    results = execute_query(db_path, sql)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(results[:50]),  # Cap result size
                    })
                except Exception as e:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"Error: {e}",
                        "is_error": True,
                    })

            # Add assistant response and tool results to the conversation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            # Claude provided a final text answer
            text_block = next(b for b in response.content if b.type == "text")
            return text_block.text

Handling Multi-Step Questions

Claude naturally handles questions that require multiple queries. When asked "Compare this quarter's sales to the same quarter last year," Claude will execute two queries — one for each period — then synthesize the results into a comparison.

# Claude will automatically execute multiple queries
answer = ask_database(
    "Compare the average order value in January vs February, "
    "and tell me which product category drove the difference.",
    "sales.db"
)
print(answer)

Behind the scenes, Claude might execute three tool calls: one for January averages, one for February averages, and a third breaking down by product category. The conversation loop handles this seamlessly.


Adding Guardrails

Before executing any query, validate it:

import re
import sqlparse

def validate_query(sql: str) -> tuple[bool, str]:
    """Ensure the query is safe to execute."""
    parsed = sqlparse.parse(sql)
    if not parsed:
        return False, "Empty query"

    statement_type = parsed[0].get_type()
    if statement_type != "SELECT":
        return False, f"Only SELECT allowed, got {statement_type}"

    # Match whole words so identifiers like "updated_at" aren't falsely flagged
    dangerous_keywords = ["DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE"]
    for keyword in dangerous_keywords:
        if re.search(rf"\b{keyword}\b", sql, re.IGNORECASE):
            return False, f"Forbidden keyword: {keyword}"

    return True, "OK"

Integrate this into the tool execution step so no unsafe query ever reaches the database.
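One way to wire the check in is a small wrapper that every tool call goes through. This is a sketch: validate_query here is a simplified, stdlib-only stand-in for the sqlparse version above (so the snippet runs without third-party packages), and the users table is illustrative:

```python
import re
import sqlite3

def validate_query(sql: str) -> tuple[bool, str]:
    # Simplified stand-in for the sqlparse-based validator above
    if not sql.strip():
        return False, "Empty query"
    if not sql.lstrip().upper().startswith("SELECT"):
        return False, "Only SELECT allowed"
    for keyword in ("DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE"):
        if re.search(rf"\b{keyword}\b", sql, re.IGNORECASE):
            return False, f"Forbidden keyword: {keyword}"
    return True, "OK"

def safe_execute(conn: sqlite3.Connection, sql: str) -> list[dict]:
    # Gate every query behind validation before it touches the database
    ok, reason = validate_query(sql)
    if not ok:
        raise ValueError(f"Rejected query: {reason}")
    conn.row_factory = sqlite3.Row
    return [dict(r) for r in conn.execute(sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")

print(safe_execute(conn, "SELECT name FROM users"))  # [{'name': 'Ada'}]
try:
    safe_execute(conn, "DROP TABLE users")
except ValueError as e:
    print(e)
```

In the conversation loop, the rejection message can be returned to Claude as an is_error tool result instead of raising, which lets the model rewrite the query itself.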

FAQ

Does Claude handle complex JOINs as well as GPT-4?

Yes. Claude Sonnet and Opus both handle multi-table JOINs, subqueries, and CTEs with high accuracy. Claude tends to write more explicit JOIN conditions and is less likely to produce ambiguous column references, which reduces runtime errors in practice.

How does tool use differ from just asking Claude to output SQL?

With tool use, Claude can execute the query, see the results, and self-correct. If a query returns zero rows or an error, Claude can diagnose the problem and try a different approach. Single-shot SQL generation has no feedback loop, so errors are returned directly to the user.

What is the cost of running text-to-SQL with Claude?

A typical question requires 1-3 API calls. With Claude Sonnet, each call costs roughly $0.003-0.01 depending on the schema size and response length. For most applications, the cost per question is under $0.03, making it highly cost-effective compared to building custom NLU pipelines.
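To track spend per question, the token counts on each API response can be converted to dollars. This is a sketch: the per-million-token prices are assumptions stood in for Sonnet-tier rates, and you should substitute current numbers from Anthropic's pricing page:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float = 3.00,
                  output_price_per_m: float = 15.00) -> float:
    """Estimate USD cost of one API call from token counts.

    Default prices are assumed per-million-token rates; check
    Anthropic's pricing page for current values.
    """
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# e.g. a 2,000-token schema-heavy prompt with a 300-token answer
print(round(estimate_cost(2000, 300), 4))  # 0.0105
```

In the loop above, response.usage.input_tokens and response.usage.output_tokens supply these counts, so you can accumulate a running total across the tool-use turns of a single question.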


#Claude #Anthropic #TextToSQL #ToolUse #DatabaseAgent #AgenticAI #NaturalLanguageSQL #PythonAI

Written by

CallSphere Team

