---
title: "Building a Data Analysis Agent: Natural Language to SQL and Visualizations"
description: "Learn how to build an AI agent that converts natural language questions into SQL queries, executes them against a database, generates charts from the results, and provides plain-English interpretations of the data."
canonical: https://callsphere.ai/blog/building-data-analysis-agent-natural-language-sql-visualizations
category: "Learn Agentic AI"
tags: ["Data Analysis", "Text-to-SQL", "Visualization", "Python", "AI Agents"]
author: "CallSphere Team"
published: 2026-03-17T00:00:00.000Z
updated: 2026-05-06T01:02:43.442Z
---

# Building a Data Analysis Agent: Natural Language to SQL and Visualizations

> Learn how to build an AI agent that converts natural language questions into SQL queries, executes them against a database, generates charts from the results, and provides plain-English interpretations of the data.

## Why Data Analysis Needs an Agent Layer

Most organizations store critical business data in SQL databases, but only a fraction of employees know how to write SQL. A data analysis agent bridges this gap by accepting natural language questions like "What were our top 5 products by revenue last quarter?" and returning both the answer and a visualization — no SQL knowledge required.

This is not just a text-to-SQL translator. A proper data analysis agent forms a loop: it understands the schema, generates a query, executes it, checks for errors, builds a chart if appropriate, and explains the results in plain language. Each of these steps requires a different tool, and the agent orchestrates them autonomously.

## Architecture Overview

The agent needs four core tools:

```mermaid
flowchart LR
    INPUT(["User intent"])
    PARSE["Parse plus
classify"]
    PLAN["Plan and tool
selection"]
    AGENT["Agent loop
LLM plus tools"]
    GUARD{"Guardrails
and policy"}
    EXEC["Execute and
verify result"]
    OBS[("Trace and metrics")]
    OUT(["Outcome plus
next action"])
    INPUT --> PARSE --> PLAN --> AGENT --> GUARD
    GUARD -->|Pass| EXEC --> OUT
    GUARD -->|Fail| AGENT
    AGENT --> OBS
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style OBS fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff
```

1. **Schema inspector** — retrieves table names, columns, and types from the database
2. **SQL executor** — runs a generated query and returns rows
3. **Chart generator** — creates visualizations from tabular results
4. **Interpreter** — produces a natural language summary of the data

Here is the foundational setup:

```python
import sqlite3
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import io
import base64
from agents import Agent, Runner, function_tool

DB_PATH = "sales.db"

def get_connection():
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    return conn
```

## Tool 1: Schema Inspector

The agent must understand the database structure before writing SQL. This tool returns every table with its columns and types:

```python
@function_tool
def inspect_schema() -> str:
    """Return all table names, column names, and column types in the database."""
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
    tables = [row["name"] for row in cursor.fetchall()]

    schema_parts = []
    for table in tables:
        cursor.execute(f"PRAGMA table_info({table})")
        cols = cursor.fetchall()
        col_defs = [f"  {c['name']} ({c['type']})" for c in cols]
        schema_parts.append(f"Table: {table}\n" + "\n".join(col_defs))

    conn.close()
    return "\n\n".join(schema_parts)
```

## Tool 2: SQL Executor with Safety Checks

Never let an LLM run arbitrary write operations on your database. The executor validates that the query is read-only before running it:

```python
@function_tool
def execute_sql(query: str) -> str:
    """Execute a read-only SQL query and return up to 50 rows as text."""
    normalized = query.strip().upper()
    if not normalized.startswith("SELECT"):
        return "Error: Only SELECT queries are allowed."

    conn = get_connection()
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        rows = cursor.fetchmany(50)
        if not rows:
            return "Query returned 0 rows."
        headers = rows[0].keys()
        lines = ["\t".join(headers)]
        for row in rows:
            lines.append("\t".join(str(row[h]) for h in headers))
        return "\n".join(lines)
    except Exception as e:
        return f"SQL Error: {e}"
    finally:
        conn.close()
```

## Tool 3: Chart Generator

When the data suits a visual representation, the agent can produce a bar chart, line chart, or pie chart:

```python
@function_tool
def generate_chart(
    chart_type: str, labels: list[str], values: list[float], title: str
) -> str:
    """Create a chart and return it as a base64-encoded PNG image.
    chart_type must be one of: bar, line, pie."""
    fig, ax = plt.subplots(figsize=(8, 5))
    if chart_type == "bar":
        ax.bar(labels, values)
    elif chart_type == "line":
        ax.plot(labels, values, marker="o")
    elif chart_type == "pie":
        ax.pie(values, labels=labels, autopct="%1.1f%%")
    else:
        return "Error: Unsupported chart type."

    ax.set_title(title)
    fig.tight_layout()

    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    buf.seek(0)
    return base64.b64encode(buf.read()).decode()
```

## Assembling the Agent

Wire the tools together and give the agent clear instructions about its workflow:

```python
data_analyst = Agent(
    name="Data Analyst",
    instructions="""You are a data analysis agent. When the user asks a question:
1. Call inspect_schema to understand the database structure.
2. Write a SQL query to answer the question. Use execute_sql to run it.
3. If the query fails, read the error and fix the SQL.
4. If the results suit a chart, call generate_chart.
5. Always end with a plain-English interpretation of the findings.""",
    tools=[inspect_schema, execute_sql, generate_chart],
)

result = Runner.run_sync(
    data_analyst, "Which product category had the highest revenue last month?"
)
print(result.final_output)
```

The agent loop handles the rest. It inspects the schema, discovers the `orders` and `products` tables, writes a JOIN with a date filter, executes the query, generates a bar chart, and summarizes: "Electronics led with $42,300 in revenue last month, followed by Apparel at $31,800."

## Production Hardening Tips

**Query cost limits.** Wrap the executor with a timeout and a row cap so a poorly written query cannot lock the database or return millions of rows.

**Schema caching.** Call `inspect_schema` once per session and inject the result into the agent instructions rather than calling the tool on every question.

**Parameterized queries.** For databases with user-supplied filter values, extend the executor to accept parameters and use parameterized queries to prevent SQL injection.

## FAQ

### Can this agent handle JOINs across multiple tables?

Yes. By providing the full schema — including foreign key relationships — the LLM reliably generates multi-table JOINs. Include sample rows in the schema description if the column names are ambiguous.

### How do I prevent the agent from running destructive queries?

The executor shown above rejects any query that does not start with SELECT. For stronger guarantees, connect with a read-only database user that has no INSERT, UPDATE, or DELETE privileges at the database level.

### What if the generated SQL is incorrect?

The agent loop naturally handles this. When `execute_sql` returns an error message, the LLM reads it, identifies the issue (wrong column name, missing GROUP BY), and generates a corrected query on the next iteration.

---

#DataAnalysis #TexttoSQL #Visualization #Python #AIAgents #AgenticAI #LearnAI #AIEngineering

---

Source: https://callsphere.ai/blog/building-data-analysis-agent-natural-language-sql-visualizations
