---
title: "Returning Rich Output from Agent Tools: Images, Files, and Structured Data"
description: "Go beyond plain text responses. Learn how to return images, files, and structured data from OpenAI Agents SDK tools using ToolOutputImage, ToolOutputFileContent, and ToolOutputText."
canonical: https://callsphere.ai/blog/returning-rich-output-agent-tools-images-files-structured-data
category: "Learn Agentic AI"
tags: ["OpenAI", "Tools", "Rich Output", "Images", "Files"]
author: "CallSphere Team"
published: 2026-03-14T00:00:00.000Z
updated: 2026-05-06T13:34:30.715Z
---

# Returning Rich Output from Agent Tools: Images, Files, and Structured Data

> Go beyond plain text responses. Learn how to return images, files, and structured data from OpenAI Agents SDK tools using ToolOutputImage, ToolOutputFileContent, and ToolOutputText.

## Beyond Plain Text Tool Outputs

Most tool examples return simple strings. But real-world agents need to return charts, generated files, images, and structured data. The OpenAI Agents SDK provides dedicated output types that let your tools return rich content alongside text.

The three output types are:

- **ToolOutputText** — explicit text output (useful when combining with other types)
- **ToolOutputImage** — base64-encoded images that the model can see and reason about
- **ToolOutputFileContent** — file content (CSV, PDF, etc.) as base64 data

## Returning Images with ToolOutputImage

When your tool generates a chart, screenshot, or any visual content, wrap it in a `ToolOutputImage` so the agent can interpret the image:

```mermaid
flowchart LR
    INPUT(["User intent"])
    PARSE["Parse plus
classify"]
    PLAN["Plan and tool
selection"]
    AGENT["Agent loop
LLM plus tools"]
    GUARD{"Guardrails
and policy"}
    EXEC["Execute and
verify result"]
    OBS[("Trace and metrics")]
    OUT(["Outcome plus
next action"])
    INPUT --> PARSE --> PLAN --> AGENT --> GUARD
    GUARD -->|Pass| EXEC --> OUT
    GUARD -->|Fail| AGENT
    AGENT --> OBS
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style OBS fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff
```

```python
import base64
from agents import function_tool
from agents.tool import ToolOutputImage

@function_tool
def generate_chart(data_points: str) -> ToolOutputImage:
    """Generate a bar chart from comma-separated values."""
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt
    import io

    values = [float(x.strip()) for x in data_points.split(",")]
    labels = [f"Item {i+1}" for i in range(len(values))]

    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_title("Generated Chart")

    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    buf.seek(0)
    plt.close(fig)

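    image_base64 = base64.b64encode(buf.read()).decode("utf-8")
    # ToolOutputImage is built from `image_url` (a regular URL or a data URL) or
    # `file_id`; it has no `image_data`/`media_type` fields, so the base64 payload
    # goes into a data URL. Verify against your installed SDK version.
    return ToolOutputImage(image_url=f"data:image/png;base64,{image_base64}")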
```

The agent receives the image and can describe it, answer questions about it, or reference it in its response. This is particularly useful for data visualization, diagram generation, and screenshot analysis tools.
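Encoding is easy to get subtly wrong, for example by passing an empty buffer or forgetting to decode the base64 bytes to `str`. A minimal stdlib-only sketch of a helper that centralizes this step is shown here. The name `to_data_url` and the `MAX_IMAGE_BYTES` cap are assumptions made for illustration and are not part of the SDK; the helper builds a `data:` URL, a common way to carry a base64 image in a single string.

```python
import base64

# Hypothetical size cap chosen for illustration; not an SDK constant.
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def to_data_url(raw: bytes, media_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    if not raw:
        raise ValueError("image payload is empty")
    if len(raw) > MAX_IMAGE_BYTES:
        raise ValueError(f"image is {len(raw)} bytes; limit is {MAX_IMAGE_BYTES}")
    # b64encode returns bytes; decode so the result can be embedded in a str.
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{media_type};base64,{encoded}"
```

Inside a chart tool, the call would be `to_data_url(buf.getvalue())` after saving the figure to the buffer. Rejecting oversized images early keeps a single tool call from flooding the model's context.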

## Returning Files with ToolOutputFileContent

For tools that produce downloadable content — CSV exports, generated PDFs, configuration files — use `ToolOutputFileContent`:

```python
import base64
import csv
import io
from agents import function_tool
from agents.tool import ToolOutputFileContent

@function_tool
def export_report_csv(report_type: str) -> ToolOutputFileContent:
    """Export a report as a CSV file."""
    output = io.StringIO()
    writer = csv.writer(output)
    writer.writerow(["Month", "Revenue", "Expenses", "Profit"])
    writer.writerow(["January", "50000", "30000", "20000"])
    writer.writerow(["February", "55000", "32000", "23000"])
    writer.writerow(["March", "60000", "31000", "29000"])

    csv_bytes = output.getvalue().encode("utf-8")
    file_base64 = base64.b64encode(csv_bytes).decode("utf-8")

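    # ToolOutputFileContent accepts `file_data`, `file_url`, or `file_id`, plus an
    # optional `filename`; it has no `media_type` field. Verify against your
    # installed SDK version.
    return ToolOutputFileContent(
        file_data=file_base64,
        filename="report.csv",
    )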
```

The model receives the file content and can summarize it, extract specific values, or explain what the file contains.
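Because the file travels as a base64 string, it is worth checking that the payload decodes back to the rows you intended before handing it to the model. The following is a small sketch using only the standard library; `encode_csv` and `decode_csv` are hypothetical helper names, not SDK functions.

```python
import base64
import csv
import io


def encode_csv(rows: list[list[str]]) -> str:
    """Serialize rows to CSV text and return it as a base64 string."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return base64.b64encode(buffer.getvalue().encode("utf-8")).decode("ascii")


def decode_csv(file_base64: str) -> list[list[str]]:
    """Reverse encode_csv: turn the base64 string back into a list of rows."""
    text = base64.b64decode(file_base64).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


rows = [
    ["Month", "Revenue", "Expenses", "Profit"],
    ["January", "50000", "30000", "20000"],
]
payload = encode_csv(rows)
print(decode_csv(payload) == rows)
```

A round-trip check like this fits naturally in a unit test for the tool, where it catches encoding mistakes without needing a model call.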

## Combining Multiple Output Types

A single tool can return multiple outputs by returning a list. For example, a reporting tool might return both a chart image and the underlying data as text:

```python
import base64
import io
from agents import function_tool
from agents.tool import ToolOutputImage, ToolOutputText

@function_tool
def sales_dashboard(quarter: str) -> list:
    """Generate a sales dashboard with chart and summary for the given quarter."""
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    months = ["Month 1", "Month 2", "Month 3"]
    revenue = [45000, 52000, 61000]

    fig, ax = plt.subplots()
    ax.plot(months, revenue, marker="o")
    ax.set_title(f"Revenue Trend - {quarter}")
    ax.set_ylabel("Revenue ($)")

    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    buf.seek(0)
    plt.close(fig)

    image_data = base64.b64encode(buf.read()).decode("utf-8")
    total = sum(revenue)

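    return [
        # ToolOutputImage takes `image_url` (URL or data URL) or `file_id`,
        # not `image_data`/`media_type`. Verify against your installed SDK version.
        ToolOutputImage(image_url=f"data:image/png;base64,{image_data}"),
        ToolOutputText(text=f"Total revenue for {quarter}: ${total:,}. Growth rate: 35.6% from Month 1 to Month 3."),
    ]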
```

When you return a list, the SDK sends each item as a separate content block in the tool response. The agent sees all of them and can reference both the chart and the text in its reply.
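The growth rate in the summary text above is written as a literal, so it would silently go stale if the revenue figures changed. One way to keep the text and the chart in sync is to derive both from the same list. The sketch below assumes a hypothetical helper named `revenue_summary`; it is plain Python and does not depend on the SDK.

```python
def revenue_summary(quarter: str, revenue: list[float]) -> str:
    """Build the summary sentence from the same data that feeds the chart."""
    if len(revenue) < 2:
        raise ValueError("need at least two data points to compute growth")
    if revenue[0] == 0:
        raise ValueError("first data point is zero; growth rate is undefined")

    total = sum(revenue)
    # Percentage change from the first month to the last month.
    growth = (revenue[-1] - revenue[0]) / revenue[0] * 100
    return (
        f"Total revenue for {quarter}: ${total:,.0f}. "
        f"Growth rate: {growth:.1f}% from Month 1 to Month {len(revenue)}."
    )


print(revenue_summary("Q1", [45000, 52000, 61000]))
```

The returned string can be passed as the `text` argument of `ToolOutputText` in place of the hardcoded sentence.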

## Using ToolOutputText for Explicit Text

You might wonder why `ToolOutputText` exists when you can just return a string. The answer is composition — when you return a list of outputs, you need explicit types for each element:

```python
from agents import function_tool
from agents.tool import ToolOutputText, ToolOutputImage

@function_tool
def analyze_image(image_url: str) -> list:
    """Download an image, analyze it, and return both the image and analysis."""
    # Download and process the image...
    image_base64 = "..."  # base64 encoded image

    return [
        # ToolOutputImage takes `image_url` (URL or data URL) or `file_id`,
        # not `image_data`/`media_type`. Verify against your installed SDK version.
        ToolOutputImage(image_url=f"data:image/jpeg;base64,{image_base64}"),
        ToolOutputText(text="Analysis: The image contains a landscape with mountains and a lake. Dominant colors are blue and green."),
    ]
```

## Structured Data as Tool Output

For tools that return structured data (JSON, tables, records), the simplest approach is to serialize the data as a readable JSON string:

```python
import json
from agents import function_tool

@function_tool
def get_customer_profile(customer_id: str) -> str:
    """Retrieve a customer's full profile."""
    profile = {
        "id": customer_id,
        "name": "Jane Smith",
        "plan": "Enterprise",
        "usage": {"api_calls": 15420, "storage_gb": 42.3},
        "status": "active",
    }
    return json.dumps(profile, indent=2)
```

The agent parses JSON naturally and can extract specific fields when answering user questions. For very large or complex data, consider summarizing it in the tool before returning.
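As a rough illustration of summarizing inside the tool, the sketch below caps the number of records returned and reports how many were left out, so the model knows the list is partial. The helper name `compact_json` and the `max_items` default are assumptions for this example.

```python
import json


def compact_json(records: list[dict], max_items: int = 5) -> str:
    """Return a JSON string with at most max_items records plus count metadata."""
    payload = {
        "total_count": len(records),
        "returned": min(len(records), max_items),
        "truncated": len(records) > max_items,
        "items": records[:max_items],
    }
    # Compact separators drop the whitespace that indent=2 would add.
    return json.dumps(payload, separators=(",", ":"))


orders = [{"id": i, "status": "shipped"} for i in range(12)]
print(compact_json(orders, max_items=3))
```

Including `total_count` and `truncated` lets the agent tell the user that more results exist instead of presenting a partial list as complete.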

## Key Takeaways

- Use `ToolOutputImage` to return charts, screenshots, and generated visuals
- Use `ToolOutputFileContent` for downloadable files like CSVs and PDFs
- Return a list to combine multiple output types from a single tool call
- Use `ToolOutputText` when mixing text with other output types in a list
- For structured data, JSON strings work well — the model parses them naturally
- Always base64-encode binary content before wrapping in output types

---

Source: https://callsphere.ai/blog/returning-rich-output-agent-tools-images-files-structured-data