Learn Agentic AI

Tool Use in AI Agents: Extending LLM Capabilities with External Functions

Master the design and implementation of tools for AI agents — why tools matter, how to write effective tool descriptions, execution flow, error handling, and best practices for production tool systems.

Why Tools Are the Bridge Between Thinking and Doing

An LLM without tools is a brain without hands. It can reason, analyze, and generate text — but it cannot check the weather, query a database, send an email, or read a file. Tools are what turn a language model from a conversationalist into an agent that can affect the real world.

Tool use (also called function calling) is the mechanism by which an LLM requests the execution of an external function. The model does not run the function itself — it generates a structured request (function name + arguments), your code executes it, and the result is fed back into the model's context.

The Tool Execution Flow

Understanding the exact flow of a tool call is essential for debugging and designing reliable agents.

1. LLM receives messages + tool definitions
2. LLM decides to call a tool (instead of responding with text)
3. LLM outputs: {"tool": "search_db", "args": {"query": "overdue invoices"}}
4. Your code intercepts this, executes search_db(query="overdue invoices")
5. Your code appends the result as a tool message
6. LLM receives the result and decides what to do next
7. Repeat until LLM responds with text (no tool call)

The critical insight is that the LLM never executes anything. It only generates the intent to use a tool. Your application code is the executor, which means you have full control over permissions, validation, and error handling.
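That loop can be sketched in a few lines. Note that `call_llm` and the reply shape used here are hypothetical stand-ins — real providers (OpenAI, Anthropic) each have their own message and tool-call schemas — but the control flow is the same:

```python
import json
from typing import Any, Callable

def run_agent_loop(
    call_llm: Callable[[list[dict]], dict],   # stand-in for your model client
    tools: dict[str, Callable[..., Any]],     # name -> executable function
    messages: list[dict],
    max_steps: int = 10,
) -> str:
    """Ask the model, execute any tool it requests, feed the result back."""
    for _ in range(max_steps):
        reply = call_llm(messages)        # steps 1-3: model sees messages + tools
        if "tool" not in reply:           # step 7: plain text means we're done
            return reply["content"]
        name, args = reply["tool"], reply["args"]
        result = tools[name](**args)      # step 4: YOUR code executes the tool
        messages.append({                 # step 5: append result as a tool message
            "role": "tool",
            "name": name,
            "content": json.dumps(result),
        })
    return "Stopped: exceeded max_steps"  # safety valve against infinite loops
```

The `max_steps` cap is a practical necessity: without it, a confused model that keeps requesting tools can loop forever.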

Designing Effective Tools

Tool quality directly determines agent quality. A poorly designed tool confuses the LLM and leads to wrong arguments, unnecessary calls, or missed opportunities to use the right tool.

Good Tool Design Principles

# GOOD: Clear name, specific description, well-typed parameters
{
    "type": "function",
    "function": {
        "name": "search_invoices",
        "description": (
            "Search for invoices by status, client name, or date range. "
            "Returns up to 20 matching invoices with amount, status, and due date."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "status": {
                    "type": "string",
                    "enum": ["paid", "overdue", "pending", "cancelled"],
                    "description": "Filter by invoice status",
                },
                "client_name": {
                    "type": "string",
                    "description": "Partial or full client name to search for",
                },
                "due_before": {
                    "type": "string",
                    "description": "ISO date string. Return invoices due before this date.",
                },
            },
        },
    },
}

# BAD: Vague name, no description, untyped parameters
{
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search for stuff",
        "parameters": {
            "type": "object",
            "properties": {
                "q": {"type": "string"},
            },
        },
    },
}

The description is the most important field. The LLM reads it to decide when and how to use the tool. Write descriptions as if you were explaining the tool to a new team member — be specific about what it does, what it returns, and any limitations.


Building a Tool Registry

In production, you need a systematic way to register, discover, and execute tools. Here is a clean pattern:

from typing import Any, Callable
import json

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, dict] = {}
        self._executors: dict[str, Callable] = {}

    def register(self, func: Callable, description: str, parameters: dict):
        name = func.__name__
        self._tools[name] = {
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                "parameters": parameters,
            },
        }
        self._executors[name] = func

    def get_tool_definitions(self) -> list[dict]:
        return list(self._tools.values())

    def execute(self, name: str, arguments: dict) -> Any:
        if name not in self._executors:
            return {"error": f"Unknown tool: {name}"}
        try:
            return self._executors[name](**arguments)
        except TimeoutError:
            raise  # let the caller's timeout handling see this
        except Exception as e:
            return {"error": f"Tool execution failed: {str(e)}"}

# Usage
registry = ToolRegistry()

def get_weather(city: str, units: str = "celsius") -> dict:
    # In production, call a real weather API
    return {"city": city, "temperature": 22, "units": units, "condition": "sunny"}

registry.register(
    func=get_weather,
    description="Get current weather for a city. Returns temperature and conditions.",
    parameters={
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature units (default: celsius)",
            },
        },
        "required": ["city"],
    },
)
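Writing parameter schemas by hand is repetitive. One option is to derive a basic schema from a function's type hints with the standard-library `inspect` module. This is a minimal sketch that only maps `str`, `int`, `float`, and `bool` annotations and treats parameters without defaults as required:

```python
import inspect

# Assumed mapping from Python annotations to JSON Schema type names
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_signature(func) -> dict:
    """Derive a basic JSON-schema 'parameters' object from type hints.

    Parameters without a default value are marked as required.
    """
    props: dict = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        props[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}
```

Passing `parameters=schema_from_signature(get_weather)` to `register` removes one source of drift between a function and its schema, though hand-written per-parameter descriptions are still worth adding for the LLM's benefit.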

Error Handling in Tool Execution

Tools fail. APIs time out, databases go down, users pass invalid arguments. How you handle tool errors determines whether your agent recovers gracefully or spirals into confusion.

def safe_execute_tool(registry: ToolRegistry, name: str, raw_args: str) -> str:
    """Execute a tool with comprehensive error handling."""
    # Parse arguments
    try:
        arguments = json.loads(raw_args)
    except json.JSONDecodeError as e:
        return json.dumps({
            "error": "Invalid arguments format",
            "details": str(e),
            "suggestion": "Please provide valid JSON arguments",
        })

    # Execute with timeout protection
    try:
        result = registry.execute(name, arguments)
        return json.dumps(result, default=str)
    except TimeoutError:
        return json.dumps({
            "error": f"Tool '{name}' timed out",
            "suggestion": "Try again with a simpler query or different parameters",
        })
    except Exception as e:
        return json.dumps({
            "error": f"Tool '{name}' failed: {str(e)}",
            "suggestion": "Check the arguments and try again",
        })

The key insight is to always return structured error messages to the LLM, not raw exceptions. Include a suggestion field — it guides the LLM toward recovery instead of just repeating the same failing call.

Tool Permissions and Safety

Not all tools should be available to all agents. A customer-facing agent should not have access to delete_database. Implement tool-level permissions:

class PermissionedToolRegistry(ToolRegistry):
    def __init__(self):
        super().__init__()
        self._permissions: dict[str, str] = {}  # tool_name -> permission level

    def register(self, func: Callable, description: str, parameters: dict, permission: str = "read"):
        super().register(func, description, parameters)
        self._permissions[func.__name__] = permission

    def get_tools_for_level(self, level: str) -> list[dict]:
        levels = {"read": 0, "write": 1, "admin": 2}
        max_level = levels.get(level, 0)
        return [
            self._tools[name]
            for name, perm in self._permissions.items()
            if levels.get(perm, 0) <= max_level
        ]
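The level comparison is worth seeing in isolation. This standalone sketch reproduces the same filtering logic with hypothetical tool names:

```python
# Same ordering as in the registry: higher number = more privileged
LEVELS = {"read": 0, "write": 1, "admin": 2}

def tools_for_level(permissions: dict[str, str], level: str) -> list[str]:
    """Return tool names whose permission is at or below the agent's level."""
    max_level = LEVELS.get(level, 0)  # unknown levels default to read-only
    return [
        name
        for name, perm in permissions.items()
        if LEVELS.get(perm, 0) <= max_level
    ]

# Hypothetical tool catalog
perms = {
    "search_invoices": "read",
    "update_invoice": "write",
    "delete_database": "admin",
}
```

A customer-facing agent given `level="read"` sees only `search_invoices`; an internal admin agent sees all three. Defaulting unknown levels to read-only fails safe.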

FAQ

How many tools should an agent have access to?

Keep it under 20 for most agents. In practice, LLM tool-selection accuracy degrades as the number of available tools grows, since the model has more descriptions to weigh against each other. If you need more, use a router pattern: a first LLM call selects the relevant tool category, then a second call picks the specific tool from a smaller set.
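A minimal sketch of that router pattern, where `select` stands in for an LLM call that picks one option from a short list (the category names and tools here are hypothetical):

```python
from typing import Callable

# Hypothetical tool catalog grouped by category
TOOL_CATEGORIES = {
    "invoicing": ["search_invoices", "create_invoice", "send_reminder"],
    "weather": ["get_weather", "get_forecast"],
}

def route_tool(query: str, select: Callable[[str, list[str]], str]) -> str:
    """Two-stage selection: pick a category first, then a tool within it.

    Each `select` call sees only a short option list, which keeps
    selection accuracy high even with a large total tool catalog.
    """
    category = select(query, list(TOOL_CATEGORIES))  # first call: few categories
    return select(query, TOOL_CATEGORIES[category])  # second call: small tool set
```

In production, `select` would be an LLM call with an enum-constrained parameter; the structure is what matters here.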

Should tool descriptions include examples?

Yes, especially for tools with complex parameters. Including a brief example in the description (like "Example: search_invoices(status='overdue', client_name='Acme')") significantly improves the LLM's ability to construct correct arguments.

How do I test tools independently from the agent?

Write unit tests for each tool function that verify correct outputs for valid inputs and proper error handling for invalid inputs. Then write integration tests that run the full agent loop with mock tool responses to verify the agent calls tools correctly. Test tools in isolation before testing them within the agent.
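As a concrete example, assertion-style unit tests for the `get_weather` function from the registry section might look like this (the function is repeated so the tests stand alone):

```python
def get_weather(city: str, units: str = "celsius") -> dict:
    # In production, call a real weather API
    return {"city": city, "temperature": 22, "units": units, "condition": "sunny"}

def test_valid_input():
    result = get_weather("Berlin")
    assert result["city"] == "Berlin"
    assert result["units"] == "celsius"  # default applied

def test_explicit_units():
    assert get_weather("Austin", units="fahrenheit")["units"] == "fahrenheit"

def test_unknown_argument_raises_typeerror():
    # The registry's execute() catches this and returns a structured error,
    # but the tool itself should fail loudly when called directly.
    try:
        get_weather(city="Berlin", nonexistent="x")
    except TypeError:
        pass
    else:
        raise AssertionError("expected TypeError for unknown argument")
```

These run under pytest as-is, or directly as plain functions.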


#ToolUse #FunctionCalling #AIAgents #Python #APIDesign #AgenticAI #LearnAI #AIEngineering


Written by CallSphere Team
