Plan-and-Execute Agents: Separating Planning from Execution for Complex Tasks

Why Separate Planning from Execution?

Most basic AI agents operate in a tight loop: observe, think, act, repeat. This works for simple tasks, but breaks down on complex multi-step problems. The agent gets lost in execution details and loses sight of the overall strategy.

Plan-and-execute agents solve this by introducing a clear separation of concerns. A planner agent creates a high-level plan, and an executor agent carries out each step. After each step, a replanner evaluates progress and adjusts the plan if needed.

This mirrors how experienced engineers work: you sketch out an architecture before writing code, and you revise the plan when you hit unexpected obstacles.

The Architecture

The system has three components:

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live →

Try Live Demo →

flowchart LR
    INPUT(["User intent"])
    PARSE["Parse plus<br/>classify"]
    PLAN["Plan and tool<br/>selection"]
    AGENT["Agent loop<br/>LLM plus tools"]
    GUARD{"Guardrails<br/>and policy"}
    EXEC["Execute and<br/>verify result"]
    OBS[("Trace and metrics")]
    OUT(["Outcome plus<br/>next action"])
    INPUT --> PARSE --> PLAN --> AGENT --> GUARD
    GUARD -->|Pass| EXEC --> OUT
    GUARD -->|Fail| AGENT
    AGENT --> OBS
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style OBS fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff

Planner — takes the original task and produces an ordered list of steps
Executor — takes a single step and executes it using tools or reasoning
Replanner — reviews completed steps and remaining steps, then decides whether to continue, modify, or replace the plan

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class Plan(BaseModel):
    steps: list[str]
    current_step: int = 0

class StepResult(BaseModel):
    step: str
    output: str
    success: bool

def create_plan(task: str) -> Plan:
    """Planner agent: decompose task into ordered steps."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a planning agent. Break the task into 3-7 "
                "concrete, sequential steps. Each step should be "
                "independently executable. Return a JSON list of steps."
            )},
            {"role": "user", "content": f"Task: {task}"},
        ],
        response_format={"type": "json_object"},
    )
    import json
    data = json.loads(response.choices[0].message.content)
    return Plan(steps=data["steps"])

The Executor

The executor focuses on a single step at a time, with access to tools and context from previous steps:

def execute_step(step: str, context: list[StepResult]) -> StepResult:
    """Executor agent: carry out a single step."""
    context_str = "\n".join(
        f"Step: {r.step} -> Result: {r.output}" for r in context
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are an execution agent. Complete the given step "
                "using the context from previous steps. Be precise "
                "and thorough."
            )},
            {"role": "user", "content": (
                f"Previous results:\n{context_str}\n\n"
                f"Current step to execute: {step}"
            )},
        ],
    )
    output = response.choices[0].message.content
    return StepResult(step=step, output=output, success=True)

Replanning on Failure

The real power of this architecture emerges when things go wrong. Instead of blindly continuing, the replanner can adapt:

def replan_if_needed(
    original_task: str,
    plan: Plan,
    results: list[StepResult],
) -> Plan:
    """Replanner: assess progress and adjust the plan."""
    completed = results[-1] if results else None

    if completed and not completed.success:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": (
                    "You are a replanning agent. The last step failed. "
                    "Analyze why and create a revised plan for the "
                    "remaining work. You may add, remove, or reorder steps."
                )},
                {"role": "user", "content": (
                    f"Original task: {original_task}\n"
                    f"Failed step: {completed.step}\n"
                    f"Error: {completed.output}\n"
                    f"Remaining steps: {plan.steps[plan.current_step:]}"
                )},
            ],
            response_format={"type": "json_object"},
        )
        import json
        data = json.loads(response.choices[0].message.content)
        return Plan(steps=data["steps"])

    return plan  # no replanning needed

The Orchestration Loop

Tying it all together:

def plan_and_execute(task: str, max_replans: int = 3) -> list[StepResult]:
    plan = create_plan(task)
    results: list[StepResult] = []
    replans = 0

    while plan.current_step < len(plan.steps):
        step = plan.steps[plan.current_step]
        print(f"Executing step {plan.current_step + 1}: {step}")

        result = execute_step(step, results)
        results.append(result)

        if not result.success and replans < max_replans:
            plan = replan_if_needed(task, plan, results)
            replans += 1
            continue

        plan.current_step += 1

    return results

When to Use Plan-and-Execute

This architecture shines for tasks like research reports (plan sections, write each, revise), data pipelines (plan transforms, execute sequentially), and code generation (plan modules, implement each). It adds overhead for simple tasks, so use a standard ReAct agent when the task requires fewer than three steps.

Still reading? Stop comparing — try CallSphere live.

CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.

Try Live Demo → Book 30-min Walkthrough See Pricing

FAQ

How granular should the plan steps be?

Each step should be completable in a single LLM call with tool access. If a step requires sub-planning, it is too coarse. Aim for 3-7 steps for most tasks. The planner can always decompose further during replanning.

How does this compare to ReAct agents?

ReAct interleaves reasoning and action in a single loop. Plan-and-execute separates them explicitly. ReAct is better for exploratory tasks where the path is unclear. Plan-and-execute is better for structured tasks where you can outline the approach upfront.

What happens if replanning keeps failing?

Set a max_replans limit (typically 2-3). If the agent exhausts its replans, return partial results with a clear failure report. In production, this should trigger a human-in-the-loop escalation.

#PlanAndExecute #AgentArchitecture #TaskPlanning #Replanning #AgenticAI #LangGraph #PythonAI #AIEngineering

Plan-and-Execute Agents: Separating Planning from Execution for Complex Tasks

Why Separate Planning from Execution?

The Architecture

The Executor

Replanning on Failure

The Orchestration Loop

When to Use Plan-and-Execute

FAQ

How granular should the plan steps be?

How does this compare to ReAct agents?

What happens if replanning keeps failing?

Try CallSphere AI Voice Agents

Related Articles You May Like

Graphiti: How Temporal Knowledge Graphs Give AI Voice Agents Persistent Memory (2026 Guide)

Building an HVAC After-Hours Emergency Escalation System: A Complete Engineering Guide

Anatomy of an AI Pitchbook Builder Powered by Claude Opus 4.7

ReAct Loop vs Model-Native: Head-to-Head on Reliability and Cost

The Agent Control Loop Is Moving Inside the Model: Old vs New Diagram

MCP vs A2A: When To Use Which Protocol (2026 Decision Guide)