---
title: "Migrating a Workflow to Claude Code Agents Without Breaking It"
description: "A hackathon-tested playbook for moving an existing workflow onto Claude Code agents — shadow runs, canary cutover, rollback, and parity evals."
canonical: https://callsphere.ai/blog/migrating-a-workflow-to-claude-code-agents-without-breaking-it
category: "Agentic AI"
tags: ["agentic ai", "claude", "claude code", "migration", "rollout", "shadow mode", "opus 4.8"]
author: "CallSphere Team"
published: 2026-04-20T12:32:44.000Z
updated: 2026-06-06T21:47:43.357Z
---

# Migrating a Workflow to Claude Code Agents Without Breaking It

> A hackathon-tested playbook for moving an existing workflow onto Claude Code agents — shadow runs, canary cutover, rollback, and parity evals.

The most ambitious project at the Built-with-Opus hackathon was not building an agent from scratch — it was replacing a workflow that already worked. A team took a creaky, partly-manual release process and rebuilt it as a Claude Code agent. The temptation was to flip the switch and bask in the speedup. The discipline was to not do that. They ran the new agent in the shadows for hours, compared its decisions against the old process, and only handed it the keys once it had proven it would not break the thing it replaced. That restraint is the whole subject of this post.

Migrating an existing workflow onto agents is a different problem from greenfield agent building. You already have a baseline that works, users who depend on it, and edge cases that the old system handles by accident. The goal is not just "build a good agent" — it is "replace the old thing without anyone getting hurt." That changes the strategy entirely, and the hackathon taught a clear sequence for doing it safely.

## Map the workflow before you automate it

The first mistake teams made was automating a workflow they did not fully understand. A process that has run for a year accumulates implicit rules — the human who did it knew to skip step three on Fridays, or to double-check a value that is usually fine. If you hand the literal happy path to an agent, those unwritten rules vanish and the edge cases bite.

So step one is to write the workflow down completely: every input, every decision point, every external system it touches, and crucially every exception the current owner handles by instinct. The team that succeeded interviewed the person who ran the old release process and captured the "oh, and sometimes you have to..." cases that never made it into any document. Those cases became both the agent's instructions and, later, its eval set. You cannot safely replace what you cannot fully describe.
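One way to make that write-up concrete is to store it as data, so the same exception cases can feed both the agent's instructions and the later eval set. The sketch below is illustrative only: `WorkflowMap`, `ExceptionCase`, the system names, and the decision label are hypothetical, and the Friday rule simply reuses the example from above.

```python
from dataclasses import dataclass, field


@dataclass
class ExceptionCase:
    """An unwritten rule the current owner applies by instinct."""
    description: str
    example_input: dict
    expected_decision: str


@dataclass
class WorkflowMap:
    """Everything the workflow touches, written down in one place."""
    inputs: list[str]
    decision_points: list[str]
    external_systems: list[str]
    exception_cases: list[ExceptionCase] = field(default_factory=list)

    def as_instructions(self) -> str:
        """Render the exception cases as plain-text rules for the agent."""
        return "\n".join(f"- {case.description}" for case in self.exception_cases)

    def as_eval_set(self) -> list[tuple[dict, str]]:
        """Reuse the same cases as (input, expected decision) pairs."""
        return [(c.example_input, c.expected_decision) for c in self.exception_cases]


# Hypothetical entries; a real map would come from interviewing the owner.
release_map = WorkflowMap(
    inputs=["release branch", "changelog"],
    decision_points=["run step three?", "double-check the version value?"],
    external_systems=["CI", "package registry"],
    exception_cases=[
        ExceptionCase(
            description="Skip step three on Fridays.",
            example_input={"weekday": "Friday"},
            expected_decision="skip_step_three",
        ),
    ],
)
```

Keeping instructions and eval cases derived from one list means a newly discovered exception only has to be recorded once.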

## Shadow mode: run the agent without letting it act

The single most valuable migration technique was the shadow run. You deploy the agent alongside the existing workflow, feed it the same real inputs, and let it produce its decisions — but you do not let it act on anything. The old process stays in charge. Then you compare: where did the agent agree with the human, and where did it diverge?

```mermaid
flowchart TD
  A["Existing workflow in production"] --> B["Agent runs in shadow on same inputs"]
  B --> C{"Agent output matches baseline?"}
  C -->|No| D["Investigate divergence, fix or document"]
  D --> B
  C -->|Yes, consistently| E["Canary: agent handles small % of real traffic"]
  E --> F{"Metrics & errors healthy?"}
  F -->|No| G["Roll back to old workflow"]
  F -->|Yes| H["Ramp up, retire old path"]
```

The diagram shows the safe-migration ladder the team climbed. Shadow runs surface divergences while they are harmless. Every disagreement between the agent and the baseline was useful: either the agent was wrong and needed fixing or, surprisingly often, the agent was right and had exposed a flaw in the old process. You do not advance up the ladder until shadow output matches the baseline consistently across real inputs, including the weird ones. Only then does the agent touch real traffic, and even then only a sliver of it.
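A shadow run can be sketched as a small comparison harness. This is a minimal illustration, not the team's actual tooling: `shadow_run`, `Divergence`, and the two stand-in functions are hypothetical, and decisions are assumed to be simple strings that can be compared for equality.

```python
from dataclasses import dataclass
from typing import Callable

Decision = str


@dataclass
class Divergence:
    case_input: dict
    baseline: Decision
    agent: Decision


def shadow_run(
    inputs: list[dict],
    baseline: Callable[[dict], Decision],
    agent: Callable[[dict], Decision],
) -> tuple[list[Decision], list[Divergence]]:
    """Act on the baseline only; record where the agent would have differed."""
    acted_on: list[Decision] = []
    divergences: list[Divergence] = []
    for case in inputs:
        baseline_decision = baseline(case)
        acted_on.append(baseline_decision)  # the old process stays in charge
        try:
            agent_decision = agent(case)
        except Exception as exc:  # a shadow failure must never affect production
            agent_decision = f"error: {exc}"
        if agent_decision != baseline_decision:
            divergences.append(Divergence(case, baseline_decision, agent_decision))
    return acted_on, divergences


# Hypothetical stand-ins for the old process and the new agent.
def old_process(case: dict) -> Decision:
    return "hold" if case["weekday"] == "Friday" else "release"


def new_agent(case: dict) -> Decision:
    return "release"  # has not learned the Friday rule yet


acted, diffs = shadow_run(
    [{"weekday": "Monday"}, {"weekday": "Friday"}], old_process, new_agent
)
```

Here `diffs` holds the single Friday case, which is the kind of divergence to investigate before moving to a canary.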

## Incremental cutover, never a big bang

The cardinal rule of safe migration is to never replace everything at once. After shadow mode proved parity, the team moved to a canary: the agent took over a small percentage of real cases while the old workflow handled the rest. They watched error rates, latency, and outcome quality on the canary slice. If anything looked wrong, the blast radius was tiny and the rollback was instant.
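One common way to carve out a canary slice is to hash a stable case identifier into a bucket. The post does not say how the team split traffic, so treat this as an assumed approach; `in_canary`, `route`, and the `case_id` parameter are hypothetical names.

```python
import hashlib


def in_canary(case_id: str, canary_percent: int) -> bool:
    """Deterministically place a case in the canary slice.

    Hashing the ID (rather than picking at random) keeps each case on the
    same path across retries, which makes canary metrics easier to read.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


def route(case_id: str, canary_percent: int) -> str:
    """Return which path should handle this case."""
    return "agent" if in_canary(case_id, canary_percent) else "old_workflow"
```

Because the bucket for a given ID never changes, raising the percentage only adds cases to the agent's slice; no case that was already on the agent path moves back to the old workflow.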

From there it is a ramp, not a leap: grow the agent's share as confidence grows (ten percent, then half, then most) and retire the old path only when the new one has earned it across the full range of real cases. Each step is reversible. The teams that tried to skip straight to a full cutover hit edge cases they had not seen in testing, and without a graceful fallback an edge case becomes an outage. Incremental cutover turns those same edge cases into minor, contained learning moments.
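The ramp itself can be expressed as a tiny state transition: step up when the current slice is healthy, drop to zero when it is not. The step values and the single error-rate threshold below are illustrative assumptions, and `next_share` is a hypothetical helper; a real ramp would likely watch latency and outcome quality as well.

```python
# Percent of traffic handled by the agent at each ramp step (illustrative values).
RAMP_STEPS = [1, 10, 50, 90, 100]


def next_share(current: int, error_rate: float, max_error_rate: float) -> int:
    """Return the agent's next traffic share.

    Unhealthy metrics send all traffic back to the old workflow (share 0).
    Healthy metrics advance exactly one step, never more.
    """
    if error_rate > max_error_rate:
        return 0
    for step in RAMP_STEPS:
        if step > current:
            return step
    return current  # already at the top of the ramp
```

For example, `next_share(10, 0.01, 0.02)` moves the agent from ten percent to half, while `next_share(50, 0.05, 0.02)` returns `0` and hands everything back to the old path.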

## Keep a rollback that actually works

A rollback plan you have never tested is a hope, not a plan. The migration team's rule was that the old workflow stayed fully operational and one switch away for the entire ramp. Cutting over to an agent is not a demolition; it is keeping the old bridge standing until the new one carries full traffic safely. They practiced the rollback before they needed it, so that when a canary metric dipped, flipping back took seconds and nobody had to improvise under pressure.

This matters more for agents than for ordinary software because agent behavior can shift in subtle ways: a model update, a changed tool response, or an unexpected input can change the behavior of an agent that was fine yesterday. A tested, instant rollback means those surprises cost you minutes, not a customer-facing incident. Keep the old path warm until the agent has handled a full cycle of real-world variety without needing to fall back on it.
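A "one switch away" rollback can be as simple as a flag in front of both paths, plus a drill that exercises the flip before it is needed. This is a sketch under assumptions: `CutoverSwitch` and `rollback_drill` are hypothetical names, and the automatic fallback on an agent error assumes the old path is safe to run even if the agent failed partway through a case.

```python
from typing import Callable

Handler = Callable[[dict], str]


class CutoverSwitch:
    """One flag decides which path handles a case; the old path stays wired in."""

    def __init__(self, old_path: Handler, agent_path: Handler) -> None:
        self.old_path = old_path
        self.agent_path = agent_path
        self.agent_enabled = False

    def enable_agent(self) -> None:
        self.agent_enabled = True

    def roll_back(self) -> None:
        self.agent_enabled = False

    def handle(self, case: dict) -> str:
        if not self.agent_enabled:
            return self.old_path(case)
        try:
            return self.agent_path(case)
        except Exception:
            # Assumed policy: an agent failure trips the switch and the old
            # path handles this case and every case after it.
            self.roll_back()
            return self.old_path(case)


def rollback_drill(switch: CutoverSwitch, sample_case: dict) -> bool:
    """Practice the rollback: after flipping back, the old path must answer."""
    switch.enable_agent()
    switch.roll_back()
    return switch.handle(sample_case) == switch.old_path(sample_case)
```

Running `rollback_drill` on a schedule during the ramp is one way to keep the rollback a tested plan rather than a hope.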

## Prove parity with the eval set you built

The exception cases you captured while mapping the workflow do double duty: they become the eval set that proves the agent matches the old system. Before each step up the ramp, run the agent against that set and confirm it handles every known case — especially the gnarly exceptions the human used to catch by instinct. Parity is not "the agent works on the happy path." Parity is "the agent handles everything the old workflow handled, including the rare and the ugly."

This is where the migration connects back to evals: a migration without a parity eval is just a hopeful swap. The team treated each historical exception as a test case the agent had to pass, and the moment it could pass all of them consistently was the moment they trusted it with the next ramp step. The eval set was both the proof of readiness and the early-warning system if a later change regressed behavior.
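A parity gate along these lines can be sketched in a few lines. The names `parity_failures` and `ready_for_next_step` are hypothetical, decisions are assumed to be comparable strings, and the repeated runs are one assumed way to check that the agent passes "consistently" rather than once.

```python
from typing import Callable

EvalCase = tuple[dict, str]  # (input, decision the old workflow made)


def parity_failures(
    agent: Callable[[dict], str], eval_set: list[EvalCase], runs: int = 3
) -> list[EvalCase]:
    """Return every case the agent got wrong on any of `runs` attempts."""
    failures: list[EvalCase] = []
    for case_input, expected in eval_set:
        for _ in range(runs):
            if agent(case_input) != expected:
                failures.append((case_input, expected))
                break  # one miss is enough to fail the case
    return failures


def ready_for_next_step(
    agent: Callable[[dict], str], eval_set: list[EvalCase], runs: int = 3
) -> bool:
    """Gate a ramp step: a non-empty eval set and zero failures."""
    return bool(eval_set) and not parity_failures(agent, eval_set, runs)
```

An empty eval set deliberately fails the gate, since passing zero cases proves nothing about parity.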

## Decide what stays human

Finally, a safe migration is honest about what should not be fully automated yet. Some decisions in the old workflow carried enough risk — irreversible actions, high-dollar consequences, customer-facing communications — that the team chose to keep a human approving them even after the agent did the work. The agent drafted; a person confirmed. This is not a failure of the migration; it is a sensible boundary that let them ship the automation for ninety percent of the workflow without betting the risky ten percent on a system still earning trust.

Over time, as the agent built a track record on the lower-risk parts, the team could revisit those boundaries and automate more. The lesson was to migrate in trust increments: move the safe parts first, prove them, and let the agent earn its way into the consequential decisions rather than being handed them on day one. A migration that respects that ordering is one that ships and stays shipped.
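The "agent drafts, person confirms" boundary can be sketched as a gate keyed on risk tags. Everything here is illustrative: the tag names mirror the risk examples above, while `Draft`, `execute`, and the `approve` and `run` callbacks are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed risk categories, taken from the examples in the text.
REQUIRES_APPROVAL = {"irreversible", "high_dollar", "customer_facing"}


@dataclass
class Draft:
    """An action the agent has prepared but not yet carried out."""
    action: str
    risk_tags: set[str]


def execute(
    draft: Draft,
    approve: Callable[[Draft], bool],
    run: Callable[[str], None],
) -> str:
    """Run low-risk drafts directly; route risky ones through a person first."""
    if draft.risk_tags & REQUIRES_APPROVAL:
        if not approve(draft):
            return "rejected"
        run(draft.action)
        return "approved_and_run"
    run(draft.action)
    return "auto_run"
```

Revisiting the boundary later then amounts to removing a tag from `REQUIRES_APPROVAL` once the agent has a track record in that category.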

## Frequently asked questions

### What is shadow mode in an agent migration?

Shadow mode runs the new agent alongside the existing workflow on the same real inputs but does not let it act — the old process stays in charge. You compare the agent's decisions against the baseline to find divergences while they are harmless, fixing the agent or documenting the gap before it ever touches real traffic.

### How do I migrate a workflow to an agent safely?

Map the full workflow including unwritten exceptions, run the agent in shadow mode to prove parity, then cut over incrementally with a canary handling a small percentage of real cases. Watch metrics at each step, keep a tested rollback ready, and ramp the agent's share only as confidence grows. Never do a big-bang replacement.

### Why keep the old workflow running during migration?

Because agent behavior can shift subtly with a model update, a changed tool response, or an unexpected input. Keeping the old path fully operational and one switch away gives you a tested, instant rollback, so a surprise costs minutes instead of becoming a customer-facing incident. Retire the old path only after the agent handles a full cycle of real variety.

### Should every step of a migrated workflow be automated?

No. Keep a human approving the highest-risk decisions — irreversible actions, high-dollar consequences, sensitive customer communications — even after the agent does the work. Migrate in trust increments: automate the safe parts first, prove them, and let the agent earn its way into consequential decisions over time.

## Bringing agentic AI to your phone lines

Moving live call handling onto an agent demands the same care — shadow runs, canaries, and instant rollback. CallSphere migrates phone and chat workflows onto **voice and chat** agents incrementally, proving parity before the agent answers every call and books work on its own. See it live at [callsphere.ai](https://callsphere.ai).

---

*Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.*

