How to Deploy Claude Cowork: A Walkthrough

Step-by-step Claude Cowork deployment: build a read-only MCP connector, write and test a skill, bundle a role plugin, and pilot safely with one team.

You've been handed the project: "Get Claude Cowork working for the operations team by end of quarter." The architecture diagrams are interesting, but right now you need a sequence of concrete moves — what to build first, what to test, and how to avoid the rollout that quietly grants every analyst write access to the production ERP. This is that walkthrough. We'll go from an empty workspace to a scoped, audited deployment for one team, in the order an engineer would actually do it, with the commands and config shapes you'll touch along the way.

Key takeaways

  • Build from the inside out: connector first, then a skill that uses it, then a plugin that bundles them, then role assignment.
  • Start every connector read-only; earn write access deliberately and per-action.
  • Test each layer in isolation before composing — a broken connector is invisible once it's buried under a plugin.
  • Pilot with one team and connector-level audit logging on before you publish to the org.
  • Write skill descriptions for the model's trigger logic, phrased the way users actually ask. That one line decides whether the skill ever fires.

Step 1: Stand up and verify one MCP connector

Everything Cowork does to your systems flows through an MCP connector, so it is the first thing to build and the first thing to verify in isolation. Start with the single system the operations team needs most — say, your ticketing platform. Define the connector with the narrowest credentials that still let it read what it needs. Before wiring it into anything else, confirm it lists its tools correctly.

{
  "name": "ops-tickets",
  "type": "mcp",
  "transport": "http",
  "url": "https://mcp.internal.example.com/tickets",
  "auth": { "mode": "oauth", "scopes": ["tickets:read"] },
  "tools_allow": ["search_tickets", "get_ticket"]
}

Note the deliberate choices: a read-only scope and an explicit tools_allow list. Even if the server exposes a close_ticket tool, the connector won't surface it. This is your first and cheapest safety control — exclude write tools until a real workflow demands them.
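As a rough illustration of what the allowlist buys you, the sketch below applies the connector definition above to a made-up list of server tools. The `SERVER_TOOLS` list and both helper functions are assumptions for illustration, not part of any Cowork or MCP API, and treating a scope as read-only because it ends in `:read` is a naming convention assumed here, not something the protocol guarantees.

```python
import json

# Connector definition copied from the example above.
CONNECTOR = json.loads("""
{
  "name": "ops-tickets",
  "type": "mcp",
  "transport": "http",
  "url": "https://mcp.internal.example.com/tickets",
  "auth": { "mode": "oauth", "scopes": ["tickets:read"] },
  "tools_allow": ["search_tickets", "get_ticket"]
}
""")

# Hypothetical: tool names the server advertises, including a write tool.
SERVER_TOOLS = ["search_tickets", "get_ticket", "close_ticket"]


def surfaced_tools(connector, server_tools):
    """Return only the server tools named in the connector's allowlist."""
    allowed = set(connector.get("tools_allow", []))
    return [tool for tool in server_tools if tool in allowed]


def non_read_scopes(connector):
    """Return any OAuth scope that does not end in ':read' (assumed convention)."""
    scopes = connector.get("auth", {}).get("scopes", [])
    return [scope for scope in scopes if not scope.endswith(":read")]


print(surfaced_tools(CONNECTOR, SERVER_TOOLS))  # ['search_tickets', 'get_ticket']
print(non_read_scopes(CONNECTOR))               # []
```

A check like this could run in review before any connector change is merged, so a widened scope or a new write tool never slips in unnoticed.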

Step 2: Confirm the connector answers before going further

Do not build a skill on top of an unverified connector. Drive one round-trip and read the structured result yourself. The point is to see the actual shape of the data the model will receive, because that shape is what your skill instructions will have to reference.

cowork connector test ops-tickets \
  --tool search_tickets \
  --args '{"status":"open","queue":"operations","limit":3}'

If this returns three well-formed tickets, the connector works end to end. If it returns an auth error, fix it now. If it returns an empty array, first confirm the queue really has open tickets, then check the scope and the filter arguments. Either failure will be far harder to diagnose once a plugin and a skill are layered on top and it shows up as "the agent just says it can't find anything."

Spend a minute reading the field names and value types in that response. Those exact names are what your skill instructions will reference, and what the model will reason over. If the payload nests the severity under a confusing key or returns dates as opaque integers, this is the cheapest moment to normalize it — either at the connector or by documenting the quirk in the skill — rather than after three skills have hard-coded the awkward shape. A connector you understand at the field level is one you can build on confidently.
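One way to read a payload at the field level is to print every leaf path with its type, then normalize the awkward parts once. The sketch below assumes a made-up ticket shape (`meta.sev` for severity, `created` as epoch seconds); your connector's real field names will differ, and both helpers are illustrative only.

```python
from datetime import datetime, timezone

# Hypothetical raw ticket: severity nested under an unclear key,
# creation time as epoch seconds.
RAW_TICKET = {"id": "T-1", "meta": {"sev": 2}, "created": 1700000000}


def describe_shape(payload, prefix=""):
    """List each leaf field as 'dotted.path: type name'."""
    lines = []
    for key, value in payload.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            lines.extend(describe_shape(value, prefix=f"{path}."))
        else:
            lines.append(f"{path}: {type(value).__name__}")
    return lines


def normalize(ticket):
    """Flatten the severity and convert the epoch timestamp to ISO 8601 UTC."""
    created = datetime.fromtimestamp(ticket["created"], tz=timezone.utc)
    return {
        "id": ticket["id"],
        "severity": ticket["meta"]["sev"],
        "created_at": created.isoformat(),
    }


print(describe_shape(RAW_TICKET))  # ['id: str', 'meta.sev: int', 'created: int']
print(normalize(RAW_TICKET))
```

Whether the normalization lives in the connector or is only documented in the skill is a judgment call. The useful part is deciding it once, before several skills depend on the raw shape.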

flowchart TD
  A["Build connector read-only"] --> B["Test connector in isolation"]
  B -->|Fails| A
  B -->|Works| C["Write skill that uses it"]
  C --> D["Dry-run skill on sample task"]
  D -->|Wrong tool fires| C
  D -->|Good| E["Bundle into role plugin"]
  E --> F["Pilot with one team + audit log"]
  F -->|Issues| C
  F -->|Clean| G["Publish to org"]

Step 3: Write the first skill

A skill is a folder of instructions Cowork loads when a task matches its description. For the ops team, start with something concrete like a triage procedure. The folder holds a description, the step-by-step instructions, and any reference data. The single most important line is the description, because the model uses it to decide whether to load the skill at all.

# SKILL.md (operations-triage)
---
name: operations-triage
description: Triage open operations tickets, group by severity, and draft a daily summary for the ops lead.
---

When asked to triage operations tickets:
1. Call search_tickets with status=open, queue=operations.
2. Group results by severity; flag any breaching SLA.
3. Draft a summary: counts per severity, the top 3 at-risk tickets, and recommended next action.
Never close or modify a ticket; this skill is read and summarize only.

The last line matters as much as the procedure: it tells the agent its boundaries in plain language, reinforcing the read-only posture you set at the connector. Defense in depth means stating the limit at both the tool layer and the instruction layer.
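The three numbered steps of the procedure can also be sketched in plain Python, which gives you a reference answer to compare against the agent's draft. The ticket fields used here (`severity`, `sla_minutes_left`) are hypothetical, and "at risk" is assumed to mean "least SLA time remaining".

```python
from collections import Counter

# Hypothetical ticket fields; real names come from your connector's payload.
TICKETS = [
    {"id": "T-1", "severity": "high", "sla_minutes_left": -15},
    {"id": "T-2", "severity": "low", "sla_minutes_left": 600},
    {"id": "T-3", "severity": "high", "sla_minutes_left": 45},
    {"id": "T-4", "severity": "medium", "sla_minutes_left": 120},
]


def triage(tickets, top_n=3):
    """Count tickets per severity, flag SLA breaches, and rank the most at-risk."""
    counts = Counter(t["severity"] for t in tickets)
    # A negative remaining time is treated as an SLA breach.
    breaching = [t["id"] for t in tickets if t["sla_minutes_left"] < 0]
    # Least time remaining first.
    at_risk = sorted(tickets, key=lambda t: t["sla_minutes_left"])[:top_n]
    return {
        "counts": dict(counts),
        "breaching": breaching,
        "top_at_risk": [t["id"] for t in at_risk],
    }


print(triage(TICKETS))
```

Note that the function only reads and summarizes, mirroring the skill's stated boundary.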

Step 4: Dry-run the skill before bundling

Run the skill against a representative request and watch which tool it calls and how it reasons. You are checking two things: that the description actually triggers the skill, and that the agent uses the connector the way you intended. If the skill doesn't fire, the description is too vague — rewrite it with the words a user would actually use. If it fires but calls the wrong tool, tighten the instructions.

This dry-run step is where most rollout time should go. A skill that triggers reliably and stays in its lane is worth ten skills that look good on paper but misfire under real phrasing. Iterate here while the blast radius is one test session.
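To keep iteration honest, it can help to record each dry run and tally the two failure modes separately. The sketch below assumes you note observations by hand in a simple list; the record format, the sample prompts, and the `search_wiki` tool are all invented for illustration.

```python
EXPECTED_SKILL = "operations-triage"
EXPECTED_TOOLS = {"search_tickets", "get_ticket"}

# Hypothetical hand-recorded observations: which skill loaded and which
# tools were called for each test phrasing.
DRY_RUNS = [
    {"prompt": "triage today's open tickets",
     "skill": "operations-triage", "tools": ["search_tickets"]},
    {"prompt": "what's on fire in ops?",
     "skill": None, "tools": []},
    {"prompt": "summarize the ops queue",
     "skill": "operations-triage", "tools": ["search_tickets", "get_ticket"]},
    {"prompt": "daily ops ticket summary",
     "skill": "operations-triage", "tools": ["search_wiki"]},
]


def review(runs):
    """Split dry runs into missed triggers and off-script tool calls."""
    missed = [r["prompt"] for r in runs if r["skill"] != EXPECTED_SKILL]
    off_script = [r["prompt"] for r in runs if set(r["tools"]) - EXPECTED_TOOLS]
    fired = len(runs) - len(missed)
    return {
        "trigger_rate": fired / len(runs),
        "missed": missed,
        "off_script": off_script,
    }


print(review(DRY_RUNS))
```

Prompts in `missed` point at the description; prompts in `off_script` point at the instructions.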

Step 5: Bundle into a role plugin and assign

Now compose the verified pieces into a plugin scoped to the operations role. The plugin bundles the connector and the skill so that assigning it to a user grants a coherent capability rather than a set of disconnected settings. Assign it to the ops role only — not the whole org — and keep finance and legal on their own future plugins.

cowork plugin create ops-daily \
  --skill operations-triage \
  --connector ops-tickets \
  --assign-role operations

At this point a member of the operations team can ask Cowork to "triage today's open tickets" and the whole chain executes without that user touching any configuration: the plugin grants the capability, the skill supplies the procedure, and the connector fetches the data.
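Before assigning a bundle, one cheap consistency check is to confirm that every tool the skill tells the agent to call is actually on the connector's allowlist. The sketch below assumes skill instructions use the phrasing "Call <tool_name>", as the triage skill above does; the helper names are hypothetical and the regex is a heuristic, not a parser for any official skill format.

```python
import re

# Allowlist from the ops-tickets connector definition.
TOOLS_ALLOW = ["search_tickets", "get_ticket"]

# Instruction body of the operations-triage skill.
SKILL_BODY = """\
When asked to triage operations tickets:
1. Call search_tickets with status=open, queue=operations.
2. Group results by severity; flag any breaching SLA.
3. Draft a summary: counts per severity, the top 3 at-risk tickets, and recommended next action.
Never close or modify a ticket; this skill is read and summarize only.
"""


def called_tools(skill_body):
    """Find tool names that follow the word 'Call' in the instructions."""
    return re.findall(r"\bCall\s+([a-z_][a-z0-9_]*)", skill_body)


def missing_tools(skill_body, tools_allow):
    """Return tools the skill references that the connector does not surface."""
    allowed = set(tools_allow)
    return [tool for tool in called_tools(skill_body) if tool not in allowed]


print(called_tools(SKILL_BODY))                # ['search_tickets']
print(missing_tools(SKILL_BODY, TOOLS_ALLOW))  # []
```

An empty result means the skill and connector agree; a non-empty one means the skill would ask for a tool the agent never sees.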

Step 6: Pilot with audit logging on

Before publishing org-wide, run a one-week pilot with a handful of ops users and connector-level audit logging enabled. Watch the logs for two things: tool calls you didn't expect, and tasks where the agent gave up. The first tells you a scope is too broad or a skill is misfiring; the second tells you a procedure has a gap. Fix both, then widen the rollout.
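The two signals described above can be pulled out of a log export with a few lines of Python. The sketch assumes the audit log is available as JSON Lines with `user`, `event`, `tool`, and `task` fields; that format, the event names, and the sample entries are assumptions, so adapt them to whatever your logging actually emits.

```python
import json

ALLOWED_TOOLS = {"search_tickets", "get_ticket"}

# Hypothetical audit log export, one JSON object per line.
AUDIT_LOG = """\
{"user": "ana", "event": "tool_call", "tool": "search_tickets"}
{"user": "ben", "event": "tool_call", "tool": "get_ticket"}
{"user": "ben", "event": "task_abandoned", "task": "triage weekend backlog"}
{"user": "cy", "event": "tool_call", "tool": "export_tickets"}
"""


def scan(log_text):
    """Return (unexpected tool calls, abandoned tasks) found in the log."""
    unexpected, abandoned = [], []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines in the export
        entry = json.loads(line)
        if entry["event"] == "tool_call" and entry["tool"] not in ALLOWED_TOOLS:
            unexpected.append((entry["user"], entry["tool"]))
        elif entry["event"] == "task_abandoned":
            abandoned.append((entry["user"], entry["task"]))
    return unexpected, abandoned


unexpected, abandoned = scan(AUDIT_LOG)
print(unexpected)  # [('cy', 'export_tickets')]
print(abandoned)   # [('ben', 'triage weekend backlog')]
```

Reviewing both lists daily during the pilot week keeps the feedback loop short while the user group is still small.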

Common pitfalls

  • Granting write access on day one. Start read-only. A summarize-and-recommend agent delivers most of the value at a fraction of the risk, and you can add scoped write actions later once you trust the workflow.
  • Skipping isolated connector tests. If you only ever test through the full plugin, a connector failure looks like a model failure, and you'll waste hours blaming the prompt.
  • Vague skill descriptions. The description is the trigger. "Helps with tickets" is too generic to match reliably; "Triage open operations tickets and draft a daily summary" names the task the way a user would ask for it.
  • Org-wide first launch. Pilot one team. A misconfigured connector discovered by five users is an incident; discovered by five thousand it's a headline.
  • No audit trail at launch. Turn on connector logging before the pilot, not after the first surprise.

Read-only pilot vs. full write rollout

| Aspect | Read-only pilot | Full write rollout |
|---|---|---|
| Connector scope | Read tools only | Scoped read + write |
| Risk | Low | Requires review per action |
| Time to value | Days | After trust is established |
| Good first target | Summaries, triage, drafts | Updates, approvals, closes |

Frequently asked questions

What should I deploy first?

One read-only MCP connector to the single most-used system, verified in isolation. Connectors are where everything touches your data, so a working, narrowly scoped connector is the foundation for every skill and plugin built on top.

Why test connectors and skills separately?

Because once they're stacked inside a plugin, a failure in either looks identical to the user — "the agent can't do it." Isolated tests tell you exactly which layer broke and save hours of misdirected prompt debugging.

How do I make sure the right skill triggers?

Write the description in the words a real user would type. The model loads a skill based on its one-line description matching the task, so phrasing it like the actual request is what makes triggering reliable.

How small should the first pilot be?

One team, a handful of users, one week, with connector audit logging on. That's enough to surface misconfigurations and procedure gaps while the blast radius is tiny and fixes are cheap.


Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
