Claude Code walkthrough: a PM ships a booking app

A realistic week-by-week story of a non-technical PM shipping a real booking app with Claude Code in six weeks — from blank page to live users.

Abstract advice about agentic development only goes so far. To understand what it actually feels like for a non-technical PM to ship with Claude Code, it helps to follow one project from the blank page to live users. So let's walk through a realistic build: a small appointment-booking app for a regional service business that needs customers to self-schedule online, staff to manage availability, and automatic reminders to cut no-shows. It is exactly the kind of project that used to need a small engineering team and a quarter. We will compress it into six weeks, the way these projects really run.

The PM in this story has shipped products before and can read code at a basic level, but has never built software themselves. Everything below is the kind of decision and friction these projects actually contain — including the parts that go wrong, because the smooth-sailing version teaches nothing.

Week one: specification and the skeleton

The PM does not start by asking Claude Code to "build a booking app." They start by writing down behavior. Customers see available time slots and book one with their name and email. Double-booking must be impossible. Staff log in to set their weekly availability and see upcoming appointments. The system emails a confirmation and a reminder. Each of these becomes a precise spec, because the PM learned that ambiguity is where the agent invents things.

With the spec written, the PM asks Claude Code to scaffold the project: a web frontend, an API, and a database, with a recommendation on the stack. The agent proposes a conventional setup and explains each piece when asked. By the end of week one there is a running skeleton: pages that load, a database with tables for users, availability, and appointments, and no working features yet. Crucially, the PM has clicked through it locally and can name what every part does. The earlier investment in learning to read code pays off immediately.
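To make the skeleton concrete, here is a minimal sketch of what those three tables might look like. It assumes SQLite through Python's standard library, and every table and column name is hypothetical; the story does not say which stack or schema the agent actually proposed.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative assumptions,
# not the project's actual design.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    role TEXT NOT NULL CHECK (role IN ('staff', 'admin'))
);
CREATE TABLE availability (
    id INTEGER PRIMARY KEY,
    staff_id INTEGER NOT NULL REFERENCES users(id),
    weekday INTEGER NOT NULL CHECK (weekday BETWEEN 0 AND 6),
    start_time TEXT NOT NULL,  -- 'HH:MM', local to the business
    end_time TEXT NOT NULL
);
CREATE TABLE appointments (
    id INTEGER PRIMARY KEY,
    staff_id INTEGER NOT NULL REFERENCES users(id),
    starts_at TEXT NOT NULL,   -- ISO 8601 timestamp, stored in UTC
    customer_name TEXT NOT NULL,
    customer_email TEXT NOT NULL
);
"""


def create_skeleton_db(path=":memory:"):
    """Create the empty week-one database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def table_names(conn):
    """List the tables so the PM can check what exists."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

At this stage the schema only stores data. Nothing in it prevents two appointments for the same slot, which matches a skeleton with no working features.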

Weeks two and three: the core loop and the first real bug

This is where the app becomes real. The PM directs Claude Code to build the booking flow: show open slots, let a customer pick one, save the appointment, prevent double-booking. The double-booking requirement is the first genuinely tricky moment, and it is worth watching how it gets handled.

flowchart TD
  A["Customer picks a time slot"] --> B["API checks slot availability"]
  B --> C{"Slot still free?"}
  C -->|No| D["Show 'just taken' message"]
  C -->|Yes| E["Reserve slot in a transaction"]
  E --> F{"Two requests at once?"}
  F -->|Yes| G["DB constraint rejects the loser"]
  F -->|No| H["Confirm booking"]
  G --> D
  H --> I["Send confirmation email"]

The agent's first version checks whether a slot is free and then saves the booking. It works in testing. But the PM, having developed a verification instinct, asks the right question: "what happens if two customers book the same slot at the exact same second?" The agent acknowledges the race condition and rewrites the logic to use a database-level uniqueness constraint and a transaction, so the database itself refuses the second booking. This is the moment the project succeeds or fails, not because the PM knew how to fix a race condition, but because they knew to ask about it and could evaluate the answer. The agent supplied the fix; the PM supplied the judgment.
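The shape of that fix can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: it uses SQLite from Python's standard library, and the table, columns, and the `book_slot` helper are hypothetical names.

```python
import sqlite3


def open_db():
    """Create an in-memory database whose schema forbids double-booking."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE appointments (
            id INTEGER PRIMARY KEY,
            staff_id INTEGER NOT NULL,
            starts_at TEXT NOT NULL,
            customer_email TEXT NOT NULL,
            UNIQUE (staff_id, starts_at)  -- one booking per staff member per slot
        )
        """
    )
    return conn


def book_slot(conn, staff_id, starts_at, customer_email):
    """Return True if the booking was saved, False if the slot was already taken."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO appointments (staff_id, starts_at, customer_email) "
                "VALUES (?, ?, ?)",
                (staff_id, starts_at, customer_email),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the insert: someone else holds the slot.
        return False
```

The design point is that there is no separate "is it free?" check to race against. The insert either succeeds or the database rejects it, and the losing request gets the "just taken" message.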

By the end of week three the core loop works and is tested. The PM has Claude Code write tests for the booking path, including the concurrent-booking case, and runs them. Staff login and availability management land in the same window, slightly ahead of plan because the patterns are now familiar.
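A concurrent-booking test might look roughly like the sketch below. It assumes SQLite with a temporary file database and uses threads to stand in for simultaneous customers; the function names and single-table schema are hypothetical simplifications, and a real test suite would exercise the app's own API instead.

```python
import os
import sqlite3
import tempfile
import threading


def try_book(db_path, starts_at, email, barrier, results):
    """One simulated customer: wait at the barrier, then try to insert a booking."""
    conn = sqlite3.connect(db_path, timeout=10)  # one connection per thread
    try:
        barrier.wait()  # release every thread at the same moment
        with conn:      # commit on success, roll back on error
            conn.execute(
                "INSERT INTO appointments (starts_at, customer_email) VALUES (?, ?)",
                (starts_at, email),
            )
        results.append("booked")
    except sqlite3.IntegrityError:
        results.append("taken")
    finally:
        conn.close()


def run_concurrent_booking_test(n_customers=8):
    """Race n_customers for the same slot and return each outcome."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE appointments ("
            "id INTEGER PRIMARY KEY, "
            "starts_at TEXT NOT NULL UNIQUE, "
            "customer_email TEXT NOT NULL)"
        )
        setup.commit()
        setup.close()

        barrier = threading.Barrier(n_customers)
        results = []
        threads = [
            threading.Thread(
                target=try_book,
                args=(db_path, "2025-03-01T10:00:00Z",
                      f"customer{i}@example.com", barrier, results),
            )
            for i in range(n_customers)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return results
    finally:
        os.remove(db_path)
```

A passing run has exactly one "booked" outcome, and every other customer sees "taken". This is the check that casual click-through testing never performs.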

Week four: the unglamorous middle

Week four is reminders, error handling, and the parts users never thank you for. The PM wires up email sending through a transactional email service, with Claude Code explaining how to store the API key safely in a secret manager rather than in the code — a security reflex the PM has internalized. The reminder system needs a scheduled job, and here the PM applies risk discipline: they ask the agent how often it runs and confirm it cannot spiral into thousands of paid email sends. A spending cap goes on the email service before the feature ships.
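Both habits from this week fit in a short sketch: the API key is read from the environment (where a secret manager would place it), and the reminder job carries its own hard cap per run. The variable name `EMAIL_API_KEY`, the cap of 200, the 24-hour window, and the appointment dictionary shape are all assumptions made for illustration.

```python
import os
from datetime import datetime, timedelta, timezone

# Hypothetical in-app safety cap, separate from the provider-side spending cap.
MAX_REMINDERS_PER_RUN = 200


def load_email_api_key(env=os.environ):
    """Read the key from the environment, never from source code."""
    key = env.get("EMAIL_API_KEY")  # hypothetical variable name
    if not key:
        raise RuntimeError("EMAIL_API_KEY is not set")
    return key


def select_due_reminders(appointments, now, window=timedelta(hours=24),
                         cap=MAX_REMINDERS_PER_RUN):
    """Pick unreminded appointments starting within `window`, at most `cap` of them."""
    due = [
        appt for appt in appointments
        if not appt["reminded"] and now <= appt["starts_at"] <= now + window
    ]
    due.sort(key=lambda appt: appt["starts_at"])
    return due[:cap]


def run_reminder_job(appointments, send, now=None):
    """Send due reminders once each and return how many were sent."""
    now = now or datetime.now(timezone.utc)
    sent = 0
    for appt in select_due_reminders(appointments, now):
        send(appt["customer_email"], appt["starts_at"])
        appt["reminded"] = True  # mark it so the next run does not send again
        sent += 1
    return sent
```

Marking each appointment as reminded is what keeps a job that runs every few minutes from resending. The cap bounds the damage if that marking ever fails.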

Error handling is the other theme. What does a customer see if the email fails to send? What if they lose their connection mid-booking? The agent handles these gracefully when asked, but it is the PM's job to ask. Week four feels slow and slightly tedious, which is exactly right — this is the work that separates a demo from a product, and it is where many PM-led projects underinvest and pay for it later.
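One reasonable answer to the first question is that a failed email should not cost the customer their booking. The sketch below assumes a hypothetical `EmailError` raised by the email client and a simple in-memory retry queue; the messages and field names are illustrative.

```python
import logging

logger = logging.getLogger("booking")


class EmailError(Exception):
    """Raised by the (hypothetical) email client when a send fails."""


def confirm_booking(appointment, send_email, retry_queue):
    """Keep the saved booking even if the confirmation email fails to send."""
    try:
        send_email(appointment["customer_email"], appointment["starts_at"])
    except EmailError:
        # The booking is already saved; record the failure and retry later.
        logger.warning("Confirmation email failed for appointment %s",
                       appointment["id"])
        retry_queue.append(appointment["id"])
        return {
            "status": "confirmed",
            "email_sent": False,
            "message": "You're booked. We couldn't send your confirmation "
                       "email yet and will try again shortly.",
        }
    return {
        "status": "confirmed",
        "email_sent": True,
        "message": "You're booked. A confirmation email is on its way.",
    }
```

The customer always learns whether the booking itself succeeded, which is the part they care about. The email becomes a follow-up task the system owns.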

Weeks five and six: hardening and going live

Week five is hardening. The PM deliberately tries to break the app: booking with a malformed email, accessing the staff pages while logged out, changing IDs in URLs to peek at other appointments. Each probe surfaces something to fix, and each fix is small because the foundation is sound. A guardian engineer spends a few hours reviewing the security-sensitive code (authentication, data access, and payment handling if there is any) and flags two issues the agent and PM had missed. This review is the highest-leverage few hours in the whole project.
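The three probes map to checks that are small once written down. This sketch assumes appointments are held in a dictionary keyed by ID and that a logged-out visitor is represented as `None`; the function names and status-code choices are illustrative assumptions, not the project's actual handlers.

```python
import re

# Deliberately loose shape check; real validation is confirming delivery.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_plausible_email(value):
    """Reject obviously malformed addresses before saving a booking."""
    return bool(EMAIL_RE.fullmatch(value))


def get_appointment_for_staff(appointments, appointment_id, current_user):
    """Return (status_code, appointment) for a staff page request."""
    if current_user is None:
        return 401, None  # logged out: no access to staff pages
    appt = appointments.get(appointment_id)
    if appt is None or appt["staff_id"] != current_user["id"]:
        # Same answer for "missing" and "not yours", so changing the ID
        # in the URL reveals nothing about which appointments exist.
        return 404, None
    return 200, appt
```

The ownership comparison is the line that stops ID-guessing. Without it, being logged in would be enough to read anyone's appointments.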

Week six is deployment and launch. Claude Code walks the PM through hosting the app, configuring the production database, setting environment variables, and taking a first backup. The PM writes a one-page rollback playbook and rehearses it once on a copy. Then the app goes live to a small group of real customers, watched closely. There are a couple of small fixes in the first days — a confirmation email landing in spam, a timezone display quirk — handled in minutes because the loop is now second nature. Six weeks in, a non-technical PM has a live, tested, monitored booking app that real people use. Not a toy. A product.

Frequently asked questions

Is six weeks realistic for a non-technical person?

For an app of this scope — a focused tool with a clear core loop — yes, for a PM who invested in literacy first and kept scope disciplined. The timeline blows up when the PM cannot evaluate the agent's output or keeps expanding the feature list mid-build.

What was the highest-risk moment in this build?

The double-booking race condition, because it passes casual testing and only fails under real concurrent load. The project succeeded because the PM had developed the instinct to ask about it. That single question is the difference between a reliable app and one that corrupts data in production.

How much did the guardian engineer actually contribute?

A few hours, but disproportionately valuable hours. Reviewing security-sensitive code before launch caught issues the PM and agent had missed. You do not need a full engineering team, but for anything touching real user data, expert review of the risky parts before going live is well worth it.

What would have made this fail?

Skipping the verification habit and trusting the demo, underinvesting in week four's unglamorous error handling, or letting scope balloon. The agent rarely causes failure on its own; failure comes from the PM not directing and reviewing it well.

Bringing this end-to-end pattern to your phone lines

The same problem-to-shipped arc — spec, build, verify, launch — is how CallSphere deploys agents for voice and chat. Our assistants answer every call and message, use tools like booking systems mid-conversation, and schedule work 24/7. See a live example at callsphere.ai.


Source & attribution: This is an independent, original explainer inspired by Anthropic's coverage on the Claude blog. Claude, Claude Code, Claude Cowork, Claude Opus, and the Model Context Protocol are products and trademarks of Anthropic. CallSphere is not affiliated with or endorsed by Anthropic.
