By Sagar Shankaran, Founder of CallSphere
Anthropic acquires Vercept, the Seattle-based AI startup behind the Vy desktop agent, to push Claude's computer use from under 15% to 72.5% on the OSWorld benchmark.
Key takeaways
Anthropic announced the acquisition of Seattle AI startup Vercept on February 25, 2026, folding its desktop "computer use" technology and team into Claude as the race to build AI agents that can operate software intensifies.
Founded by alumni of the Allen Institute for AI, Vercept had created tools for complex agentic tasks, including its product Vy — a computer-use agent in the cloud that could operate a remote Apple MacBook. The startup had raised $16 million in a June 2025 seed round led by Fifty Years, with investors including Eric Schmidt and Jeff Dean.
Vercept's nine-person team, including co-founders Kiana Ehsani, Luca Weihs, and Ross Girshick, will join Anthropic after winding down external products. Vy will shut down in 30 days as part of the transition.
flowchart TD
HUB(("Claude's Computer Use<br/>Gets a Boost"))
HUB --> L0["About Vercept"]
style L0 fill:#e0e7ff,stroke:#6366f1,color:#1e293b
HUB --> L1["The Team"]
style L1 fill:#e0e7ff,stroke:#6366f1,color:#1e293b
HUB --> L2["Impact on Claude"]
style L2 fill:#e0e7ff,stroke:#6366f1,color:#1e293b
HUB --> L3["Strategic Context"]
style L3 fill:#e0e7ff,stroke:#6366f1,color:#1e293b
style HUB fill:#4f46e5,stroke:#4338ca,color:#fff
flowchart LR
IN(["Input prompt"])
subgraph PRE["Pre processing"]
TOK["Tokenize"]
EMB["Embed"]
end
subgraph CORE["Model Core"]
ATTN["Self attention layers"]
MLP["Feed forward layers"]
end
subgraph POST["Post processing"]
SAMP["Sampling"]
DETOK["Detokenize"]
end
OUT(["Generated text"])
IN --> TOK --> EMB --> ATTN --> MLP --> SAMP --> DETOK --> OUT
style IN fill:#f1f5f9,stroke:#64748b,color:#0f172a
style CORE fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
style OUT fill:#059669,stroke:#047857,color:#fff
The acquisition has already contributed to Claude's dramatic computer use improvement: on OSWorld, a widely used evaluation for AI computer use, Claude's scores went from under 15% in late 2024 to 72.5% today, approaching human-level performance on tasks like navigating complex spreadsheets and completing web forms.
This is Anthropic's second acquisition, following its purchase of the Bun runtime in December 2025. Both acquisitions signal Anthropic's aggressive push into agentic AI: systems that don't just generate text but actually operate computers and write software.
Source: Anthropic | TechCrunch | GeekWire | SiliconANGLE
Most coverage of Anthropic's acquisition of Vercept stops at the press release. The more interesting part is the implementation cost: what changes for a team running 37 agents and 90+ tools in production? The CallSphere stack treats announcements as input to an evals queue, not a product roadmap. Production agents stay pinned; new releases earn their slot only after a regression suite confirms cost, latency, and tool-call reliability move the right way.
Most AI news is noise. A new benchmark score, a leaderboard reshuffle, a leaked memo: none of it changes whether your AI receptionist books appointments without dropping the call.
The handful of things that do move production AI voice and chat are concrete: realtime API stability (does the WebSocket survive 5+ minutes without a stall?), language coverage (does it handle 57+ languages with usable accents, or is English the only first-class citizen?), tool-use reliability (does the model actually call the right function with the right argument types under load?), multi-agent handoffs (do specialist agents receive structured context, or just transcripts?), and latency under load (is p95 first-token latency under 800ms when 200 concurrent calls hit the same endpoint?). The CallSphere rule on news is: if it doesn't move at least one of those five numbers in a measurable eval, it's a blog post, not a product change.
What to track: provider changelogs for realtime endpoints, tool-call schema changes, language-add announcements, and any deprecation that pins your stack to a sunset date. What to ignore: leaderboard wins on tasks that don't map to your call flow, "agentic" benchmarks that don't measure tool latency, and demos that work because the prompt was hand-tuned for the demo.
The teams that ship fastest treat AI news the same way ops teams treat CVE feeds: read everything, act on the small fraction that touches your runtime, archive the rest.
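To make the latency criterion concrete, here is a minimal sketch of a p95 first-token check against the 800ms figure quoted above. The function names, the nearest-rank percentile method, and the sample data are illustrative assumptions, not CallSphere's actual tooling.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    samples at or below it. `pct` is an integer from 1 to 100."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based nearest rank; the integer product keeps the division exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_budget(first_token_ms, budget_ms=800.0):
    """True when p95 first-token latency is under the budget.
    The 800 ms default mirrors the figure in the text; tune it per stack."""
    return percentile(first_token_ms, 95) < budget_ms


if __name__ == "__main__":
    # Hypothetical first-token latencies (ms) from a load-test run.
    run = [420, 510, 390, 760, 640, 820, 455, 530, 610, 480]
    print("p95:", percentile(run, 95), "ms")
    print("within budget:", meets_latency_budget(run))
```

With only a handful of samples the p95 is effectively the maximum, so a check like this is only meaningful on a run large enough to represent the concurrent load described above.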
Q: How does Anthropic's acquisition of Vercept change anything for a production AI voice stack?
A: Most of the time it doesn't, and that's the right starting assumption. The relevant test is whether it improves at least one of: p95 first-token latency, tool-call argument accuracy on noisy inputs, multi-turn handoff stability, or per-session cost. Setup takes 3-5 business days. Pricing is $149 / $499 / $1,499. There's a 14-day trial with no credit card required.
Q: What eval gate would capabilities from the Vercept acquisition have to pass at CallSphere?
A: The eval gate is unsentimental — a regression suite that simulates real call traffic (noisy ASR, partial inputs, tool-call timeouts) measures four numbers, and a candidate has to win on three of four without losing badly on the fourth. Anything else is treated as a blog post, not a stack change.
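The "win on three of four without losing badly on the fourth" rule can be sketched as a small gate function. This is an illustration under stated assumptions: the metric names follow the four numbers listed in the first answer, while the 10% threshold for "losing badly", the `Metric` class, and the sample values are hypothetical and not taken from CallSphere's suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    baseline: float          # pinned production value (assumed non-zero)
    candidate: float         # value measured for the new release
    higher_is_better: bool

    def relative_gain(self) -> float:
        """Signed change as a fraction of baseline; positive means better."""
        delta = (self.candidate - self.baseline) / self.baseline
        return delta if self.higher_is_better else -delta


def passes_gate(metrics, min_wins=3, max_regression=0.10):
    """Pass when at least `min_wins` metrics improve and no metric regresses
    by more than `max_regression` (an assumed cutoff for "losing badly")."""
    gains = [m.relative_gain() for m in metrics]
    wins = sum(1 for g in gains if g > 0)
    return wins >= min_wins and min(gains) >= -max_regression


if __name__ == "__main__":
    # Hypothetical regression-suite results for one candidate release.
    results = [
        Metric("p95_first_token_ms", 800.0, 700.0, higher_is_better=False),
        Metric("tool_call_arg_accuracy", 0.90, 0.93, higher_is_better=True),
        Metric("handoff_stability", 0.95, 0.96, higher_is_better=True),
        Metric("cost_per_session_usd", 0.100, 0.105, higher_is_better=False),
    ]
    print("promote candidate:", passes_gate(results))
```

Using relative rather than absolute change lets one threshold apply across metrics with different units, at the cost of being sensitive to very small baselines.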
Q: Where would capabilities from the Vercept acquisition land first in a CallSphere deployment?
A: In a CallSphere deployment, new model and API capabilities land first in the post-call analytics pipeline (lower stakes, async, easy to roll back) and only later in the live realtime path. Today the verticals most likely to absorb new capability first are After-Hours Escalation and IT Helpdesk, which already run the largest share of production traffic.
Want to see IT helpdesk agents handle real traffic? Walk through https://urackit.callsphere.tech or grab 20 minutes with the founder: https://calendly.com/sagar-callsphere/new-meeting.
Written by
Sagar Shankaran · Founder, CallSphere
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.