By Sagar Shankaran, Founder of CallSphere
Most HIPAA-eligible AI deployments depend on one operational pattern: zero-data-retention endpoints with explicit training-data exclusion. Here is the contract language, the technical pattern, and the gotchas.
Key takeaways
"We don't train on your data" is the most common AI-vendor sales line in healthcare. The HIPAA-defensible version of that promise is in writing, in the BAA, with a specific endpoint and a specific retention number.
flowchart TD
In[Patient interaction] --> MinNec{Minimum necessary?}
MinNec -->|yes| Process[AI process]
MinNec -->|no| Reject[Block + log]
Process --> Encrypt[(AES-256 at rest)]
Encrypt --> DB[(PostgreSQL)]
Process --> Audit[(Audit trail)]
DB --> Right[Right of access §164.524]
HIPAA does not use the word "training" but the rules cover the underlying activities directly. A use or disclosure of PHI for purposes other than treatment, payment, or operations requires authorization under 45 CFR 164.508 unless an exception applies. Using PHI to train a model that benefits other customers is not treatment, payment, or operations of the originating covered entity; it is an other purpose, requiring authorization. The Privacy Rule's minimum-necessary standard at 45 CFR 164.502(b) further constrains the data made available even within permitted purposes.
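The yes/no branch at the top of the flowchart can be sketched as a small gate that runs before any prompt is built. This is a minimal illustration, not a compliance control: the purpose names, the field names, and the `ALLOWED_FIELDS` allow-list are all hypothetical, and a real deployment would derive them from its own minimum-necessary policy.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("phi_gate")

# Hypothetical allow-list: the fields each purpose is permitted to see.
ALLOWED_FIELDS = {
    "appointment_scheduling": {"patient_name", "phone", "preferred_time"},
    "insurance_verification": {"patient_name", "date_of_birth", "member_id"},
}


@dataclass(frozen=True)
class GateResult:
    allowed: bool
    payload: dict
    reason: str


def minimum_necessary_gate(purpose: str, payload: dict) -> GateResult:
    """Pass the payload through only if every field is allowed for the purpose."""
    allowed = ALLOWED_FIELDS.get(purpose)
    if allowed is None:
        logger.warning("blocked: unknown purpose %r", purpose)
        return GateResult(False, {}, "unknown purpose")

    extra = sorted(set(payload) - allowed)
    if extra:
        # Log field names only, never field values, so the log holds no PHI.
        logger.warning("blocked: purpose %r sent extra fields %s", purpose, extra)
        return GateResult(False, {}, f"fields beyond minimum necessary: {extra}")

    return GateResult(True, dict(payload), "ok")
```

Blocking the whole request, rather than silently dropping the extra fields, mirrors the "Block + log" branch of the diagram and makes over-collection visible to whoever reviews the log.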
Under 45 CFR 164.504(e)(2), the BAA must "establish the permitted and required uses and disclosures" of PHI by the business associate. Silence on training is not permission, but some vendors have read ambiguous "improvement of services" language as permission. The defensive pattern is explicit: the BAA names training, fine-tuning, and evaluation as not permitted, and a separate de-identified-data clause states which limited research uses, if any, the covered entity authorizes.
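One way to make that review repeatable is a short checklist run against each BAA draft. The sketch below is an assumption-laden illustration, not legal tooling: `review_baa`, the required-prohibition set, and the list of ambiguous phrases are hypothetical names chosen for this example, and a substring match is no substitute for counsel reading the contract.

```python
# Uses that the BAA should name explicitly as not permitted (from the text above).
REQUIRED_PROHIBITIONS = frozenset({"training", "fine-tuning", "evaluation"})

# Phrases that are ambiguous enough to flag for striking or narrow definition.
AMBIGUOUS_PHRASES = ("improvement of services", "model improvement")


def review_baa(prohibited_uses, contract_text):
    """Return the prohibitions the draft is missing and any ambiguous phrases found.

    prohibited_uses: the uses a reviewer has confirmed the draft prohibits.
    contract_text:   the draft text, searched case-insensitively.
    """
    named = {use.lower() for use in prohibited_uses}
    missing = sorted(REQUIRED_PROHIBITIONS - named)
    text = contract_text.lower()
    ambiguous = [phrase for phrase in AMBIGUOUS_PHRASES if phrase in text]
    return {"missing_prohibitions": missing, "ambiguous_phrases": ambiguous}
```

A draft passes this checklist only when both lists come back empty; anything else goes back to the vendor.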
The proposed Security Rule update (the NPRM issued in December 2024) reinforces the technical posture: encryption at rest and in transit, MFA, and annual written verification that business associates have deployed the required technical safeguards. None of that means anything if PHI persists in a training corpus.
The zero-retention pattern is the operational shape of training-data exclusion. OpenAI's API offers zero-data-retention (ZDR) on eligible endpoints under a signed BAA: prompts and completions are processed in memory and not logged. Anthropic offers zero-data-retention on eligible API features, with retained data not used for training without express permission. Amazon Bedrock does not log prompts or responses by default and does not use them for training; the AWS BAA covers Bedrock and listed sub-services. Google Vertex AI under the Google Cloud BAA disables data logging for HIPAA-eligible projects with the regulated-data flag enabled.
The configuration matters. The endpoint, project flag, and account-level setting all need to align. A regular API key on a regular OpenAI account does not get ZDR — even with a BAA. The BAA scopes to the BAA-eligible endpoint; using a non-eligible endpoint is a breach even with the contract signed.
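The alignment requirement can be enforced in code as a pre-flight check that refuses to send PHI unless every condition holds at once. The sketch below assumes a hypothetical `InferenceTarget` description and a hypothetical `BAA_ELIGIBLE_ENDPOINTS` list maintained from the signed BAA; the provider name and URL are placeholders, not real vendor endpoints or settings.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical: (provider, endpoint) pairs that the signed BAA names as eligible.
BAA_ELIGIBLE_ENDPOINTS = frozenset({
    ("example-provider", "https://api.example.invalid/v1/zdr"),
})


@dataclass(frozen=True)
class InferenceTarget:
    provider: str
    endpoint: str
    baa_reference: Optional[str]  # None when no BAA is on file
    account_zdr_enabled: bool     # account-level zero-retention setting
    project_flag_enabled: bool    # project-level flag, where the vendor has one


class PhiRoutingError(RuntimeError):
    """Raised when a PHI prompt would leave through a misaligned target."""


def check_phi_target(target: InferenceTarget) -> List[str]:
    """List every reason the target is not safe for PHI (empty list means safe)."""
    problems = []
    if not target.baa_reference:
        problems.append("no BAA reference on file")
    if (target.provider, target.endpoint) not in BAA_ELIGIBLE_ENDPOINTS:
        problems.append("endpoint is not listed as BAA-eligible")
    if not target.account_zdr_enabled:
        problems.append("account-level zero-retention setting is off")
    if not target.project_flag_enabled:
        problems.append("project-level flag is off")
    return problems


def require_phi_safe(target: InferenceTarget) -> None:
    """Fail closed: raise unless contract, endpoint, and settings all align."""
    problems = check_phi_target(target)
    if problems:
        raise PhiRoutingError("; ".join(problems))
```

Calling `require_phi_safe` immediately before each request means a signed BAA alone never unlocks a non-eligible endpoint, which is the failure mode described above.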
CallSphere routes every healthcare prompt through BAA-eligible endpoints with zero-data-retention configured. Our standard BAA includes explicit non-permission for training, fine-tuning, and evaluation on customer PHI; explicit non-permission for cross-customer data pooling; and explicit return-or-destroy at termination. Sub-processor BAAs flow the same language down. The audit trail records the model provider, model name, BAA reference, ZDR status flag, and prompt token counts on every inference call across our 90+ tools and 115+ tables. Across 50+ deployed businesses, we have not had a single training-data exposure incident. Healthcare buyers can review the model-provider stance at /industries/healthcare, see pricing on /pricing, and start with a 14-day trial. Behavioral-health customers should also see /lp/behavioral-health.
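A per-call audit record with the fields listed above could look like the following. This is an illustrative sketch only, not CallSphere's actual schema: the class and function names are hypothetical, and the `recorded_at` timestamp is an added assumption. The record stores token counts rather than prompt text, so the audit trail itself holds no PHI.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class InferenceAuditRecord:
    model_provider: str
    model_name: str
    baa_reference: str
    zdr_enabled: bool
    prompt_tokens: int
    recorded_at: str  # assumption: ISO 8601 UTC timestamp, not named in the text


def make_audit_record(
    model_provider: str,
    model_name: str,
    baa_reference: str,
    zdr_enabled: bool,
    prompt_tokens: int,
    now: Optional[datetime] = None,
) -> InferenceAuditRecord:
    """Build one audit record; `now` is injectable so tests are deterministic."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return InferenceAuditRecord(
        model_provider, model_name, baa_reference, zdr_enabled, prompt_tokens, timestamp
    )


def to_json_line(record: InferenceAuditRecord) -> str:
    """Serialize to one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing one line per inference call keeps the log append-only and easy to query by BAA reference or ZDR status during an audit.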
Is "zero data retention" a HIPAA term? No. It is a vendor configuration term. HIPAA's permitted-uses framework at 45 CFR 164.504(e)(2) is the underlying control; ZDR is the operational shape that satisfies it.
Can I fine-tune on de-identified data? Yes, if the data is genuinely de-identified under 45 CFR 164.514(a) and (b), using either Expert Determination at 164.514(b)(1) or Safe Harbor at 164.514(b)(2). De-identified data is no longer PHI. AI conversation logs need careful de-identification, which is covered separately.
Is a vendor's "we don't train on customer data" page enough? No. It needs to be in the BAA. Public web pages change without notice; BAAs do not.
What about "model improvement" language? "Model improvement" is ambiguous and has been used by vendors to argue training is permitted. Strike or define narrowly in the BAA.
Does the proposed Security Rule mention training? The NPRM does not name training directly, but its asset inventory and risk analysis requirements would force enumeration of every place PHI lands, including inference and any training pipeline.
Sagar Shankaran is the founder of CallSphere, where he builds production AI voice and chat agents deployed across healthcare, hospitality, real estate, and home services. He writes about agentic AI, LLM engineering, and shipping voice agents that handle real calls in production.
© 2026 CallSphere LLC. All rights reserved.