AI Infrastructure

HIPAA and AI Training Data Exclusion: The Zero-Retention BAA in 2026

Most HIPAA-eligible AI deployments depend on one operational pattern: zero-data-retention endpoints with explicit training-data exclusion. Here is the contract language, the technical pattern, and the gotchas.

"We don't train on your data" is the most common AI-vendor sales line in healthcare. The HIPAA-defensible version of that promise is in writing, in the BAA, with a specific endpoint and a specific retention number.

What the law actually says

```mermaid
flowchart TD
  In[Patient interaction] --> MinNec{Minimum necessary?}
  MinNec -->|yes| Process[AI process]
  MinNec -->|no| Reject[Block + log]
  Process --> Encrypt[(AES-256 at rest)]
  Encrypt --> DB[(PostgreSQL)]
  Process --> Audit[(Audit trail)]
  DB --> Right[Right of access §164.524]
```

CallSphere reference architecture

HIPAA does not use the word "training" but the rules cover the underlying activities directly. A use or disclosure of PHI for purposes other than treatment, payment, or operations requires authorization under 45 CFR 164.508 unless an exception applies. Using PHI to train a model that benefits other customers is not treatment, payment, or operations of the originating covered entity — it is an other purpose, requiring authorization. The Privacy Rule's minimum-necessary standard at 45 CFR 164.502(b) further constrains the data made available even within permitted purposes.
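The minimum-necessary gate in the reference architecture can be sketched as a simple allowlist check. This is an illustration only, not CallSphere's implementation; the purpose names and the `MINIMUM_NECESSARY` field allowlists are hypothetical, and the real policy is set by the covered entity, not by code:

```python
from dataclasses import dataclass

# Hypothetical per-purpose field allowlists. The covered entity defines the
# actual "minimum necessary" policy; code only enforces it.
MINIMUM_NECESSARY = {
    "appointment_reminder": {"first_name", "appointment_time", "location"},
    "insurance_check": {"first_name", "last_name", "dob", "member_id"},
}

@dataclass
class GateResult:
    allowed: bool
    reason: str

def minimum_necessary_gate(purpose: str, fields: set[str]) -> GateResult:
    """Block (and let the caller log) any request whose fields exceed
    the allowlist for its stated purpose -- the 'Block + log' branch."""
    allowed_fields = MINIMUM_NECESSARY.get(purpose)
    if allowed_fields is None:
        return GateResult(False, f"unknown purpose: {purpose}")
    excess = fields - allowed_fields
    if excess:
        return GateResult(False, f"exceeds minimum necessary: {sorted(excess)}")
    return GateResult(True, "ok")
```

The point of the sketch is the failure mode: an unknown purpose or an over-broad field set is rejected before any PHI reaches the model.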

The BAA at 45 CFR 164.504(e)(2) must "establish the permitted and required uses and disclosures" of PHI by the business associate. Silence on training is not permission — but ambiguous "improvement of services" language has been read by some vendors as permission. The defensive pattern is explicit: the BAA names training, fine-tuning, and evaluation as not permitted, with a separate de-identified-data clause carving out the limited research uses the covered entity authorizes (or not).

The proposed 2026 Security Rule update reinforces the technical posture: encryption at rest and in transit, MFA, and annual third-party verification of safeguards. None of that means anything if PHI persists in a training corpus.

Hear it before you finish reading

Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.

Try Live Demo →

What this means for AI voice and chat agents

The zero-retention pattern is the operational shape of training-data exclusion. Each major provider offers a version of it:

  - OpenAI: zero-data-retention (ZDR) on eligible API endpoints under a signed BAA; prompts and completions are processed in memory and not logged.
  - Anthropic: zero-data-retention on eligible API features, with retained data not used for training without express permission.
  - AWS Bedrock: does not log prompts or responses by default and does not use them for training; the AWS BAA covers Bedrock and listed sub-services.
  - Google Vertex AI: under the Google Cloud BAA, data logging is disabled for HIPAA-eligible projects with the regulated-data flag enabled.

The configuration matters. The endpoint, project flag, and account-level setting all need to align. A regular API key on a regular OpenAI account does not get ZDR — even with a BAA. The BAA scopes to the BAA-eligible endpoint; using a non-eligible endpoint is a breach even with the contract signed.
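One way to enforce that alignment is a hard gate in the routing layer: PHI goes to an endpoint only when the signed BAA, the written ZDR confirmation, and endpoint eligibility all check out. A minimal sketch, with a hypothetical `ZDR_ELIGIBLE` registry standing in for the contract-specific eligibility lists you maintain from your own BAAs and provider confirmations:

```python
from dataclasses import dataclass

# Hypothetical registry of (provider, endpoint) pairs confirmed BAA-eligible
# and ZDR-configured. Populate from signed contracts, not from public docs.
ZDR_ELIGIBLE = {
    ("openai", "/v1/chat/completions"),
    ("bedrock", "anthropic.claude-3"),
}

@dataclass
class InferenceRoute:
    provider: str
    endpoint: str
    baa_signed: bool      # executed BAA on file
    zdr_confirmed: bool   # written ZDR confirmation on file

def phi_route_allowed(route: InferenceRoute) -> bool:
    """All three must align: a contract alone is not enough,
    and an eligible endpoint alone is not enough."""
    return (
        route.baa_signed
        and route.zdr_confirmed
        and (route.provider, route.endpoint) in ZDR_ELIGIBLE
    )
```

A route failing any one of the three checks is the "regular API key on a regular account" failure described above.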

How CallSphere implements it

CallSphere routes every healthcare prompt through BAA-eligible endpoints with zero-data-retention configured. Our standard BAA includes explicit non-permission for training, fine-tuning, and evaluation on customer PHI; explicit non-permission for cross-customer data pooling; and explicit return-or-destroy at termination. Sub-processor BAAs flow the same language down. The audit trail records the model provider, model name, BAA reference, ZDR status flag, and prompt token counts on every inference call across our 90+ tools and 115+ tables. Across 50+ deployed businesses, we have not had a single training-data exposure incident. Healthcare buyers can review the model-provider stance at /industries/healthcare, see pricing on /pricing, and start with a 14-day trial. Behavioral-health customers should also see /lp/behavioral-health.
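An audit record of that shape can be sketched as a dataclass. The field names here are illustrative, not CallSphere's actual schema; the design choice worth copying is that the record stores token counts rather than prompt text, so the audit trail itself never holds PHI:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class InferenceAuditRecord:
    """One row per inference call; fields mirror those named in the text."""
    timestamp: str
    provider: str
    model: str
    baa_reference: str   # internal identifier for the governing BAA
    zdr_enabled: bool    # ZDR status flag at call time
    prompt_tokens: int   # counts only -- never the prompt text itself
    completion_tokens: int

def make_audit_record(provider: str, model: str, baa_ref: str,
                      zdr: bool, p_tokens: int, c_tokens: int) -> InferenceAuditRecord:
    return InferenceAuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        provider=provider, model=model, baa_reference=baa_ref,
        zdr_enabled=zdr, prompt_tokens=p_tokens, completion_tokens=c_tokens,
    )
```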

Compliance and build checklist

  1. Add explicit "no training, fine-tuning, or evaluation on PHI" language to the BAA.
  2. Add explicit "no cross-customer data pooling" language to the BAA.
  3. Verify zero-data-retention status on every model endpoint that supports it.
  4. Confirm the model endpoint is BAA-eligible — not just the account.
  5. Sign downstream BAAs with every model and inference sub-processor.
  6. For OpenAI, request ZDR through [email protected] or your enterprise contact and store the confirmation.
  7. For AWS Bedrock, accept the AWS BAA in AWS Artifact and apply SCPs that deny non-eligible services.
  8. For Google Vertex AI, enable the regulated-data flag and use VPC Service Controls.
  9. For Anthropic, confirm ZDR eligibility per feature in the API documentation.
  10. Audit inference logs monthly to confirm no PHI persists outside the BAA boundary.
  11. Reconfirm BAA-eligibility lists quarterly — they shift.
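Step 10, the monthly log audit, can be automated as a scan over inference-log rows. A minimal sketch, assuming each row is a dict with the hypothetical keys `zdr_enabled`, `prompt_text`, and `completion_text`; adapt the keys to your own log schema:

```python
def audit_inference_logs(rows: list[dict]) -> list[str]:
    """Return a finding for every row that violates the ZDR boundary:
    a call made outside ZDR configuration, or raw prompt/completion
    text persisted where only token counts should be."""
    findings = []
    for i, row in enumerate(rows):
        if not row.get("zdr_enabled"):
            findings.append(f"row {i}: inference outside ZDR configuration")
        if row.get("prompt_text") or row.get("completion_text"):
            findings.append(f"row {i}: raw prompt/completion text persisted")
    return findings
```

An empty findings list is the expected monthly result; any finding is an incident to investigate, not a log line to suppress.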

FAQ

Is "zero data retention" a HIPAA term? No. It is a vendor configuration term. HIPAA's permitted-uses framework at 45 CFR 164.504(e)(2) is the underlying control; ZDR is the operational shape that satisfies it.


Can I fine-tune on de-identified data? Yes if the data is genuinely de-identified under 45 CFR 164.514(a) (Safe Harbor or Expert Determination). De-identified data is no longer PHI. AI conversation logs need careful de-identification — covered separately.

Is a vendor's "we don't train on customer data" page enough? No. It needs to be in the BAA. Public web pages change without notice; BAAs do not.

What about "model improvement" language? "Model improvement" is ambiguous and has been used by vendors to argue training is permitted. Strike or define narrowly in the BAA.

Does the proposed Security Rule mention training? The NPRM does not name training directly but the asset inventory and risk analysis requirements force enumeration of every place PHI lands — including inference and any training pipeline.


Try CallSphere AI Voice Agents

See how AI voice agents work for your industry. Live demo available -- no signup required.
