What does an enterprise zero-data-retention (ZDR) agreement with Anthropic actually cover?

ZDR means Anthropic does not store inputs or outputs at rest after the API response is returned, except where needed to comply with law or combat misuse. It applies to eligible Anthropic APIs and to Anthropic products that use the customer's commercial organization API key -- explicitly including Claude Code. ZDR has to be enabled per-organization by Anthropic's account team; it is not the default. Even under ZDR, Anthropic retains User Safety classifier results (the score, not the original content) to enforce its Usage Policy. Microsoft, given its scale and the corporate-IP exposure, almost certainly had ZDR in place; the public-record question is whether the rollout actually used the ZDR-enabled organization key for every Claude Code session, or whether any developer ended up routing through a non-ZDR key by accident.

How long does Anthropic store data without ZDR?

For organizations without a ZDR agreement, Anthropic automatically deletes inputs and outputs from its backend within 30 days of receipt or generation. There are exceptions: longer retention applies if the customer is using a feature with customer-controlled retention, if there is a separate written agreement, or if Anthropic is required to retain by law or to combat misuse. The 30-day window is the default for standard API customers, not an upper bound on everything Anthropic keeps: the Trust-and-Safety classifier metadata is retained separately and indefinitely for usage-policy enforcement.

Where does Claude Code store session transcripts locally?

Claude Code clients write session transcripts to the local filesystem at ~/.claude/projects/ in plaintext JSONL format. The retention window on the client side is 30 days. This is independent of the Anthropic backend retention policy and applies regardless of whether the organization has ZDR. For Microsoft's internal pilot, every developer who used Claude Code over 6 months has accumulated up to 30 days of plaintext transcripts on their work laptop at any given snapshot. Post-cancellation, those transcripts persist locally until the rolling 30-day window expires, and any backup snapshots taken during the pilot retain them indefinitely.

What kinds of data actually went over the wire during a Microsoft Claude Code session?

Every Claude Code session sends: the user's prompt (frequently containing intent, ticket numbers, internal-project names), the agent-injected repository context (file contents, directory listings, git history snippets, often hundreds of thousands of tokens for agent-mode workflows), the tool-call sequence (shell commands, file reads, search queries), and the assistant's responses (generated code, refactor plans, debug analyses). For an enterprise pilot with hundreds of engineers, this aggregates into a substantial fraction of the codebase visible-in-context over the pilot's duration. The actual question for Microsoft's post-cancellation audit is not whether sensitive code went over -- it certainly did -- but which projects, which engineers' working sets, and which credential-adjacent files appeared in agent-injected context windows.

Was Microsoft's data used to train future Claude models?

Under Anthropic's standard commercial-API terms, customer data is not used to train Anthropic's models by default. Enterprise customers and ZDR-tier customers have stronger contractual non-training guarantees. Microsoft, given its commercial relationship and ZDR posture, almost certainly contracted for non-training treatment. The relevant audit question is whether the no-training term has appropriate technical attestation -- can Anthropic point to specific dataset-construction logs showing the customer ID was excluded from training pulls? -- and whether any internal Microsoft developer routed sessions through a personal Anthropic account (consumer Claude.ai or non-commercial API key) where the training-data treatment is different and looser.

What does the Microsoft post-cancellation audit checklist look like?

The minimum-viable audit covers six categories. First, attest with Anthropic that the ZDR agreement was active and covered every API key in scope for the entire 6-month pilot window. Second, deprovision every Anthropic API key issued during the pilot, including any developer-level keys. Third, sweep all developer laptops for ~/.claude/projects/ transcripts, including any that have outlived the 30-day rolling window, and rotate any keys/secrets that appear in those transcripts. Fourth, audit backup-snapshot inventory for any laptop backups taken during the pilot window and apply the same secret-rotation discipline. Fifth, request from Anthropic the User Safety classifier metadata that was retained despite ZDR, for completeness of the data-handling record. Sixth, document the chain-of-custody for everything above into the internal AI-risk register.

What is the per-engineer local-disk exposure on Claude Code?

Each engineer running Claude Code has up to 30 days of session transcripts under ~/.claude/projects/ in plaintext JSONL. A typical engineer's transcripts include working file contents, shell command sequences (sometimes including accidentally pasted credentials), git diff context, design discussion, and any tool-call output the agent ingested during a session. On a multi-tenant or shared laptop, that disk content is reachable by any process with the engineer's filesystem permissions. The forensic exposure is non-trivial: an attacker with even brief physical or shell-level laptop access during the rolling window captures a much richer profile of the engineer's work than a single point-in-time file dump would yield.

How does this compare to the Samsung ChatGPT incident in 2023?

The Samsung incident was three discrete employee actions inside a 20-day window in March 2023: an engineer pasted faulty source code to ChatGPT to get a fix, another pasted equipment-defect identification code for optimization, a third pasted a meeting transcript for summarization. Samsung banned all generative AI tools immediately. The Microsoft Claude Code situation is structurally different in that the data flow was sanctioned, enterprise-contracted, and presumably ZDR-covered -- but six months of broad adoption produced a much larger cumulative data flow than three discrete Samsung incidents. The audit posture should be more disciplined precisely because the volume is much larger and because, paradoxically, the ZDR contractual protection has historically made internal teams less rigorous about the local-disk exposure that the contract does not cover.

What are the right offboarding controls for sunsetting an AI vendor like this?

Seven controls form the operational baseline. First, an attested deprovisioning of every API key in scope, with timestamped vendor-side confirmation. Second, a documented sweep of every endpoint (developer laptops, CI runners, build servers) for vendor SDK artifacts, cached credentials, and local transcripts. Third, secret rotation for everything that appeared in vendor-side context (credentials, OAuth tokens, internal URLs). Fourth, retention-policy verification with the vendor that backend deletion windows have actually elapsed and been attested. Fifth, audit-log forwarding from the vendor for the pilot window to internal SIEM for the retention period the org's audit framework requires. Sixth, an internal red-team exercise simulating a leak from the vendor's data to test the secret-rotation completeness. Seventh, a post-mortem documenting what the pilot did and did not cover, archived to the org's AI-risk register.

AI Vendor Risk

Microsoft's 6-Month Claude Code Exposure: What an Audit Looks Like Post-Cancellation

Microsoft will pull internal Claude Code access on June 30, 2026, after six months of broad engineer-driven adoption. The data-flow question is not whether sensitive code went over -- it certainly did -- but what the auditable trail looks like across Anthropic's ZDR contract, the per-engineer local-disk transcript store, and the offboarding controls that should already be in motion. Field notes on the actual exposure shape and the audit checklist that gets you to closed-out status.

By Austa · Published May 27, 2026 · ~11 min read

The short version. Microsoft's internal Claude Code pilot ran from December 2025 to June 30, 2026. Anthropic's ZDR tier means inputs and outputs are not persisted on their backend past the API response -- assuming ZDR was correctly applied to every key in scope. The exposure surface that is NOT covered by ZDR is the local Claude Code client transcript store at ~/.claude/projects/ on every developer laptop. Six months of broad adoption means every participating engineer's working machine holds a rolling 30 days of plaintext transcripts. The post-cancellation audit has to address the Anthropic side AND the local-disk side AND the backup-snapshot side, and most internal-AI-risk programs in 2026 still under-cover the latter two.

The pilot, in plain language

Microsoft sub-licensed Claude Code from Anthropic in December 2025 under a token-metered API contract. The program was nominally a productivity experiment; in practice it became the dominant AI coding tool inside the company within a quarter. The financial side -- which forced the program to sunset, and which is covered in detail in our sister analysis -- is the headline most security teams will read first. The data-handling side is the part that the security team should have been reading first, and the part that determines what gets cleaned up between now and June 30.

The relevant facts:

Duration: six full months of broad availability, December 2025 through June 30, 2026.
Adoption: Microsoft has not published the exact engineer count enrolled in the pilot, but contemporaneous reporting from The Verge's Tom Warren and Windows Central characterized the pilot as having become "perhaps a little too popular" -- in practice, hundreds to low thousands of engineers across product groups.
Workflow: Claude Code's headline value is agent-mode workflows. A single agent-mode session can plan over hundreds of thousands of tokens of repository context, iterate 15-30 tool calls per task, and generate tens of thousands of output tokens per refactor. The aggregate exposure across the pilot is, in pure token volume, very large.
Termination: June 30, 2026 cutoff; engineers redirect to GitHub Copilot.

The audit question is therefore not "did sensitive material go over." It did. It is "what was the architectural envelope around the data flow, what does the contract say got persisted, and what gets persisted on infrastructure the contract does not cover."

What Anthropic's data-handling contract actually says

The Anthropic data-retention model has two tiers that matter here.

Standard tier (no ZDR)

For organizations on the standard commercial API, Anthropic automatically deletes inputs and outputs from its backend within 30 days of receipt or generation. The 30-day window is the default, not a guarantee that nothing is kept longer. Exceptions: longer retention for features with customer-controlled retention (e.g., Files API with customer-set TTL), longer retention under a separate written agreement, and indefinite retention of User Safety classifier results -- not the original content, but the classifier scores -- for usage-policy enforcement.

Zero Data Retention (ZDR)

ZDR is an enterprise contract addendum, enabled per-organization by Anthropic's account team. Under ZDR, Anthropic does not store inputs or outputs at rest after the API response is returned, except where law or misuse-combat requires retention. ZDR applies to eligible Anthropic APIs and to Anthropic products that use the customer's commercial organization API key, which includes Claude Code. ZDR is not default; it requires explicit setup. Even under ZDR, Anthropic retains User Safety classifier metadata.

Microsoft, given the scale of the program and the type of IP being exposed, almost certainly contracted for ZDR. The public-record question for the audit is not "did Microsoft sign a ZDR addendum" -- they almost certainly did -- but "was the ZDR-enabled organization key actually used for every Claude Code session in scope, with no developer-side fallback path to a non-ZDR account."

The four exposure surfaces an audit has to cover

1. Anthropic backend (covered by ZDR if correctly enrolled)

Under a properly-applied ZDR contract, the Anthropic backend stores nothing past the API response. The audit verification is contractual + procedural: get a vendor attestation that every API key used during the pilot was enrolled in ZDR for the full window, with timestamped enrollment proof. Most enterprise vendor relationships will produce this on request. The procedural concern is shadow-key creation -- developers who, for any reason, generated a key off a non-enterprise account and used it for actual work.

2. Anthropic User Safety classifier metadata (retained even under ZDR)

This is the surface the audit usually under-covers. Anthropic retains classifier scores -- not the original content, but the trust-and-safety signals derived from it -- indefinitely for usage-policy enforcement. The classifier-metadata exposure is small in volume but non-zero in signal: a classifier score saying "this organization triggered the malware-related classifier 47 times" is itself sensitive intelligence about what kinds of code Microsoft engineers were having Claude help with. The audit should formally request the classifier-metadata summary that Anthropic retains under the pilot, document it, and make a defensible call about whether to negotiate its scoped deletion.

3. Local Claude Code client transcripts at `~/.claude/projects/` (NOT covered by ZDR)

This is the surface most internal audits miss. Claude Code clients write session transcripts to local disk at ~/.claude/projects/ in plaintext JSONL format. The client-side retention window is 30 days rolling. The transcripts include the full prompt history, the agent-injected repository context, every tool call with full arguments, and every assistant response.

For a six-month pilot, every engineer who used Claude Code has at any snapshot up to 30 days of detailed working-context transcripts on their work laptop. After the pilot ends, those transcripts persist locally for the next 30 days as the rolling window expires. Crucially, backup snapshots taken during the pilot window retain those transcripts indefinitely, regardless of the rolling local-window expiration.
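The timeline in the two paragraphs above reduces to simple date arithmetic. A minimal sketch, assuming the 30-day client-side window described in this article and the June 30, 2026 cutoff; the function names are illustrative, and the December 1 start date used in the note below is a placeholder because only the month is public.

```python
from datetime import date, timedelta

# Assumption taken from this article: the client keeps transcripts for a
# rolling 30 days. Verify against the configuration actually deployed.
LOCAL_RETENTION_DAYS = 30


def last_local_expiry(last_session: date,
                      retention_days: int = LOCAL_RETENTION_DAYS) -> date:
    """Date on which a transcript written on `last_session` ages out locally."""
    return last_session + timedelta(days=retention_days)


def snapshot_may_hold_transcripts(snapshot_taken: date,
                                  pilot_start: date,
                                  pilot_end: date,
                                  retention_days: int = LOCAL_RETENTION_DAYS) -> bool:
    """True if a backup taken on `snapshot_taken` could contain transcripts.

    A snapshot taken any time between the pilot start and the day the last
    local transcripts age out may hold a copy, and that copy does not expire
    with the local rolling window.
    """
    return pilot_start <= snapshot_taken <= last_local_expiry(pilot_end, retention_days)
```

With the June 30, 2026 cutoff, `last_local_expiry(date(2026, 6, 30))` gives July 30, 2026, so backups taken through the end of July still belong in the snapshot inventory, not just those taken before the cutoff.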

The audit-side work here is substantial:

Sweep every issued laptop for ~/.claude/projects/ and quantify the snapshot.
Identify which transcripts contain credential-shaped strings (API keys, tokens, SSH keys, internal URLs, AWS access keys) using a DLP-style scanner.
Rotate every credential detected in transcripts.
Cross-reference local transcripts against the org's backup-retention policy -- if laptop backups are taken during the pilot window, those backups retain the transcripts indefinitely.
Apply secret-rotation discipline to credentials detected in backup snapshots, not just live transcripts.
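The scanning step in the list above might look roughly like the following. This is a sketch, not a DLP product: the three patterns are illustrative examples of credential-shaped strings, the function names are hypothetical, and the only thing assumed about the transcript files is that they are line-oriented `.jsonl` text, so each line is scanned as raw text without relying on any record schema.

```python
import re
from pathlib import Path
from typing import Iterator

# Illustrative patterns only; a production DLP ruleset would be far broader.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{20,}"),
}


def scan_text(text: str) -> list[str]:
    """Return the name of every pattern that matches somewhere in `text`."""
    return [name for name, pattern in CREDENTIAL_PATTERNS.items()
            if pattern.search(text)]


def scan_transcripts(root: Path) -> Iterator[tuple[Path, int, str]]:
    """Yield (file, line_number, pattern_name) for each hit under `root`.

    Only the location and the pattern name are reported, never the matched
    string, so the sweep report does not become a new secret store.
    """
    for path in sorted(root.rglob("*.jsonl")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                for name in scan_text(line):
                    yield path, line_number, name
```

Pointing `scan_transcripts` at a copy of a laptop's `~/.claude/projects/` directory, or at the same path restored from a backup snapshot, yields the hit list that feeds the rotation queue. Every hit should be treated as a rotation candidate rather than a confirmed leak, since regex matching produces false positives.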

4. Lateral exposure through engineers' personal accounts

The pilot's official API key path presumably had ZDR. The audit question is what happens to engineers who, for any reason, used Claude.ai (the consumer-facing chat product) on personal accounts during the pilot window. Consumer Claude.ai has a different data-handling regime, including separate consumer-terms retention windows and -- absent specific opt-out -- training-data eligibility. Engineers who pasted internal code into Claude.ai consumer chat are a separate exposure category. The Samsung 2023 incident was structurally this category, multiplied across three employees in 20 days.

The audit signal here is mostly indirect. The org can: (a) survey engineers post-cancellation about consumer-tier usage during the pilot, (b) sweep work laptops for browser bookmarks or cookies that show claude.ai sign-ins with personal-email accounts, and (c) revoke or advise on retention-clearing for any consumer accounts detected.

The six-control audit checklist

Stripped to operational items, the post-cancellation audit covers six categories. Each one has a deliverable, an evidence trail, and a residual-risk rating to feed into the internal AI-risk register.

Control 1: ZDR scope attestation

Deliverable: written attestation from Anthropic confirming every API key issued under Microsoft's commercial organization during the pilot was ZDR-enabled, with the date range of ZDR coverage. Evidence: the attestation document plus the internal key-inventory snapshot from the same window. Residual risk: medium -- the attestation is reliant on Anthropic's internal account records being complete. Get it in writing.
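Reconciling the attestation against the internal key inventory is a set-and-interval comparison. A minimal sketch, assuming both sides can be reduced to a key identifier plus a date range; the record types and field names here are hypothetical, not a format Anthropic is known to provide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class KeyRecord:
    key_id: str          # internal identifier, never the secret itself
    active_from: date
    active_to: date


@dataclass(frozen=True)
class ZdrCoverage:
    key_id: str
    covered_from: date
    covered_to: date


def coverage_gaps(inventory: list[KeyRecord],
                  attested: list[ZdrCoverage]) -> dict[str, str]:
    """Map each problematic key_id to a short reason string."""
    by_key = {c.key_id: c for c in attested}
    gaps: dict[str, str] = {}
    for key in inventory:
        coverage = by_key.get(key.key_id)
        if coverage is None:
            gaps[key.key_id] = "not in attestation"
        elif (coverage.covered_from > key.active_from
              or coverage.covered_to < key.active_to):
            gaps[key.key_id] = "ZDR coverage shorter than active period"
    return gaps
```

An empty result means every inventoried key was covered for its whole active period. It says nothing about keys missing from the inventory itself, which is the shadow-key problem.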

Control 2: Key deprovisioning

Deliverable: every API key issued during the pilot is deprovisioned post-June 30, with timestamped revocation. Evidence: vendor revocation log + internal SIEM correlation showing zero post-revocation API calls. Residual risk: low -- this is mechanical and easily verified.
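The "zero post-revocation API calls" check is a join between the revocation log and the call log. The sketch below assumes both have already been exported into plain Python structures keyed by a key identifier; the shapes and the function name are hypothetical stand-ins for whatever the vendor log and the SIEM actually export.

```python
from datetime import datetime


def calls_after_revocation(
    revocations: dict[str, datetime],
    api_calls: list[tuple[str, datetime]],
) -> list[tuple[str, datetime]]:
    """Return every call that should not have happened.

    A call is flagged if its key has no revocation record at all, or if the
    call timestamp is at or after that key's revocation time. Timestamps
    must be uniformly timezone-aware (or uniformly naive) to be comparable.
    """
    flagged = []
    for key_id, called_at in api_calls:
        revoked_at = revocations.get(key_id)
        if revoked_at is None or called_at >= revoked_at:
            flagged.append((key_id, called_at))
    return flagged
```

An empty list is the evidence this control asks for. Calls flagged for a key with no revocation record point to a key that was never deprovisioned.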

Control 3: Local-disk transcript sweep

Deliverable: ~/.claude/projects/ on every issued laptop is swept for plaintext transcripts. Files are inventoried, hashed, and DLP-scanned for credential-shaped content. Credentials detected are rotated. Transcripts are either deleted or retained under the org's standard secure-data-handling policy. Evidence: per-laptop sweep report + credential-rotation log. Residual risk: medium -- depends on completeness of laptop inventory and on how far the rolling window has been allowed to drift before the sweep.
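The inventory-and-hash part of this control could be sketched as follows. The only assumption about the transcript store is that it contains `.jsonl` files somewhere under the swept directory; the function name and the report fields are illustrative rather than any standard format.

```python
import hashlib
from pathlib import Path


def inventory_transcripts(root: Path) -> list[dict[str, object]]:
    """Build one inventory record per transcript file under `root`.

    Files are hashed in chunks so large transcripts are never loaded into
    memory whole. Paths are stored relative to `root` in POSIX form so that
    reports from different machines compare cleanly.
    """
    records: list[dict[str, object]] = []
    for path in sorted(root.rglob("*.jsonl")):
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        stat = path.stat()
        records.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": stat.st_size,
            "modified_epoch": stat.st_mtime,
            "sha256": digest.hexdigest(),
        })
    return records
```

Recording the hash before deletion gives the per-laptop sweep report something verifiable: the same digest appearing later in a backup snapshot identifies a retained copy of a transcript that was already handled.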

Control 4: Backup-snapshot rotation

Deliverable: any laptop-backup snapshots taken during the pilot window are inventoried. Snapshots containing ~/.claude/projects/ data are processed under the same credential-rotation discipline as Control 3. Snapshots are either re-taken post-cleanup or retained-with-documentation in the org's compliance vault. Evidence: backup-snapshot inventory + handling decision per snapshot. Residual risk: high if backups are not actively managed; medium otherwise.

Control 5: Classifier-metadata documentation

Deliverable: Anthropic-side User Safety classifier metadata retained under the pilot is requested, documented, and entered into the AI-risk register. The audit makes a documented call on whether to negotiate scoped deletion (typically not feasible) or to accept-and-document the retention. Evidence: Anthropic's classifier-metadata summary + the org's risk-register entry. Residual risk: low to medium -- the signal is small in volume.

Control 6: Consumer-tier usage survey

Deliverable: a survey of engineers covering whether and how often they used Claude.ai consumer-tier during the pilot. Combined with a laptop browser-history sweep where allowed. Detected consumer-account usage is flagged for advisory action (retention-clear on the consumer account, secret-rotation if any credentials were involved). Evidence: survey response set + browser-history sweep results. Residual risk: medium to high -- consumer-tier exposure is the part of the data flow least under the org's control.
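For the AI-risk register, the six controls above can be held as structured records rather than prose. A minimal sketch: the names and residual-risk ratings are copied from the control descriptions in this article, while the field names, the floor/ceiling encoding of ranges such as "medium to high", and the sort order are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Control:
    number: int
    name: str
    risk_floor: Risk      # best-case residual risk stated in the article
    risk_ceiling: Risk    # worst-case residual risk stated in the article
    evidence_collected: bool = False


CONTROLS = [
    Control(1, "ZDR scope attestation", Risk.MEDIUM, Risk.MEDIUM),
    Control(2, "Key deprovisioning", Risk.LOW, Risk.LOW),
    Control(3, "Local-disk transcript sweep", Risk.MEDIUM, Risk.MEDIUM),
    Control(4, "Backup-snapshot rotation", Risk.MEDIUM, Risk.HIGH),
    Control(5, "Classifier-metadata documentation", Risk.LOW, Risk.MEDIUM),
    Control(6, "Consumer-tier usage survey", Risk.MEDIUM, Risk.HIGH),
]


def open_items(controls: list[Control]) -> list[Control]:
    """Controls still lacking evidence, worst-case residual risk first."""
    pending = [c for c in controls if not c.evidence_collected]
    return sorted(pending, key=lambda c: (-c.risk_ceiling, c.number))
```

Sorting by worst-case rating puts the backup-snapshot and consumer-tier controls at the top of the open list, which matches where the article places the least-controlled exposure.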

What the Samsung 2023 incident does and does not teach

The Samsung ChatGPT incident in March 2023 is the most-cited comparable. Three incidents in 20 days: an engineer pasted faulty source code for a fix, another pasted equipment-defect-identification code for optimization, a third pasted a meeting transcript for summarization. Samsung banned all generative AI immediately. JPMorgan Chase, Apple, Verizon, Deutsche Bank, Goldman Sachs, and Citigroup followed with their own restrictions during 2023.

The Microsoft Claude Code situation is structurally different in important ways.

Sanctioned vs unsanctioned. Samsung was unsanctioned consumer-tier use; Microsoft was sanctioned enterprise-tier use under contract. The data-handling profile is materially different: Microsoft had ZDR; Samsung had consumer Terms of Service.

Scale. Samsung was three incidents involving distinct employees; Microsoft is six months of broad adoption involving hundreds to low thousands of engineers. The cumulative data flow is several orders of magnitude larger.

Trust posture. Samsung's mid-incident response was reactive: ban everything. Microsoft's June 30 sunset is proactive, driven by the budget rather than by a security incident, which means the security team has time to run a methodical audit rather than scrambling for damage control.

The paradoxical part is that the ZDR contractual protection has historically made internal security teams less rigorous about the local-disk and backup-snapshot exposure that the contract does not cover. The Samsung incident is sometimes invoked to argue "we are safe because we have ZDR." But the Samsung incident consisted entirely of unsanctioned consumer-tier use, and ZDR has no bearing on local-client transcript exposure. The two situations are not directly comparable, and a clean Microsoft audit closes them off as separate risk categories.

What the rest of the industry should be doing

Microsoft's Claude Code sunset is also a signal event for every other engineering org that adopted aggressive coding-agent tooling during the 2024-2026 ramp. Three pieces of advice apply regardless of which vendor the org is on.

First, treat ZDR as necessary but not sufficient. ZDR addresses one surface (vendor backend) out of the four covered above (vendor backend, vendor safety-classifier metadata, local client transcripts and their backups, personal-account usage). The org's AI-risk register should track all four.

Second, document the local-client transcript story at adoption time, not at offboarding time. Every coding agent worth adopting has some form of local session persistence -- Claude Code's ~/.claude/projects/, Cursor's per-project history, Windsurf's local cache. The DLP and backup-handling policies for those directories should be defined when the agent rolls out, not when it gets sunset.

Third, plan the offboarding controls before signing the onboarding contract. The six-control checklist in this article should be in the security team's vendor-onboarding template, with the deliverable for each control specified in advance. The reason orgs scramble at offboarding is that the controls were not pre-defined; the offboarding scope is reconstructed from incomplete telemetry rather than executed against a checklist.

The Microsoft sunset will be cited in every AI-vendor-risk conversation for the rest of 2026. The shape of that citation will depend on what the post-cancellation audit produces. Done right, it becomes the playbook other orgs adopt. Done wrong, it becomes the cautionary example that every internal AI-risk report opens with for the next three years.