AUSTA | Adversarial Intelligence

Microsoft's 6-Month Claude Code Exposure: What an Audit Looks Like Post-Cancellation

Microsoft will pull internal Claude Code access on June 30, 2026, after six months of broad engineer-driven adoption. The data-flow question is not whether sensitive code went over -- it certainly did -- but what the auditable trail looks like across Anthropic's ZDR contract, the per-engineer local-disk transcript store, and the offboarding controls that should already be in motion. Field notes on the actual exposure shape and the audit checklist that gets you to closed-out status.

By Austa · ~11 min read

The short version. Microsoft's internal Claude Code pilot ran from December 2025 to June 30, 2026. Anthropic's ZDR tier means inputs and outputs are not persisted on its backend past the API response -- assuming ZDR was correctly applied to every key in scope. The exposure surface that ZDR does NOT cover is the local Claude Code client transcript store at ~/.claude/projects/ on every developer laptop. After six months of broad adoption, every participating engineer's working machine holds a rolling 30-day window of plaintext transcripts. The post-cancellation audit has to address the Anthropic side AND the local-disk side AND the backup-snapshot side, and most internal-AI-risk programs in 2026 still under-cover the latter two.

The pilot, in plain language

Microsoft sub-licensed Claude Code from Anthropic in December 2025 under a token-metered API contract. The program was nominally a productivity experiment; in practice it became the dominant AI coding tool inside the company within a quarter. The financial side -- which forced the program to sunset, and which our sister analysis covers in detail -- is the headline most security teams will read first. The data-handling side is the part the security team should have been reading first, and the part that determines what gets cleaned up between now and June 30.

The relevant facts:

The audit question is therefore not "did sensitive material go over." It did. It is "what was the architectural envelope around the data flow, what does the contract say got persisted, and what gets persisted on infrastructure the contract does not cover."

What Anthropic's data-handling contract actually says

The Anthropic data-retention model has two tiers that matter here.

Standard tier (no ZDR)

For organizations on the standard commercial API, Anthropic automatically deletes inputs and outputs from its backend within 30 days of receipt or generation. The 30-day window is the default, not a hard ceiling. Exceptions: longer retention for features with customer-controlled retention (e.g., Files API with customer-set TTL), longer retention under a separate written agreement, and indefinite retention of User Safety classifier results -- not the original content, but the classifier scores -- for usage-policy enforcement.

Zero Data Retention (ZDR)

ZDR is an enterprise contract addendum, enabled per-organization by Anthropic's account team. Under ZDR, Anthropic does not store inputs or outputs at rest after the API response is returned, except where retention is required to comply with law or to combat misuse. ZDR applies to eligible Anthropic APIs and to Anthropic products that use the customer's commercial organization API key, which includes Claude Code. ZDR is not the default; it requires explicit setup. Even under ZDR, Anthropic retains User Safety classifier metadata.

Microsoft, given the scale of the program and the type of IP being exposed, almost certainly contracted for ZDR. The question for the audit is therefore not "did Microsoft sign a ZDR addendum" -- it almost certainly did -- but "was the ZDR-enabled organization key actually used for every Claude Code session in scope, with no developer-side fallback path to a non-ZDR account."

The four exposure surfaces an audit has to cover

1. Anthropic backend (covered by ZDR if correctly enrolled)

Under a properly-applied ZDR contract, the Anthropic backend stores nothing past the API response. The audit verification is contractual + procedural: get a vendor attestation that every API key used during the pilot was enrolled in ZDR for the full window, with timestamped enrollment proof. Most enterprise vendor relationships will produce this on request. The procedural concern is shadow-key creation -- developers who, for any reason, generated a key off a non-enterprise account and used it for actual work.

2. Anthropic User Safety classifier metadata (retained even under ZDR)

This is the surface the audit usually under-covers. Anthropic retains classifier scores -- not the original content, but the trust-and-safety signals derived from it -- indefinitely for usage-policy enforcement. The classifier-metadata exposure is small in volume but non-zero in signal: a classifier score saying "this organization triggered the malware-related classifier 47 times" is itself sensitive intelligence about what kinds of code Microsoft engineers were having Claude help with. The audit should formally request the classifier-metadata summary that Anthropic retains under the pilot, document it, and make a defensible call about whether to negotiate its scoped deletion.

3. Local Claude Code client transcripts at ~/.claude/projects/ (NOT covered by ZDR)

This is the surface most internal audits miss. Claude Code clients write session transcripts to local disk at ~/.claude/projects/ in plaintext JSONL format. The client-side retention window is 30 days rolling. The transcripts include the full prompt history, the agent-injected repository context, every tool call with full arguments, and every assistant response.

For a six-month pilot, every engineer who used Claude Code has, at any given moment, up to 30 days of detailed working-context transcripts on their work laptop. After the pilot ends, those transcripts persist locally for up to another 30 days while the rolling window expires. Crucially, backup snapshots taken during the pilot window retain whatever transcripts were on disk at snapshot time indefinitely, regardless of the rolling local-window expiration.
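The interaction between the rolling local window and backups is easier to see with a little date arithmetic. The following is a minimal sketch, assuming the 30-day window described above; the snapshot dates and the function name days_captured are hypothetical and stand in for whatever the backup system actually reports.

```python
from datetime import date, timedelta

LOCAL_WINDOW_DAYS = 30  # rolling client-side window described in the text


def days_captured(snapshot_dates, window_days=LOCAL_WINDOW_DAYS):
    """Return the calendar days whose transcripts survive in at least one snapshot.

    Each snapshot freezes what was on disk that day: the snapshot day itself
    plus the preceding (window_days - 1) days.
    """
    captured = set()
    for snap in snapshot_dates:
        for offset in range(window_days):
            captured.add(snap - timedelta(days=offset))
    return captured


# Hypothetical month-end snapshots for one laptop (illustrative dates only).
snapshots = [date(2026, 1, 31), date(2026, 2, 28), date(2026, 3, 31)]
captured = days_captured(snapshots)

print(len(captured), "days of transcripts retained in backups")
print("earliest:", min(captured), "latest:", max(captured))
```

With these three illustrative snapshots, 88 distinct days of transcripts remain recoverable from backups even though the laptop itself never held more than 30 days at once.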

The audit-side work here is substantial:

4. Lateral exposure through engineers' personal accounts

The pilot's official API key path had ZDR. The audit question is what happens to engineers who, for any reason, used Claude.ai (the consumer-facing chat product) on personal accounts during the pilot window. Consumer Claude.ai has a different data-handling regime, including separate consumer-terms retention windows and -- absent specific opt-out -- training-data eligibility. Engineers who pasted internal code into Claude.ai consumer chat are a separate exposure category. The Samsung 2023 incident was structurally this category, multiplied across three employees in 20 days.

The audit signal here is mostly indirect. The org can: (a) survey engineers post-cancellation about consumer-tier usage during the pilot, (b) sweep work laptops for browser bookmarks or cookies showing claude.ai sign-ins with personal email accounts, and (c) advise the owners of any consumer accounts detected on clearing retained conversation history.

The six-control audit checklist

Stripped to operational items, the post-cancellation audit covers six categories. Each one has a deliverable, an evidence trail, and a residual-risk rating to feed into the internal AI-risk register.

Control 1: ZDR scope attestation

Deliverable: written attestation from Anthropic confirming every API key issued under Microsoft's commercial organization during the pilot was ZDR-enabled, with the date range of ZDR coverage. Evidence: the attestation document plus the internal key-inventory snapshot from the same window. Residual risk: medium -- the attestation relies on Anthropic's internal account records being complete. Get it in writing.
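Reconciling the attestation against the internal key inventory is a set-and-date-range comparison. The sketch below is illustrative only: the record shapes (KeyRecord, ZdrCoverage), the key IDs, and all dates are assumptions, not a format Anthropic or Microsoft is known to use.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    """One row of the internal key inventory (hypothetical shape)."""
    key_id: str
    created: date
    revoked: Optional[date]  # None means still active at pilot end


@dataclass(frozen=True)
class ZdrCoverage:
    """One row of the vendor attestation (hypothetical shape)."""
    key_id: str
    zdr_from: date
    zdr_to: date


def attestation_gaps(inventory, attestation, pilot_end):
    """Map key_id -> reason for every key the attestation does not fully cover."""
    covered = {c.key_id: c for c in attestation}
    gaps = {}
    for key in inventory:
        active_to = key.revoked or pilot_end
        cov = covered.get(key.key_id)
        if cov is None:
            gaps[key.key_id] = "not listed in attestation"
        elif cov.zdr_from > key.created or cov.zdr_to < active_to:
            gaps[key.key_id] = "ZDR window does not span key lifetime"
    return gaps


pilot_end = date(2026, 6, 30)
inventory = [
    KeyRecord("key-a", date(2025, 12, 10), None),
    KeyRecord("key-b", date(2025, 12, 15), date(2026, 3, 1)),
    KeyRecord("key-c", date(2026, 2, 1), None),
]
attestation = [
    ZdrCoverage("key-a", date(2025, 12, 1), date(2026, 6, 30)),
    ZdrCoverage("key-b", date(2026, 1, 5), date(2026, 6, 30)),  # enrolled late
]

gaps = attestation_gaps(inventory, attestation, pilot_end)
for key_id, reason in sorted(gaps.items()):
    print(key_id, "->", reason)
```

Any key that appears in the result is a finding to take back to the vendor: either the enrollment proof is incomplete, or the key really did operate outside ZDR for part of its life.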

Control 2: Key deprovisioning

Deliverable: every API key issued during the pilot is deprovisioned post-June 30, with timestamped revocation. Evidence: vendor revocation log + internal SIEM correlation showing zero post-revocation API calls. Residual risk: low -- this is mechanical and easily verified.
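The "zero post-revocation API calls" evidence is a timestamp correlation between the vendor revocation log and internal request telemetry. A minimal sketch follows, assuming both sources can be reduced to key IDs and UTC timestamps; the data shapes, key IDs, and times are hypothetical.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def audit_post_revocation(revocations, api_events):
    """Correlate API calls against revocation times.

    revocations: dict of key_id -> revocation timestamp
    api_events:  iterable of (key_id, call timestamp)
    Returns (calls made after revocation, key IDs with no revocation record).
    """
    late_calls, unrevoked = [], set()
    for key_id, called_at in api_events:
        revoked_at = revocations.get(key_id)
        if revoked_at is None:
            unrevoked.add(key_id)
        elif called_at > revoked_at:
            late_calls.append((key_id, called_at))
    return late_calls, sorted(unrevoked)


# Hypothetical extracts from the revocation log and the SIEM.
revocations = {
    "key-a": datetime(2026, 6, 30, 23, 59, tzinfo=UTC),
    "key-b": datetime(2026, 6, 30, 23, 59, tzinfo=UTC),
}
api_events = [
    ("key-a", datetime(2026, 6, 30, 18, 0, tzinfo=UTC)),  # before revocation
    ("key-b", datetime(2026, 7, 2, 9, 30, tzinfo=UTC)),   # after revocation
    ("key-z", datetime(2026, 7, 1, 8, 0, tzinfo=UTC)),    # no revocation record
]

late_calls, unrevoked = audit_post_revocation(revocations, api_events)
print("post-revocation calls:", late_calls)
print("keys without a revocation record:", unrevoked)
```

Both outputs should be empty for the control to close. A key that shows traffic but has no revocation record is the more serious finding, since it suggests the key inventory itself is incomplete.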

Control 3: Local-disk transcript sweep

Deliverable: every issued laptop is swept for plaintext transcripts under ~/.claude/projects/. Files are inventoried, hashed, and DLP-scanned for credential-shaped content. Credentials detected are rotated. Transcripts are either deleted or retained under the org's standard secure-data-handling policy. Evidence: per-laptop sweep report + credential-rotation log. Residual risk: medium -- depends on the completeness of the laptop inventory and on how much of the rolling window has already expired before the sweep runs.
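A per-laptop sweep of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not a replacement for a real DLP engine: the three regular expressions are simple examples of credential-shaped patterns, the report fields are hypothetical, and the scan runs over raw JSONL text, so secrets that are JSON-escaped in unusual ways may slip past it.

```python
import hashlib
import re
from pathlib import Path

# Illustrative credential-shaped patterns; a production sweep would use the
# org's DLP rule set instead.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "assigned_secret": re.compile(
        r"(?i)(?:api[_-]?key|secret|token)\s*[=:]\s*[A-Za-z0-9_\-]{16,}"
    ),
}


def sweep_transcripts(root: Path):
    """Inventory every .jsonl file under root: path, size, SHA-256, pattern hits."""
    reports = []
    if not root.is_dir():
        return reports
    for path in sorted(root.rglob("*.jsonl")):
        data = path.read_bytes()
        text = data.decode("utf-8", errors="replace")
        hits = sorted(
            name for name, pattern in CREDENTIAL_PATTERNS.items()
            if pattern.search(text)
        )
        reports.append({
            "path": str(path),
            "bytes": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
            "credential_hits": hits,
        })
    return reports


if __name__ == "__main__":
    # Transcript location as described in the article.
    for report in sweep_transcripts(Path.home() / ".claude" / "projects"):
        flag = "ROTATE" if report["credential_hits"] else "ok"
        print(flag, report["sha256"][:12], report["path"], report["credential_hits"])
```

The sketch only reads and reports; deletion or secure retention is left as a separate, deliberate step so the hash inventory exists before anything is removed.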

Control 4: Backup-snapshot rotation

Deliverable: any laptop-backup snapshots taken during the pilot window are inventoried. Snapshots containing ~/.claude/projects/ data are processed under the same credential-rotation discipline as Control 3. Snapshots are either re-taken post-cleanup or retained-with-documentation in the org's compliance vault. Evidence: backup-snapshot inventory + handling decision per snapshot. Residual risk: high if backups are not actively managed; medium otherwise.

Control 5: Classifier-metadata documentation

Deliverable: Anthropic-side User Safety classifier metadata retained under the pilot is requested, documented, and entered into the AI-risk register. The audit makes a documented call on whether to negotiate scoped deletion (typically not feasible) or to accept-and-document the retention. Evidence: Anthropic's classifier-metadata summary + the org's risk-register entry. Residual risk: low to medium -- the signal is small in volume.

Control 6: Consumer-tier usage survey

Deliverable: a survey of engineers covering whether and how often they used consumer-tier Claude.ai during the pilot, combined with a laptop browser-history sweep where allowed. Detected consumer-account usage is flagged for advisory action (retention-clear on the consumer account, secret-rotation if any credentials were involved). Evidence: survey response set + browser-history sweep results. Residual risk: medium to high -- consumer-tier exposure is the part of the data flow least under the org's control.
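The six residual-risk ratings above can be carried into the AI-risk register as structured entries. The sketch below is one possible shape, not a prescribed schema: the Control and Risk types are hypothetical, and ranges such as "low to medium" are encoded as a best case and a worst case.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Control:
    number: int
    name: str
    best_case: Risk   # rating if the control is executed well
    worst_case: Risk  # rating if it is not


# Ratings transcribed from the six controls in the checklist.
CONTROLS = [
    Control(1, "ZDR scope attestation", Risk.MEDIUM, Risk.MEDIUM),
    Control(2, "Key deprovisioning", Risk.LOW, Risk.LOW),
    Control(3, "Local-disk transcript sweep", Risk.MEDIUM, Risk.MEDIUM),
    Control(4, "Backup-snapshot rotation", Risk.MEDIUM, Risk.HIGH),
    Control(5, "Classifier-metadata documentation", Risk.LOW, Risk.MEDIUM),
    Control(6, "Consumer-tier usage survey", Risk.MEDIUM, Risk.HIGH),
]


def by_worst_case(controls):
    """Order controls so the highest worst-case residual risk comes first."""
    return sorted(controls, key=lambda c: (-c.worst_case, -c.best_case, c.number))


for control in by_worst_case(CONTROLS):
    print(f"Control {control.number}: {control.name} "
          f"({control.best_case.name} to {control.worst_case.name})")
```

Sorted this way, the backup-snapshot and consumer-tier controls rise to the top of the register, which matches the article's point that the surfaces least covered by the contract carry the most residual risk.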

What the Samsung 2023 incident does and does not teach

The Samsung ChatGPT incident in March 2023 is the most-cited comparable. Three incidents in 20 days: an engineer pasted faulty source code for a fix, another pasted equipment-defect-identification code for optimization, a third pasted a meeting transcript for summarization. Samsung responded by banning generative AI tools on company devices. JPMorgan Chase, Apple, Verizon, Deutsche Bank, Goldman Sachs, and Citigroup followed with their own restrictions during 2023.

The Microsoft Claude Code situation is structurally different in important ways.

Sanctioned vs unsanctioned. Samsung was unsanctioned consumer-tier use; Microsoft was sanctioned enterprise-tier use under contract. The data-handling profile is materially different: Microsoft had ZDR; Samsung had consumer Terms of Service.

Scale. Samsung was three incidents involving distinct employees; Microsoft is six months of broad adoption involving hundreds to low thousands of engineers. The cumulative data flow is several orders of magnitude larger.

Trust posture. Samsung's mid-incident response was reactive: ban everything. Microsoft's June 30 sunset is proactive, driven by the budget rather than by a security incident, which means the security team has time to run a methodical audit rather than scrambling for damage control.

The paradoxical part is that ZDR's contractual protection has historically made internal security teams less rigorous about the local-disk and backup-snapshot exposure that the contract does not cover. The Samsung incident is sometimes invoked to argue "we are safe because we have ZDR." But the Samsung incident involved only unsanctioned consumer-tier use, and ZDR has no bearing on local-client transcript exposure. The two situations are not directly comparable, and a clean Microsoft audit closes them off as separate risk categories.

What the rest of the industry should be doing

Microsoft's Claude Code sunset is also a signal event for every other engineering org that adopted aggressive coding-agent tooling during the 2024-2026 ramp. Three pieces of advice apply regardless of which vendor the org is on.

First, treat ZDR as necessary but not sufficient. ZDR addresses one surface (the vendor backend) out of the four covered above: vendor backend, vendor safety-classifier metadata, local client transcripts and their backups, and engineers' personal consumer accounts. The org's AI-risk register should track all four.

Second, document the local-client transcript story at adoption time, not at offboarding time. Every coding agent worth adopting has some form of local session persistence -- Claude Code's ~/.claude/projects/, Cursor's per-project history, Windsurf's local cache. The DLP and backup-handling policies for those directories should be defined when the agent rolls out, not when it gets sunset.

Third, plan the offboarding controls before signing the onboarding contract. The six-control checklist in this article should be in the security team's vendor-onboarding template, with the deliverable for each control specified in advance. The reason orgs scramble at offboarding is that the controls were not pre-defined; the offboarding scope is reconstructed from incomplete telemetry rather than executed against a checklist.

The Microsoft sunset will be cited in every AI-vendor-risk conversation for the rest of 2026. The shape of that citation will depend on what the post-cancellation audit produces. Done right, it becomes the playbook other orgs adopt. Done wrong, it becomes the cautionary example that every internal AI-risk report opens with for the next three years.
