AUSTA | Adversarial Intelligence

Slopsquatting: How Package Hallucinations Become a Supply-Chain Attack (2026)

Typosquatting waits for a human to fat-finger a package name. Slopsquatting waits for a language model to invent one. The difference is that humans typo at random, and models hallucinate the same wrong names again and again, which turns a confidently incorrect answer into a registrable, predictable attack target.

By Austa · ~9 min read

The output that becomes an exploit

Ask a coding assistant to write a function and it will often suggest the dependencies to install alongside it. Most of the time the package exists. Sometimes it does not. The model produces a name that looks exactly right, follows the ecosystem's naming conventions, and reads as the obvious package for the job, but no such package was ever published. This is a package hallucination: a fact-conflicting output where the model invents a dependency that has no entry on any registry.

On its own, a hallucinated name is a broken suggestion. pip install returns an error, the developer notices, and nothing happens. The danger appears when an attacker gets there first. If they register the hallucinated name on PyPI or npm and attach malicious code, the install command no longer fails. It succeeds, and it runs attacker-controlled code on the developer's machine or, increasingly, inside an autonomous agent's sandbox. Seth Larson, security developer-in-residence at the Python Software Foundation, coined the term slopsquatting for this, blending "AI slop" with "typosquatting." It is documented in the Wikipedia entry on slopsquatting and was covered in depth by CSO Online.

How often models invent packages

The foundational measurement comes from "We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs" by Spracklen, Wijewickrama, Sakib, Maiti, Viswanath, and Jadliwala, presented at USENIX Security 2025. Across 576,000 Python and JavaScript code samples generated by 16 different LLMs from two prompt datasets, roughly 19.7% of the recommended packages were hallucinations. Those hallucinations included more than 205,000 unique non-existent package names.

The rate was not uniform. Open-source models hallucinated far more often, around 21.7% on average, than commercial models at about 5.2%. The worst performers invented a package in more than a third of relevant outputs; the best stayed under 4%. The headline number depends heavily on which model is in front of the developer.

The asymmetry in one line: a typo is a one-off accident, but a hallucination is a reproducible behavior of a deterministic-enough system. The attacker does not have to predict your mistake. They can sample the model and watch it predict its own.

Why this beats typosquatting: the hallucinations repeat

The detail that turns a quality problem into a security problem is repeatability. When the Spracklen team re-ran 500 prompts ten times each, 43% of hallucinated names reappeared in all ten runs, and 58% appeared in more than one run. These are not noise. They are stable attractors in the model's output distribution. An attacker can sample a popular model a handful of times, collect the names it reliably invents, and register them before any legitimate maintainer claims them.
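To make the two repeat figures concrete, here is a minimal sketch of how such shares could be computed from your own reruns. It is not the paper's methodology; the function name `repeat_stats` and the package names in the sample data are invented placeholders.

```python
from collections import Counter


def repeat_stats(runs: list[set[str]]) -> dict[str, float]:
    """Share of hallucinated names seen in every run, and in more than one run."""
    # Each run contributes a name at most once because runs are sets.
    counts = Counter(name for run in runs for name in run)
    total = len(counts)
    if total == 0:
        return {"all_runs": 0.0, "more_than_once": 0.0}
    in_all = sum(1 for c in counts.values() if c == len(runs))
    repeated = sum(1 for c in counts.values() if c > 1)
    return {"all_runs": in_all / total, "more_than_once": repeated / total}


# Hypothetical data: hallucinated names from three reruns of the same prompt.
runs = [
    {"fastjson-utils", "pyauthkit"},
    {"fastjson-utils", "pyauthkit", "quick-yaml"},
    {"fastjson-utils"},
]
stats = repeat_stats(runs)
```

In this toy data one of the three names appears in every run and two appear more than once. The names that land in the first bucket are the ones worth checking against the registry first.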

It is also a cross-model phenomenon, which widens the blast radius. A 2026 re-evaluation, "The Range Shrinks, the Threat Remains", found that newer frontier models hallucinate less in absolute terms, with per-model rates in the rough range of 4.6% to 6.1%, but identified a set of 127 package names that five different evaluated models all invented identically (109 on PyPI, 18 on npm). A name that several independent models converge on is the highest-value target an attacker can register, because it catches users of more than one assistant. The threat shrank in rate but did not disappear, and it gained a shared surface that single-model studies miss.
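The cross-model overlap is a set intersection. The sketch below assumes you have already collected the hallucinated names per model; the model labels and package names are placeholders, not data from the study.

```python
def shared_hallucinations(per_model: dict[str, set[str]]) -> set[str]:
    """Names that every model in the mapping invented."""
    name_sets = list(per_model.values())
    if not name_sets:
        return set()
    return set.intersection(*name_sets)


# Hypothetical per-model results.
per_model = {
    "model_a": {"fastjson-utils", "pyauthkit"},
    "model_b": {"fastjson-utils", "quick-yaml"},
    "model_c": {"fastjson-utils", "pyauthkit", "quick-yaml"},
}
shared = shared_hallucinations(per_model)
```

Only the name all three hypothetical models invent survives the intersection, which is the kind of name the re-evaluation counts among its 127.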

It has already happened in the wild

This is not a purely theoretical worst case. The most-cited proof of concept is huggingface-cli. The real Hugging Face command-line tool installs as part of huggingface_hub[cli], but models kept suggesting the shorter, non-existent huggingface-cli. A researcher published an empty placeholder package under that hallucinated name to observe what would happen. It accumulated tens of thousands of installs, and at least one large organization had pasted the hallucinated install command directly into a public repository's documentation. Had the placeholder been malicious, every one of those installs would have run attacker code. The Trend Micro write-up walks through this and several related cases.

The agentic version is worse, because no human is in the loop to catch the bad name. When a coding agent with shell access decides on its own to install a dependency and run it, a hallucinated name flows straight from generation to execution. That is the same excessive-agency failure mode we describe in the threat model for coding agents with shell access: the agent treats its own output as ground truth and acts on it with real system privileges.

Where slopsquatting sits among the data-poisoning attacks

Slopsquatting is part of a broader pattern where the integrity of the data an AI system depends on, rather than the model or the prompt, is the attack surface. It is worth placing it precisely against its neighbors, because the defenses differ.

In the OWASP framing (the 2025 Top 10 for LLM Applications), a hallucinated package name is a Misinformation output (LLM09), an installed compromised dependency is a supply-chain risk (LLM03), and an agent that installs it unprompted is exercising excessive agency (LLM06). The full mapping lives in our walkthrough of the OWASP Top 10 for AI agents.

How to pentest a coding agent for slopsquatting

The point of a test is not to confirm that models hallucinate; that is established. The point is to find out whether a hallucinated name in your workflow can reach an install command and then execution. Measure each stage separately so you know which gate is missing.

1. Provoke and catalogue the hallucinations

Drive your assistant or agent with realistic build prompts, especially ones that ask for niche or trendy functionality where the model is most likely to improvise a dependency. Run each prompt several times. Collect every package name it emits and diff that list against the real registries. The names that come back missing, and especially the ones that recur across runs, are your candidate targets, exactly the set an attacker would harvest.
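The cataloguing step can be sketched as follows, for pip only. The `exists` callback is an assumption: in practice it might query your registry mirror or the PyPI JSON endpoint at https://pypi.org/pypi/<name>/json, which returns 404 for unknown projects. Here it is injected so the logic can be shown without network access. All helper names are hypothetical.

```python
import re
from collections import Counter
from typing import Callable, Iterable

INSTALL_LINE = re.compile(
    r"^\s*[$!%]?\s*(?:pip3?|python3?\s+-m\s+pip)\s+install\s+(.+)$", re.MULTILINE
)
NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def normalize(name: str) -> str:
    """PEP 503 normalization: runs of '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def extract_pip_names(text: str) -> set[str]:
    """Package names from lines that start with a pip install command."""
    names = set()
    for match in INSTALL_LINE.finditer(text):
        for token in match.group(1).split():
            if token.startswith("#"):
                break  # rest of the line is a shell comment
            if token.startswith("-"):
                continue  # simplification: values of flags like -r are not skipped
            base = re.split(r"[\[=<>!~;]", token.strip("\"'"), maxsplit=1)[0]
            if NAME.fullmatch(base):
                names.add(normalize(base))
    return names


def missing_names(runs: Iterable[str], exists: Callable[[str], bool]) -> Counter:
    """For each name absent from the registry, the number of runs that emitted it."""
    seen = Counter()
    for output in runs:
        seen.update(extract_pip_names(output))
    return Counter({name: n for name, n in seen.items() if not exists(name)})


# Hypothetical model outputs and a stand-in registry lookup.
outputs = [
    "pip install requests fastjson-utils",
    '$ pip install -U "huggingface_hub[cli]" fastjson_utils',
    "No new dependencies are needed.",
]
known = {"requests", "huggingface-hub"}
recurring = missing_names(outputs, lambda name: name in known)
```

Normalizing before comparing matters: fastjson-utils and fastjson_utils are the same project as far as the index is concerned, so the two runs above count as one recurring candidate rather than two one-offs.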

2. Test the path from name to install

Take a hallucinated name and ask whether your pipeline would actually install it. Does the agent run pip install or npm install on its own output without checking the name exists or is trusted? Does a human ever see the dependency list before it is added? Register a harmless internal canary package under a hallucinated name in a private or test registry, never a real public registry, and see whether the agent pulls and executes it. If it does, you have confirmed the full chain end to end.
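One way to know whether the canary actually executed, rather than merely downloaded, is to have it leave a marker. The sketch below is an assumption about how such a canary could report: its import-time code would call `record_canary_hit`, and the test harness would call `canary_hits` afterward. The function names, the marker file layout, and the package name are all hypothetical.

```python
import json
import tempfile
import time
from pathlib import Path


def record_canary_hit(marker_dir: Path, package: str) -> Path:
    """Append one line of evidence that the canary package's code ran."""
    marker_dir.mkdir(parents=True, exist_ok=True)
    marker = marker_dir / f"{package}.hits.jsonl"
    with marker.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"package": package, "ts": time.time()}) + "\n")
    return marker


def canary_hits(marker_dir: Path, package: str) -> int:
    """Number of recorded executions for the given canary."""
    marker = marker_dir / f"{package}.hits.jsonl"
    if not marker.exists():
        return 0
    lines = marker.read_text(encoding="utf-8").splitlines()
    return sum(1 for line in lines if line.strip())


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    record_canary_hit(Path(tmp), "acme-canary-pkg")
    record_canary_hit(Path(tmp), "acme-canary-pkg")
    hits = canary_hits(Path(tmp), "acme-canary-pkg")
    misses = canary_hits(Path(tmp), "never-installed")
```

A non-zero hit count after an agent run means the chain from generated name to executed code is open in that environment.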

3. Probe the resolution and confusion edges

Check what happens when a hallucinated name collides with your internal naming. If the model invents a name that looks like one of your private packages and that name is unclaimed on the public registry, an attacker can claim it there, and a resolver that consults the public index alongside your private one may pull the attacker's package instead of failing. This is where slopsquatting meets dependency confusion, and it is worth testing explicitly rather than assuming your registry configuration closes it.
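A rough way to shortlist the names worth testing here is to compare hallucinated names against your internal package list and keep the look-alikes that nobody holds on the public index. The sketch below uses the standard library's difflib for similarity; the 0.8 cutoff, the helper names, the sample packages, and the `exists_publicly` callback are all assumptions.

```python
import difflib
import re
from typing import Callable, Iterable


def normalize(name: str) -> str:
    """PEP 503 normalization so 'Acme_Utils' and 'acme-utils' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def confusion_candidates(
    hallucinated: Iterable[str],
    internal: Iterable[str],
    exists_publicly: Callable[[str], bool],
    cutoff: float = 0.8,
) -> dict[str, str]:
    """Map each risky hallucinated name to the internal package it resembles."""
    internal_norm = sorted({normalize(n) for n in internal})
    flagged = {}
    for name in sorted({normalize(h) for h in hallucinated}):
        close = difflib.get_close_matches(name, internal_norm, n=1, cutoff=cutoff)
        # Risky only if it resembles an internal name AND is still claimable.
        if close and not exists_publicly(name):
            flagged[name] = close[0]
    return flagged


# Hypothetical inputs.
flagged = confusion_candidates(
    hallucinated=["Acme_Utils", "acme-billing-sdk", "requests"],
    internal=["acme-utils", "acme-billing"],
    exists_publicly=lambda name: name == "requests",
)
```

Each flagged name is one to try against your real resolver configuration, using a private or test registry as described in step 2.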

4. Verify your guardrails actually fire

If you have an allow-list, an age-and-download threshold, or a lockfile gate, attempt to install a freshly created low-reputation package through the agent and confirm the control blocks it. A guardrail that has never been exercised against a real attempt is an assumption, not a defense. For the broader audit structure this fits into, see our 2026 LLM security checklist.

What actually reduces the risk

No single control fixes this, because the model will keep producing plausible wrong names. Defense in depth is the right framing, layered from the install gate outward.

Never install straight from model output

The cheapest attack to kill is the one where generated text becomes an install command with nothing in between. Insert a verification step that checks every dependency against an allow-list or a vetted lockfile before installation. For autonomous agents, adding a new package should require a policy check or human approval, not happen as a silent side effect of code generation.
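A verification step of this kind can be sketched as a check that runs on every shell command an agent proposes. This is an illustration for pip only, with hypothetical function names; a production gate would more likely live in a registry proxy or the package manager's own configuration than in string parsing.

```python
import re
import shlex


def normalize(name: str) -> str:
    """PEP 503 normalization of a package name."""
    return re.sub(r"[-_.]+", "-", name).lower()


def install_targets(command: str) -> list[str]:
    """Arguments following any 'pip install' found in the command."""
    argv = shlex.split(command)
    targets = []
    for i, token in enumerate(argv[:-1]):
        if token in {"pip", "pip3"} and argv[i + 1] == "install":
            for arg in argv[i + 2:]:
                if arg in {"&&", "||", "|", ";"}:
                    break  # next command in a compound shell line
                if not arg.startswith("-"):
                    targets.append(arg)
    return targets


def blocked_packages(command: str, allowlist: set[str]) -> list[str]:
    """Install targets that are not on the allow-list (empty list means allowed)."""
    allowed = {normalize(n) for n in allowlist}
    blocked = []
    for target in install_targets(command):
        base = re.split(r"[\[=<>!~;@\s]", target, maxsplit=1)[0]
        # URLs, paths and flag values are not on the list, so they fail closed.
        if normalize(base) not in allowed:
            blocked.append(target)
    return blocked


allow = {"requests", "huggingface_hub"}
ok = blocked_packages('pip install -U "huggingface_hub[cli]" requests', allow)
bad = blocked_packages("cd app && python -m pip install huggingface-cli", allow)
```

The design choice is to fail closed: anything the parser does not recognize as an allow-listed name, including requirement files and direct URLs, is reported as blocked and routed to a policy check or a human.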

Pin, hash, and lock

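Pinned versions with hash-locked dependencies mean an install resolves only to the exact artifact you already vetted. A hallucinated name has no entry in your lockfile, so as long as every install goes through the lockfile (for example, pip's hash-checking mode via --require-hashes), the install fails closed instead of reaching out to whatever an attacker just registered. An ad hoc pip install issued outside the lockfile bypasses this, which is why the install gate above still matters. This is standard supply-chain hygiene that happens to neutralize the slopsquatting chain.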
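The fail-closed behavior can be illustrated with a few lines of hashing logic. Real tooling such as pip's hash-checking mode does this for you; the sketch below only shows the decision, and the lock structure, function names, and file names are invented for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(name: str, artifact: Path, locked: dict[str, set[str]]) -> bool:
    """True only if the name is locked and the file matches a pinned hash."""
    expected = locked.get(name)
    if not expected:
        return False  # unknown, possibly hallucinated name: fail closed
    return sha256_of(artifact) in expected


# Demonstration with a stand-in artifact in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    wheel = Path(tmp) / "vetted_pkg-1.0-py3-none-any.whl"
    wheel.write_bytes(b"stand-in artifact bytes")
    lock = {"vetted-pkg": {sha256_of(wheel)}}
    ok_locked = verify_artifact("vetted-pkg", wheel, lock)
    ok_unknown = verify_artifact("huggingface-cli", wheel, lock)
    wheel.write_bytes(b"tampered")
    ok_tampered = verify_artifact("vetted-pkg", wheel, lock)
```

Both failure cases matter: a name with no lock entry is rejected outright, and a known name whose bytes changed is rejected as well.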

Gate on package reputation and age

Slopsquatted packages are, by necessity, new and low-trust at the moment of attack: the attacker had to register them after observing the hallucination. Blocking or quarantining packages below an age and download threshold buys time for detection and removal before your build pulls a brand-new phantom. Several scanners now flag exactly this profile.
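An age-and-download threshold reduces to a small policy function. In the sketch below the 30-day and 1,000-download defaults are arbitrary placeholders, and `PackageFacts` is a hypothetical container for whatever metadata your registry or scanner exposes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class PackageFacts:
    name: str
    first_published: datetime  # timezone-aware
    downloads_last_month: int


def quarantine_reason(
    pkg: PackageFacts,
    now: datetime,
    min_age: timedelta = timedelta(days=30),
    min_downloads: int = 1000,
) -> Optional[str]:
    """Reason to hold the package back, or None if it clears both thresholds."""
    age = now - pkg.first_published
    if age < min_age:
        return f"{pkg.name}: first published {age.days} days ago"
    if pkg.downloads_last_month < min_downloads:
        return f"{pkg.name}: only {pkg.downloads_last_month} downloads last month"
    return None


# Hypothetical packages evaluated at a fixed point in time.
now = datetime(2026, 1, 15, tzinfo=timezone.utc)
fresh = PackageFacts("brand-new-phantom", datetime(2026, 1, 12, tzinfo=timezone.utc), 40)
quiet = PackageFacts("old-but-unused", datetime(2020, 3, 1, tzinfo=timezone.utc), 12)
established = PackageFacts("long-lived-lib", datetime(2019, 5, 1, tzinfo=timezone.utc), 2_000_000)
reasons = [quarantine_reason(p, now) for p in (fresh, quiet, established)]
```

Returning a reason instead of a bare boolean keeps the decision auditable when a build is held back.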

Treat package names as untrusted model output

The root cause is the same instruction-versus-data confusion that runs through every LLM integrity attack: a name the model emits is a prediction, not a verified fact, and the system downstream must treat it that way. A model that hallucinates a package is doing the same thing as a model that hallucinates a citation or an API endpoint. The fix is to verify against an authoritative source before any consequential action, and to keep a human or a hard policy in the loop wherever the action installs and runs code.

The takeaway for a 2026 audit

If your pentest stops at the prompt and the model's chat responses, it misses the place where a confidently wrong output turns into running code. The chain is short and well documented: the model invents a package name, it invents the same name repeatedly and across vendors, an attacker registers it, and an install command, increasingly issued by an agent with no human watching, executes attacker code. The hallucination rate is lower on 2026 frontier models than it was, but 127 names that five models all invent identically is not a rounding error. It is a pre-built target list.

Add slopsquatting to the test plan alongside the prompt-layer and corpus work. Provoke the hallucinations, follow one all the way to an install attempt, and confirm that a verification gate stands between your agent's output and your package manager. For the surrounding methodology, see the LLM security checklist for 2026 and the OWASP Top 10 for AI agents mapping, both of which treat the data and dependencies feeding your model as part of the attack surface.
