Mitigating Malware in the npm Ecosystem with Chainguard Libraries

Derek Garcia, Research Assistant, Charlie Robbins, Principal Software Engineer, and Manfred Moser, Senior Principal Developer Relations Engineer

TL;DR


  • Tested against datasets of known malicious npm packages, the build-from-source process behind Chainguard Libraries blocked about 99.7 percent of them from being published.

  • About 95 percent of the known malicious packages do not have a valid attributed source to build from.

  • Roughly 5 percent of the known malicious packages have valid sources that do not match what is contained in the package published in the npm registry.


Overview


The npm registry is the largest open-source package repository, hosting almost three and a half million packages, twice the number of packages in Maven Central and the Python Package Index (PyPI) combined. Its size presents a massive attack surface, making it an enticing target for bad actors eager to exploit the software supply chain. High-profile attacks like event-stream and chalk repeatedly demonstrate that the risk is not going away. Our research finds that, as of June 2025, 13 percent of packages in the npm registry had a “security holding”: a special dummy package that acts as a placeholder for suspected malicious packages. Furthermore, Sonatype identified over half a million malicious npm packages in 2024 alone, making up 98.5 percent of all malicious packages identified that year.


Chainguard Libraries for JavaScript was introduced recently to reduce malware attacks and secure the npm supply chain. We analyzed 8,783 unique malicious npm packages to determine what percentage of them our future customers can expect to avoid. The results below indicate that about 99.7 percent of these malicious packages are blocked for users who rely on Chainguard Libraries as their sole source for npm dependencies.


The majority of malicious packages are blocked by Chainguard Libraries for JavaScript because they lack an attributable source: a URL to source code, such as a GitHub repository. npm packages have the option to list a repository in their package.json file, but this isn’t required, and bad actors understandably don’t like to leave tracks. Additionally, because a source is required, any compromise of the package within the npm ecosystem, such as the recent thefts of maintainer credentials, would be circumvented, since building from source bypasses the ecosystem entirely.
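To make the idea of an attributable source concrete, here is a minimal sketch of reading the optional `repository` field from a parsed package.json. The `Manifest` interface, the `attributedSource` function, and the example package names and URL are all hypothetical and only illustrate the check; this is not Chainguard's tooling, and it does not test whether the URL actually resolves.

```typescript
// Minimal shape of the package.json fields this sketch reads (assumption:
// only `repository` matters here; real manifests carry many more fields).
interface Manifest {
  name?: string;
  repository?: string | { type?: string; url?: string };
}

// Returns the declared source reference, or null when none is attributed.
function attributedSource(manifest: Manifest): string | null {
  const repo = manifest.repository;
  // `repository` may be a shorthand string or an object with a `url`.
  const raw = typeof repo === "string" ? repo : repo?.url;
  if (typeof raw !== "string") return null;
  const trimmed = raw.trim();
  if (trimmed.length === 0) return null;
  // Git URLs are often written as "git+https://..."; strip that prefix.
  return trimmed.replace(/^git\+/, "");
}

// Hypothetical manifests for illustration.
const withSource = attributedSource({
  name: "example-a",
  repository: { type: "git", url: "git+https://github.com/example/example-a.git" },
});
const withoutSource = attributedSource({ name: "example-b" });
```

A package like the second one, with no `repository` entry, gives a build-from-source pipeline nothing to build from.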


Methodology & Results


We sourced malicious npm packages from the following reputable datasets of malicious open source software packages: Backstabber’s Knife Collection, Datadog’s malicious package dataset, and MalOSS. The Backstabber’s Knife Collection dataset and MalOSS are created and maintained by academic researchers. Datadog’s malicious package dataset includes malicious packages detected by their GuardDog tool. Removing duplicate packages resulted in a sample of 8,783 packages for this experiment.


Figure 1: Decision tree for npm malware detection

With our sample, we attempted to build these packages from source using the tooling from Chainguard Libraries for JavaScript. We found that the majority of packages, about 91.3 percent, are not backed by trusted source coordinates. Since Chainguard Libraries requires source code (referenced by a URL) to build each npm package, these malicious packages could not be built. A further 3.2 percent or so of packages did attribute a source, but it failed to resolve. In total, about 94.5 percent of malicious packages lack a valid reference to source code, so building from source cannot even be attempted.


An attributed source reference that resolves to actual source code is a good start; however, malicious packages are rarely truthful about their source. Publishers package their source code into an artifact before publishing to npm instead of publishing the source code directly. This small step can introduce discrepancies between the source code in the repository and what is ultimately published as a package to the npm registry, creating an attack surface for bad actors. Attackers can either claim that a reputable project is the source of their malicious package or go one step further and clone a project verbatim, then modify the code to include their malicious payload.
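One way to picture such a discrepancy is a file-by-file comparison between the artifact published to the registry and an artifact rebuilt from the attributed source. The sketch below assumes both have already been reduced to a map from file path to content digest; the `FileDigests` type, the `compareArtifacts` function, and the placeholder digest strings are hypothetical. It illustrates the concept only and is not a description of how the study or Chainguard's tooling detects mismatches.

```typescript
// Hypothetical input: file path -> content digest (for example a SHA-256 hex string).
type FileDigests = Record<string, string>;

interface Mismatch {
  onlyInPublished: string[]; // shipped to the registry, absent from the source build
  onlyInRebuilt: string[];   // produced by the source build, absent from the registry artifact
  changed: string[];         // present in both, but with different contents
}

function compareArtifacts(published: FileDigests, rebuilt: FileDigests): Mismatch {
  const result: Mismatch = { onlyInPublished: [], onlyInRebuilt: [], changed: [] };
  const has = (o: FileDigests, k: string) => Object.prototype.hasOwnProperty.call(o, k);
  for (const path of Object.keys(published)) {
    if (!has(rebuilt, path)) result.onlyInPublished.push(path);
    else if (rebuilt[path] !== published[path]) result.changed.push(path);
  }
  for (const path of Object.keys(rebuilt)) {
    if (!has(published, path)) result.onlyInRebuilt.push(path);
  }
  return result;
}

// True only when the two artifacts contain the same files with the same contents.
function artifactsMatch(m: Mismatch): boolean {
  return m.onlyInPublished.length + m.onlyInRebuilt.length + m.changed.length === 0;
}

// Placeholder digests: the published artifact carries an extra install script
// and a modified utility file that the attributed source does not produce.
const publishedFiles: FileDigests = {
  "index.js": "digest-1",
  "lib/util.js": "digest-2",
  "install.js": "digest-9",
};
const rebuiltFiles: FileDigests = { "index.js": "digest-1", "lib/util.js": "digest-7" };
const report = compareArtifacts(publishedFiles, rebuiltFiles);
```

In this made-up example the report flags `install.js` as present only in the published artifact and `lib/util.js` as changed, which is the kind of gap an attacker relies on when pointing at a reputable repository.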


In both cases, the artifact that is published to the npm registry does not match the attributed source, which occurred in about 5.2 percent of the packages. Chainguard Libraries are built from source, which removes this attack surface entirely and, as a result, prevents these malicious imitation packages. Combined with the packages with invalid sources, 99.7 percent of malicious packages are not published. The remaining malicious packages have an attributed source that resolves and an artifact that matches that source. Chainguard Libraries does not yet prevent these very rare (less than one percent) malicious packages from being published. However, truly malicious packages are a minority within this category, which also includes proof-of-concept packages whose source is disclosed as malicious in their documentation. Our source code scanning prototypes show promise for blocking these rare packages soon.
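The three checks described in this section, and the way the reported percentages add up, can be summarized in a short sketch. The `PackageFacts` shape, the `classify` function, and the outcome labels are hypothetical names chosen for illustration; only the percentages come from the results above. Shares are kept in tenths of a percent so the arithmetic stays exact.

```typescript
// Hypothetical outcome labels for the decision tree described in this section.
type Outcome =
  | "blocked: no attributed source"
  | "blocked: source does not resolve"
  | "blocked: artifact does not match source"
  | "not blocked";

// Hypothetical facts about one package, assumed to be gathered elsewhere.
interface PackageFacts {
  hasAttributedSource: boolean;
  sourceResolves: boolean;
  artifactMatchesSource: boolean;
}

// Applies the checks in order; the first failing check decides the outcome.
function classify(p: PackageFacts): Outcome {
  if (!p.hasAttributedSource) return "blocked: no attributed source";
  if (!p.sourceResolves) return "blocked: source does not resolve";
  if (!p.artifactMatchesSource) return "blocked: artifact does not match source";
  return "not blocked";
}

// Shares reported in this post, in tenths of a percent (913 means 91.3 percent).
const shareTenths = { noSource: 913, unresolved: 32, mismatch: 52 };

const invalidSourceTenths = shareTenths.noSource + shareTenths.unresolved; // 945, i.e. 94.5 percent
const blockedTenths = invalidSourceTenths + shareTenths.mismatch;          // 997, i.e. 99.7 percent
const remainingTenths = 1000 - blockedTenths;                              // 3, i.e. 0.3 percent
```

The remaining 0.3 percent corresponds to the packages whose attributed source resolves and matches the published artifact.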


Summary



Overall, we found that nearly all malicious npm packages published to the npm registry are prevented from impacting users of Chainguard Libraries for JavaScript. Over 99 percent of the known malicious packages in our sample are blocked, which means that an even higher percentage of all packages is secured.


Building the packages from source in a secure environment proves to be an effective means to secure the supply chain of npm packages.


Chainguard Libraries for JavaScript is currently available in closed beta. Sign up today.
