May 21, 2025

Guarding the Python Ecosystem Against the Growing Number of Severe Malware Attacks

Patrick Smyth, Staff Developer Relations Engineer

When Python developers look to solve complex software problems for their organizations, they turn first to pre-built software libraries available in public repositories like the Python Package Index (PyPI). This public repository has supported scaling and innovation with the help of open source libraries. However, as enterprise reliance on open source libraries grows, so too does the frequency and severity of malware attacks. That’s because when developers integrate software from open source language ecosystems with a simple `pip install` into a business’s production codebase, they potentially open a vector for bad actors to exploit.

To combat the growing threat of malware in the Python ecosystem, Chainguard has developed Chainguard Libraries for Python, a more secure package index in which every library and its full dependency tree is built entirely from source in our hardened infrastructure. This approach allows Chainguard to secure the build and distribution links of the open source software supply chain and provide enterprises with a more secure, trusted source for language dependencies. Through our approach, Chainguard Libraries delivers trusted, malware-resistant libraries for the modern enterprise.

Public Repositories Depend on User Responsibility

The open nature of PyPI represents a valuable core component of open source software infrastructure. PyPI’s low barriers to entry enable developers and maintainers to readily contribute and consume packages, supporting innovation to the benefit of the broader ecosystem. Still, with few hurdles to jump, bad actors may exploit this important infrastructure for nefarious ends. With openness comes innovation, but also risks — the onus of validating the security and integrity of each individual package is on the user.
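As a rough illustration of what per-package validation can look like, the sketch below compares a downloaded artifact's SHA-256 digest against a digest pinned ahead of time, similar in spirit to pip's hash-checking mode (`--require-hashes`). The function names are hypothetical and chosen only for this example. A matching digest shows only that the file is the one you pinned; it says nothing about whether the build that produced it was trustworthy.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: Path, pinned_hex: str) -> bool:
    """Hypothetical helper: True if the file's digest equals the pinned one."""
    # Normalize the pinned value so case and stray whitespace do not matter.
    expected = pinned_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

Repeating this for every package and every transitive dependency, and keeping the pinned digests current as versions change, is the maintenance burden that falls on each consumer.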

PyPI staff and community contributors have improved security on the repository through tools such as Trusted Publishing and the PyPI publish GitHub Action to reduce the risk of compromised API tokens and GitHub workflows. Other approaches, such as automated scanning for malware, have not been adopted on PyPI due to an unacceptably high false positive rate. Ultimately, PyPI is unable to bear responsibility for builds of packages hosted on the repository.

Manually reviewing the provenance and integrity for package builds and distribution is impractical given the sheer quantity of packages, and is an ongoing maintenance burden as versions and build systems change. That means enterprises building and delivering Python applications are still individually responsible for securing the full open source supply chain and protecting against the various points of failure shown below.

Ultralytics Supply Chain Attack

The challenges with packages across the supply chain came to a head in December 2024, when the project behind Ultralytics YOLO, a popular computer vision model, published two compromised versions of its library that ran cryptomining malware. Although the project used Trusted Publishing, these releases were pushed to PyPI via a hijacked GitHub Actions workflow and a compromised API key.

While PyPI’s infrastructure was not compromised, it is unable to make security guarantees. Following the attack, PyPI shared the following in its analysis:

“Not every package and release on PyPI should be treated as trusted, it is up to you the user to review your usage of software from PyPI before choosing to install packages. … PyPI staff and volunteers do their best to remove malware, but because the service is open to anyone looking to publish software there is an unfortunately high amount of abuse.”

The attack on Ultralytics marked a major shift for those building on Python because it was a widely publicized compromise of a CI workflow by a malicious outside actor. The impact was real: the Ultralytics project sees 60M+ annual downloads and is present in 10% of enterprise cloud environments, which makes it an attractive target for bad actors looking to hijack enterprises and their high-value data. Similar attacks are likely under way.

Enter Chainguard Libraries

To combat the growing frequency and severity of malware attacks in the Python ecosystem, Chainguard has built an index of malware-resistant Python wheels entirely from source in our SLSA-hardened infrastructure. By building the most popular libraries from source along with their full dependency trees, Chainguard secures every stage of the software supply chain, from source ingestion and build to tests and distribution, while providing verifiable provenance for our artifacts. Chainguard substantially reduces the risk from threat vectors like hijacked build processes, tainted release pipelines, compromised distribution points, and malicious actors publishing packages without source code.

Chainguard has built over 13,000 Python libraries to date and is continuously adding to this catalog. Our approach preserves the benefits of open source libraries while providing a more secure distribution channel for them.

To verify our malware-mitigation thesis, Chainguard analyzed 3k malicious Python packages sourced from the Backstabber’s Knife Collection, which has grown considerably since its initial release and is now considered one of the most well-vetted data sets of known malicious open source libraries. Our early results showed that 98% of these malicious libraries would have been avoided by enterprises relying on Chainguard Libraries as their sole source for Python dependencies.

Chainguard Libraries integrates with your organization’s existing repository manager, including JFrog Artifactory, Cloudsmith, and Sonatype Nexus, so that you can meet your developers where and how they work. Pointing your artifact manager at Chainguard Libraries as its source for Python packages can ensure that your developers are pulling secure, malware-resistant artifacts without disrupting their workflows.
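On the developer side, this only holds if pip resolves packages from a single, controlled index. The sketch below audits a pip configuration file for that property. It relies on pip's `index-url` and `extra-index-url` settings; the function name is hypothetical, and the index URL is a placeholder that each organization would supply, so nothing here reflects Chainguard's actual endpoints or tooling.

```python
import configparser

# pip settings that decide where packages are fetched from.
INDEX_KEYS = ("index-url", "extra-index-url")


def audit_pip_config(config_text: str, allowed_index: str) -> list[str]:
    """Return a list of problems found in the text of a pip config file."""
    # Disable interpolation so '%' characters in URLs are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    problems = []
    seen_index = False
    for section in parser.sections():
        for key in INDEX_KEYS:
            if not parser.has_option(section, key):
                continue
            # extra-index-url may hold several whitespace-separated URLs.
            for url in parser.get(section, key).split():
                if key == "index-url":
                    seen_index = True
                if url.rstrip("/") != allowed_index.rstrip("/"):
                    problems.append(f"[{section}] {key} points at {url}")
    if not seen_index:
        problems.append("no index-url set; pip falls back to its default index")
    return problems
```

For example, a configuration that sets `index-url` to the internal repository manager but also lists the public index under `extra-index-url` would be flagged, because packages could then still be resolved from outside the controlled source.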

Final Takeaways

A simple `pip install` from a developer can bring immense value to your organization, but integrating unvetted packages into your codebase introduces serious risks. While open platform repositories provide a public good and have taken steps toward a stronger security posture, they don’t rebuild packages in a secure environment, nor do they actively scan for malware, vulnerabilities, or other signs of supply chain attacks.

Chainguard Libraries provides a trusted source for packages, rebuilding every library and its full dependency tree from source in our hardened infrastructure. Adopting Chainguard Libraries for Python development and deployment represents an effective approach to mitigating malware without introducing friction for developers.

If you are interested in trying Chainguard Libraries, reach out today. And if you want to learn more about how the product works in practice, check out the recording of our recent Learning Lab and our comprehensive docs.
