
This Shit is Hard: keeping Chainguard OS lean, current, and secure — the power of garbage collection

James Page, Principal Software Engineer, and John Slack, Senior Product Manager

Chainguard’s “This Shit is Hard” series showcases the difficult engineering work we’ve tackled to deliver best-in-class outcomes for customers using our products. We’ve covered several important topics, including the Chainguard Factory, Chainguard Libraries for Java, our integrations with several scanner partners, recent hardening of glibc, our implementation of SLSA Level 3, and the concept of “Zero Trust” in open source software. Today, we’re showcasing how we keep Chainguard OS, the engine that powers all we do at Chainguard, lean, current, and secure.

At Chainguard, we're constantly innovating to provide the most secure open source software for developers. A critical, often unseen, aspect of this mission is "garbage collection" – the systematic removal of unneeded packages from our package repositories. This isn't just about tidiness; it's a fundamental security practice.

Why garbage collection is essential for security

The Chainguard Factory is a powerhouse, churning out thousands of automated package updates every week. Before we had a robust garbage collection process, a long tail of older, unmaintained packages accumulated in the Chainguard OS and Wolfi APK repositories. At its peak, the Wolfi APK repositories contained over 300,000 packages, many of which were built years ago, contained known CVEs, and had been replaced by newer versions of the same packages. While these lingering packages weren’t actively used within Chainguard container images, they significantly increased the potential attack surface for our customers, many of whom use packages to customize images. Garbage collection ensures the package archive used to build and customize Chainguard container images remains lean, current, and, most importantly, secure.

Industrial-scale problem, automated solution

Managing hundreds of thousands of packages is an industrial-scale problem that simply cannot be handled manually. Packages have a complex dependency tree, and a naive approach to package removal could easily break container images or image customization workflows. That's why our dependency analysis is fully automated within the Factory. Our process begins with a time-based filter, targeting anything older than 12 months. These candidates are then rigorously checked against our container and VM images, as well as package build definitions, to ensure that they are not actively used in Chainguard products. This ensures we maintain consistency and reproducibility across the archive at all times.
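To make the two-stage check above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the Factory's actual implementation: the `Package` type, the `gc_candidates` function, the `in_use` set, and the choice of 365 days as a stand-in for "12 months" are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: "12 months" is approximated as 365 days for this sketch.
RETENTION = timedelta(days=365)


@dataclass(frozen=True)
class Package:
    """Hypothetical record for one package build in the archive."""
    name: str
    version: str
    built_at: datetime

    @property
    def key(self) -> str:
        # Hypothetical identifier of the form "name-version".
        return f"{self.name}-{self.version}"


def gc_candidates(packages, in_use, now):
    """Return packages that are past retention AND not referenced anywhere.

    `in_use` is a set of package keys referenced by images or package
    build definitions (how that set is computed is outside this sketch).
    """
    cutoff = now - RETENTION
    return [
        p for p in packages
        if p.built_at < cutoff      # stage 1: time-based filter
        and p.key not in in_use     # stage 2: still-in-use guardrail
    ]
```

The point of the sketch is the ordering: age alone never qualifies a package for removal. An old package that is still referenced stays in the archive until nothing depends on it.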

We also use data on which packages our customers include in their customized images, and we work with those customers to migrate to newer packages. These "guardrails" are crucial: they ensure that garbage collection doesn't inadvertently break anything downstream while keeping the archive secure and sustainable.

We’ve also worked to ensure these processes are reversible; Chainguard can restore any package within 60 days of removal.
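The reversibility window can be pictured as a simple eligibility check on the removal timestamp. This is a sketch only: `RESTORE_WINDOW` and `can_restore` are hypothetical names, and how removed packages are actually stored and restored is not described here.

```python
from datetime import datetime, timedelta, timezone

# From the text: packages can be restored within 60 days of removal.
RESTORE_WINDOW = timedelta(days=60)


def can_restore(removed_at: datetime, now: datetime) -> bool:
    """True if `now` falls within the restore window after removal."""
    elapsed = now - removed_at
    return timedelta(0) <= elapsed <= RESTORE_WINDOW
```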

Striking the right retention period

Chainguard rebuilds container images daily and operates with rolling releases. This dynamic environment requires a thoughtful approach to package history. Historically, we maintained a "forever history," which offered the ability to revert to any previous state but also resulted in an ever-growing collection of unused packages. Traditional distributions often take the opposite approach, replacing old versions completely, which can compromise reproducibility.

With Chainguard OS, we've adopted a balanced strategy: We maintain a 12-month history. This duration is long enough to reproduce any recent build bit-for-bit yet short enough to prevent the accumulation of endless stale packages. This approach instills confidence in a verifiable supply chain for our customers without the security burden of an ever-expanding archive.

For Wolfi, we have also started with a 12-month retention period, but we plan to reduce it to 3 months in the future.

Chainguard OS: your secure, compatible distro

Chainguard OS is the secure foundation that powers everything we do at Chainguard, built on a philosophy of minimalism and continuous renewal. These “garbage collection” principles are not just about keeping things clean; they’re a deliberate security mechanism. By automatically pruning unused or outdated software, Chainguard OS minimizes the potential attack surface and reduces your exposure to unpatched vulnerabilities. This disciplined approach ensures that only the most current, verified, and necessary components remain within the system.

The result is a lean operating system that’s easier to audit, faster to rebuild, and inherently more trustworthy. Because Chainguard maintains a 12-month rolling history, developers can still achieve perfect reproducibility for recent builds without inheriting years of stale, risky baggage. In essence, Chainguard OS applies the logic of “less is more” at industrial scale, transforming what could be a sprawling, complex ecosystem into a tightly managed, continuously verified base that enables secure, reproducible software supply chains.

If you are interested in learning more about Chainguard OS and its associated benefits, check out our “Chainguard Your OS” white paper and see how Chainguard OS can pave the way for security, efficiency, and productivity gains in your organization.
