Introducing Chainguard Labs: An update on an open, living software supply chain compromises dataset and new SBOM research efforts

John Speed Meyers and Zachary Newman, Principal Research Scientists
December 21, 2022

Our mission is to make the software supply chain secure by default. Making meaningful progress toward that goal requires understanding our vast software security ecosystem: its weaknesses, threats, opportunities and triumphs. Today we are announcing Chainguard Labs, a dedicated team of researchers, academic partners and experts from the open source software community who will analyze software supply chain security, OSS security trends, and software security best practices to understand their collective impacts and benefits on the global software supply chain. The team will be led by John Speed Meyers, principal security scientist at Chainguard, and Zachary Newman, academic researcher and software engineer at Chainguard, together with many others from the company and across the industry with expertise in Software Bill of Materials (SBOM) projects, OSS security and software development practices.

New research uncovering our software dark matter universe and the effectiveness of OSS SBOMs 

Today, Chainguard Labs released new research on SBOM quality. For this preliminary assessment, the team applied two SBOM quality tools to the bom-shelter dataset of 50+ SBOMs drawn from open source software projects. The dataset is small because in-the-wild SBOMs aren’t yet common, and we hope that others will point out or donate open source project SBOMs.

The main results include:

  • Some open source project SBOMs are low quality. For instance, when applying the SBOM Scorecard tool, nearly four-fifths of the SBOMs lacked package license information and two-fifths lacked any package version information.
  • None of the SBOMs conformed to the standards of the National Telecommunications and Information Administration's (NTIA) “minimum elements” framework. The SPDX community’s NTIA Conformance Checker tool, when applied to this SBOM dataset, revealed that the minimum elements appear to be a high bar.
  • Some open source projects do have high quality SBOMs. Several SBOMs investigated contained a wide variety of helpful information, especially package IDs (via PURL or CPEs), package versions, and licenses.
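To make the first result concrete, here is a minimal sketch of the kind of check involved: measuring what share of packages in an SBOM carry version and license information. It assumes an SPDX-style JSON document with a `packages` list and the `versionInfo`, `licenseConcluded` and `licenseDeclared` fields. The `field_coverage` helper and the sample data are hypothetical and for illustration only. This is not how SBOM Scorecard or the NTIA Conformance Checker are implemented.

```python
from typing import Any

# SPDX uses these placeholder values to signal "no information available".
EMPTY_VALUES = {"", "NOASSERTION", "NONE"}


def has_value(package: dict[str, Any], *fields: str) -> bool:
    """Return True if any of the given fields holds a meaningful string."""
    for field in fields:
        value = package.get(field)
        if isinstance(value, str) and value.strip() not in EMPTY_VALUES:
            return True
    return False


def field_coverage(sbom: dict[str, Any]) -> dict[str, float]:
    """Share of packages (0.0 to 1.0) with version and license information.

    Hypothetical helper: assumes an SPDX-style JSON layout.
    """
    packages = sbom.get("packages", [])
    total = len(packages)
    if total == 0:
        return {"version": 0.0, "license": 0.0}
    with_version = sum(has_value(p, "versionInfo") for p in packages)
    with_license = sum(
        has_value(p, "licenseConcluded", "licenseDeclared") for p in packages
    )
    return {"version": with_version / total, "license": with_license / total}


# Made-up sample document, not taken from the bom-shelter dataset.
sample_sbom = {
    "packages": [
        {"name": "a", "versionInfo": "1.0", "licenseConcluded": "MIT"},
        {"name": "b", "versionInfo": "2.1", "licenseConcluded": "NOASSERTION"},
        {"name": "c"},
        {"name": "d", "versionInfo": "", "licenseDeclared": "Apache-2.0"},
    ]
}

print(field_coverage(sample_sbom))  # {'version': 0.5, 'license': 0.5}
```

A real quality tool would also need to handle other SBOM formats and check additional fields such as package identifiers.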

Read more in the full blog post here

This new SBOM quality research comes after the team looked at another novel concept called software dark matter. Much like regular dark matter, software dark matter comprises packages that exist but are effectively unseen: software that is untracked by typical tools like a package manager or an SBOM. To quantify it, the team analyzed the files within 350 popular open source software containers using darkfiles, a tool that we wrote and open-sourced for measuring software dark matter. According to Chainguard Labs’ estimates, software dark matter constitutes 32 percent of the files in the analyzed containers. The goal of this research is ultimately to eliminate software dark matter and build software transparency that developers, consumers and organizations can rely on.
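The underlying measurement can be sketched in a few lines: given every file path in a container and the paths that a package manager claims, the dark matter share is the fraction of files nobody claims. The `dark_matter_share` function and the sample paths below are hypothetical. This is a simplified illustration of the idea, not a description of how darkfiles works.

```python
from collections.abc import Iterable


def dark_matter_share(all_files: Iterable[str], tracked_files: Iterable[str]) -> float:
    """Fraction of files (0.0 to 1.0) not claimed by any package manager.

    Hypothetical helper: both inputs are plain collections of file paths.
    """
    all_set = set(all_files)
    if not all_set:
        return 0.0
    untracked = all_set - set(tracked_files)
    return len(untracked) / len(all_set)


# Made-up example: two files are owned by packages, two were copied in by hand.
container_files = [
    "/bin/sh",
    "/usr/lib/libssl.so",
    "/opt/app/server",
    "/opt/app/vendor/lib.so",
]
package_manager_files = ["/bin/sh", "/usr/lib/libssl.so"]

print(dark_matter_share(container_files, package_manager_files))  # 0.5
```

In practice, the hard part is building the two lists: extracting the container's filesystem and querying each package database for the files it owns.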

A software supply chain security compromises dataset and breakdown of different attack types

The Log4j vulnerability and the SolarWinds hack were evidence that the global software supply chain is in the midst of a security crisis, but beyond that these two cases shared few similarities. This insight, in fact, applies to many so-called software supply chain compromises. The convoluted nature of the modern software supply chain has opened Pandora’s box: a wide variety of compromises can be labeled as software supply chain compromises, and both the types and the sheer number of these attacks appear to be growing. To methodically capture the onslaught of software supply chain compromises, the Chainguard Labs team maintains and contributes to open source datasets that catalog these attacks.

The team co-created and now maintains one dataset of malicious compromises of the software supply chain, coincidentally published at the time of the SolarWinds hack. This effort revealed nine major categories of attacks on the software supply chain and documented, depending on the methodology, hundreds or thousands of known compromises.

Additionally, the team is involved in a nascent cross-company effort to collect compromises of open source software specifically. Under the umbrella of the Open Source Security Foundation, this dataset will help those interested in understanding and combating insecurities in the open source software supply chain.

Throughout 2023, Chainguard Labs will publish original research reports and analysis in partnership with the academic and open source software communities on a variety of software security and OSS topics, including emerging threats, adoption trends of software security best practices and more. To follow along with the team’s research and recommendations, or to share a related topic you’d like the team to dig into, check out our research blog page or website, or sign up for Chainmail, our monthly newsletter, to get the latest delivered to your inbox.

