Reflections on trusting VEX (or when humans can improve SBOMs)

Adolfo García Veytia, Staff OSS Engineer
November 23, 2022

Software bills of materials (SBOMs) are one of the rising stars of supply chain security. One way of understanding them is to picture an SBOM as a list of all the components used to build a program. The promise of SBOMs is full transparency into the composition of the software we use, so that we know every ingredient that was in the kitchen when it was baked.

The extreme transparency enabled by SBOMs can come with a downside, however: because an SBOM lists EVERYTHING, regardless of how components are used, running a security scanner against it can easily produce false positives. If a piece of software includes a component, say a library, known to have a vulnerability in the linked version, a scanner may report the whole application as vulnerable, even if the component is not used.

To curb the noise of false positives when scanning an SBOM, software publishers can turn them off using VEX. But how can you trust a statement that turns off security alerts?

What is VEX? And What Are Its Benefits? 

A piece of software can contain a component with a vulnerability yet not be vulnerable itself. This can, for instance, be the result of certain configuration settings that render the vulnerability inapplicable. This fact is the motivation behind VEX, a DHS initiative under the Cybersecurity & Infrastructure Security Agency (CISA). Vulnerability Exploitability eXchange (VEX) is a data format that lets upstream software producers inform downstream software consumers whether a given vulnerability affects the software application in question.

With VEX, a human can let other people (and security scanners) know whether a particular vulnerability affects a piece of software. VEX can also reverse a previous non-impact statement when applicable, for example when a new library is linked or software is added to a project. Experts assessing impact can capture their findings, along with their reasoning, in a machine-readable format that downstream tools can consume to make a final call on whether or not to trust the software.

A VEX document contains one or more claims in the following form:

    VULNERABILITY X {AFFECTS|DOES NOT AFFECT} SOFTWARE Y [BECAUSE OF Z]

That triple (vuln+impact+software) is known as an impact statement in VEX lingo and is the key piece of data that lets us avoid the extremes where SBOM transparency can take us.
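
To make the shape of an impact statement concrete, here is a minimal Python sketch. The class, field, and function names are hypothetical and are not taken from any VEX specification. It models the triple plus the optional justification, and shows how a scanner could use "does not affect" statements to quiet findings, with a later statement replacing an earlier one for the same vulnerability and software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImpactStatement:
    """One VEX claim: vulnerability + impact + software (hypothetical names)."""
    vulnerability: str                   # e.g. a CVE identifier
    software: str                        # however the product is identified
    affected: bool                       # True = AFFECTS, False = DOES NOT AFFECT
    justification: Optional[str] = None  # the optional "BECAUSE OF Z"

    def render(self) -> str:
        """Format the claim in the template form shown above."""
        verb = "AFFECTS" if self.affected else "DOES NOT AFFECT"
        text = f"{self.vulnerability} {verb} {self.software}"
        if self.justification:
            text += f" BECAUSE OF {self.justification}"
        return text


def unsuppressed(findings, statements):
    """Return the scanner findings not turned off by a non-impact statement.

    `findings` is a list of (vulnerability, software) pairs. `statements` is
    assumed to be ordered oldest to newest, so a later statement replaces an
    earlier one about the same pair.
    """
    latest = {}
    for statement in statements:
        latest[(statement.vulnerability, statement.software)] = statement
    kept = []
    for finding in findings:
        statement = latest.get(finding)
        # Keep the finding unless the newest statement says DOES NOT AFFECT.
        if statement is None or statement.affected:
            kept.append(finding)
    return kept


statements = [
    ImpactStatement("CVE-2022-WHATEVER", "example-app 1.0", False,
                    "vulnerable code is never called"),
]
findings = [
    ("CVE-2022-WHATEVER", "example-app 1.0"),
    ("CVE-2022-OTHER", "example-app 1.0"),
]
remaining = unsuppressed(findings, statements)
```

Note that this sketch trusts every statement it is given. Deciding which statements deserve that trust is a separate problem.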

Trusting VEX

Being able to get our hands on those VEX documents does not automatically imply we can trust them, of course. In the VEX community, the issue of trust has been brought forward many times, and rightfully so. There are many sides to trusting VEX worth exploring.

Building trust in a VEX document and the impact statements can be understood as being able to answer the following questions:

  • Who is making these statements? Is this person (or tool) authoritative enough to make these claims? Can I verify that this person (or tool) is related to the software it is assessing?
  • Has this document been modified after its creation?
  • Can I verify the VEX impact statements captured in the doc?

These questions can be condensed into two concrete concerns: the role of the author and the authenticity of the claim.

The Role of the Author

The first issue to think about when evaluating a VEX document is the role of the person making an impact statement. If someone tells you that some software is not impacted by CVE-2022-WHATEVER, would you rather trust a software engineer working on the project or some random person making that claim? That one is easy, but when claiming that software IS impacted, would you rather choose said engineer or the security researcher that found the CVE? Both perhaps? 

The role of the person making the claim is really important when trying to build trust. Unfortunately, roles, expertise, and other human aspects cannot easily be verified by machines. Building trust in the impact assessment boils down to the issue of trusting the good judgment and good faith of whoever is making an impact claim. I would never doubt my grandmother is well meaning if she told me some software isn’t hiding the whole Trojan army, but I certainly have doubts about her ability to make that claim.

This one is really hard. In a perfect world, we would be able to programmatically check that the assessment is real. That would be the ultimate proof: if you tell me that software is not impacted and I can verify your claim, all other issues when building trust become less important or even irrelevant. If I can’t verify them, I need to know you are an authority on the subject to trust you.

The Authenticity of the Claim

Another challenge when trying to assess the trustworthiness of a VEX document is the authenticity of the identity making a claim. Just as a wolf can wear sheep’s clothing, an adversary can lie about their relationship with a project. If someone tells you that a piece of software is not impacted by a CVE, how can you make sure that that person is not trying to deceive you into deploying a vulnerable version only to find out the next morning that you have been rooted?

That is why signing VEX is important. A signature makes the claim tamper-evident and attributable to whoever signed it. But keep in mind that signing only takes you so far. As explained before, to trust a signature you still need to know that whoever signed an impact statement is qualified to make the claim. In other words, you need to connect the identity to the role. To build trust we need a little bit more than a candid “trust me, it’s me.”

Building Trust to Enforce Policy

To write and enact policy based on VEX, those previous questions need to be answered with confidence. Some of them may have more than one answer. And those answers can - and most likely will - reside outside of the VEX document.

I don’t think any mix of technologies can make VEX fully automated, and that is by design (see our final thoughts below). Nevertheless, many mechanisms can give us somewhat satisfactory answers to the questions required to build trust in VEX. 

To understand which technologies can help us find the answers we need, we can split them into two categories: formats that encapsulate VEX data, and external tools and frameworks.

Encapsulating VEX

VEX is designed to be a standalone document. Formatted in, for instance, JSON, a VEX document needs to contain only the author’s information, vulnerability data, a way to identify the software in question, and the impact statement.
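
As a rough illustration, a standalone document could be as small as the Python sketch below. The field names, the status value, and the product identifier are assumptions made for this example and do not follow any particular VEX profile. The helper only checks that the four pieces of data listed above are present.

```python
import json

# Hypothetical field names, chosen for this sketch only.
REQUIRED_FIELDS = ("author", "vulnerability", "product", "status")


def missing_fields(document: dict) -> list:
    """Return the required fields that are absent or empty in a document."""
    return [field for field in REQUIRED_FIELDS if not document.get(field)]


doc = {
    "author": "maintainer@example.com",        # who is making the claim
    "vulnerability": "CVE-2022-WHATEVER",      # vulnerability data
    "product": "pkg:generic/example-app@1.0",  # a way to identify the software
    "status": "not_affected",                  # the impact statement itself
    "justification": "vulnerable code is never called",  # optional reasoning
}

# Stable key order keeps the serialized form the same from run to run.
serialized = json.dumps(doc, indent=2, sort_keys=True)
```

Nothing here says anything about trust yet: the author field is just a string until something outside the document backs it up.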

Sometimes, however, VEX data will be embedded in another document format. When VEX is embedded, the encapsulating format can supply some of the data bits that VEX needs to complete its message. Let’s think about some examples:

  • CSAF, for example, has a rich expression syntax to specify product families and versions which can be addressed by VEX in one statement.
  • CycloneDX and SPDX can be used to link VEX data to artifacts or components specified in an SBOM.
  • At Chainguard, we are studying how VEX can live inside in-toto attestations which are perfectly suited to make claims about software, such as VEX impact statements (more on this below).

Embedding VEX lets us leverage more expressive formats to make an impact statement more useful while keeping the document small and compact. It also lets us reuse the tools and processes that are already in place to produce those formats.

External Tools and Frameworks

To answer some of the more difficult questions, we need more than just a data format. Sigstore can be used today to sign VEX and associate an identity with impact statements. Using its transparency log we can get an auditable record to understand the evolution of impact data and make it more discoverable. Other proposed technologies like GUAC and SCITT will be able to help here too. 

Beyond discoverability, we should note that in-toto is a great companion to VEX. In-toto lets us achieve two important goals when seeking the answers we need.

First, it enables us to sign VEX data by wrapping it in attestations ensuring it can’t be modified. Another benefit is that a signed attestation defines a set of subjects (the software we are talking about) while keeping the impact statement separate in the predicate. This means we can reuse the statement when we need to. Think about this when you need to reuse a container image as a base. 
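
The subject and predicate split can be sketched in Python as follows. The outer layout (`_type`, `subject`, `predicateType`, `predicate`) follows the in-toto v0.1 Statement format, but the predicate type URI and the fields inside the predicate are hypothetical placeholders. The sketch also stops short of signing: a real attestation would be wrapped in a signed envelope, which is not shown here.

```python
import hashlib

STATEMENT_TYPE = "https://in-toto.io/Statement/v0.1"
# Hypothetical predicate type URI, made up for this sketch.
VEX_PREDICATE_TYPE = "https://example.com/vex/v0"


def subject_for(name: str, artifact: bytes) -> dict:
    """Describe one artifact by name and SHA-256 digest."""
    return {
        "name": name,
        "digest": {"sha256": hashlib.sha256(artifact).hexdigest()},
    }


def wrap(predicate: dict, subjects: list) -> dict:
    """Build an unsigned in-toto style statement around VEX data."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": subjects,
        "predicateType": VEX_PREDICATE_TYPE,
        "predicate": predicate,
    }


# The impact statement lives in the predicate (field names are hypothetical).
vex = {
    "vulnerability": "CVE-2022-WHATEVER",
    "status": "not_affected",
    "justification": "vulnerable code is never called",
}

# The same predicate is reused for a base image and an image built on it.
# The byte strings stand in for real image contents.
base = wrap(vex, [subject_for("base-image", b"base image bytes")])
derived = wrap(vex, [subject_for("app-image", b"app image bytes")])
```

Because the subjects sit outside the predicate, the impact statement itself stays unchanged when it is applied to a new artifact.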

Secondly, in-toto defines the notion of a functionary. Functionaries are identities authorized to perform a role in the software supply chain. Some examples of what functionaries can do include building, signing, or running security scans of software. Once their respective task is done, functionaries send back a signed proof of their work and the resulting data. By defining who can be a VEX functionary, in-toto can help answer one of the hardest questions in VEX trust: “Is this person (or tool) authoritative enough to make these claims?” Put another way, a software project can specify who is qualified to make impact statements about it, and statements from anyone else can be safely ignored.
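
A project-level allow-list of VEX functionaries might look like the following Python sketch. This is not the in-toto layout format. The policy structure, the identities, and the `signer` field are hypothetical, and the sketch assumes each signer identity has already been established by verifying a signature elsewhere.

```python
# Hypothetical policy: for each project, the identities allowed to act as
# VEX functionaries, that is, to make impact statements about it.
VEX_FUNCTIONARIES = {
    "example-app": {"maintainer@example.com", "security@example.com"},
}


def authorized(statements, project, policy=VEX_FUNCTIONARIES):
    """Keep only the statements signed by a functionary of the project.

    Each statement is a dict whose "signer" value is assumed to come from a
    signature that was already verified. Everything else is ignored.
    """
    allowed = policy.get(project, set())
    return [s for s in statements if s["signer"] in allowed]


incoming = [
    {"signer": "maintainer@example.com", "vulnerability": "CVE-2022-WHATEVER",
     "status": "not_affected"},
    {"signer": "stranger@example.org", "vulnerability": "CVE-2022-WHATEVER",
     "status": "not_affected"},
]
accepted = authorized(incoming, "example-app")
```

A project with no entry in the policy gets an empty allow-list, so nothing is accepted by default.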

The Human Angle

Finally, there is the question of verifying VEX impact statements. Unlike other software supply chain techniques designed for verification, such as reproducible builds, VEX is designed to capture the human experience. VEX comes into play when we need to signal machines that humans know something they don’t. Therefore, VEX impact statements can't always be backed up by automated checks.

While we may be able to provide machine verifiable proof of the claims we are making, sometimes it may be impractical or outright impossible. Since VEX is designed to capture an expert’s point of view, I think not being able to verify impact statements should not erode trust in VEX.

Some Final Thoughts

The beauty of VEX, and probably its usefulness too, lies in the simplicity of its three-part assertions. We of course need to trust VEX but, as we’ve shown, other projects in the supply chain ecosystem can make VEX more useful and trustworthy by encapsulating it and/or linking it to complementary metadata. While all the questions are valid, there are tons of great ideas that can augment VEX without the need to prescribe how trust should work in the specification.

A simple JSON data structure is all we need to VEX, and that simplicity can become super reliable when paired with the right tools.
