Reproducing Chainguard’s reproducible image builds

Matt Moore, CTO • July 5, 2023

At Chainguard, we have built our Images product with reproducibility as a bedrock principle. Reproducibility is not a simple thing to achieve or maintain, but the wealth of properties you unlock by achieving it makes it a “cheat code for supply chain security” (among a number of other areas), in the words of our CEO Dan Lorenc. One of the many things that makes it fantastic for supply chain security is that anyone can audit the build process by simply re-running the build and checking that it produces exactly the same thing.

A lot of folks claim to have reproducible builds, but we are bringing our receipts. In this post, we will back up our claim by showing how anyone can reproduce a Chainguard Images build, and all you will need are cosign and apko.

Spoilers!

The following snippet will reproduce a particular image build:

-- CODE language-bash --
# The image we want to rebuild
IMAGE_NAME=cgr.dev/chainguard/wolfi-base
# Verify and download the Chainguard-signed configuration from which the image was built.
cosign verify-attestation \
  --type https://apko.dev/image-configuration \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity https://github.com/chainguard-images/images/.github/workflows/release.yaml@refs/heads/main \
  "${IMAGE_NAME}" | jq -r .payload | base64 -d | jq .predicate > latest.apko.json
# Rebuild the image!
apko publish latest.apko.json "ghcr.io/${USER}/${IMAGE_NAME}"

There are some caveats, which we will get into below, but the above commands will allow you to reproduce an identical digest for the images we publish.
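Besides cosign and apko, the snippet also pipes through jq and base64, so a quick preflight check can save a confusing failure halfway through. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not part of any of these tools.

```shell
# Hypothetical helper: print the name of every tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: stop early if anything the snippet relies on is not installed.
# missing=$(missing_tools cosign apko jq base64)
# [ -z "$missing" ] || { echo "missing tools: $missing" >&2; exit 1; }
```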

If you want to know how all of this works, then let’s dive into the nitty-gritty.

Locked image configurations

The first thing to understand is that the configurations we build our images from are not locked: they float forward as new versions of packages land in Wolfi, which enables us to rapidly pick up CVE and bug fixes. To reproduce an image build, we need the locked form of its configuration file. Consider the following package list for our wolfi-base image:

-- CODE language-bash --
contents:
  packages:
    - wolfi-base

You can determine the set of packages and versions that this will install by running:

-- CODE language-bash --
apko show-packages images/wolfi-base/configs/latest.apko.yaml

This list can be used to lock the configuration above (as of 2023/06/26):

-- CODE language-bash --
contents:
  packages:
    - apk-tools=2.14.0_rc1-r0
    - busybox=1.36.1-r0
    - ca-certificates-bundle=20230506-r0
    - glibc=2.37-r6
    - glibc-locale-posix=2.37-r7
    - ld-linux=2.37-r7
    - libcrypto3=3.1.1-r1
    - libssl3=3.1.1-r1
    - openssl-config=3.1.1-r1
    - wolfi-base=1-r3
    - wolfi-baselayout=20230201-r3
    - wolfi-keys=1-r5
    - zlib=1.2.13-r3

This form of the configuration file locks the versions of every package installed, sort of like a “lock file” in your favorite language’s package manager, and rebuilds of this configuration will always produce exactly the same image, with a small set of caveats:

  • Different versions of the underlying tooling may change subtle things (e.g. compression levels, bug fixes),
  • If a package is withdrawn for some reason (generally rare), then you will stop being able to reproduce the image and the build will fail.
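One way to sanity-check that a configuration really is locked is to look for package entries that lack an exact `name=version` pin. The sketch below assumes one package entry per line on stdin; `unpinned_packages` is a hypothetical helper, and the pattern only recognizes the simple `name=version` form shown above.

```shell
# Hypothetical helper: read one package entry per line and print every entry
# that is NOT pinned as "name=version" (e.g. a bare "wolfi-base").
unpinned_packages() {
  # grep exits non-zero when nothing is printed; that is the good case here.
  grep -v -E '^[^=<>~]+=[^=<>~]+$' || true
}

# Example, assuming a locked configuration saved as latest.apko.json:
# jq -r '.contents.packages[]' latest.apko.json | unpinned_packages
```

An empty result means every entry carries an exact version; anything printed is still floating.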

Getting the locked configuration

You may be asking: if we don’t build from locked configurations, then how can anyone reproduce our image builds? OK, maybe not, thanks to the spoilers above, but the key is that we are rolling out an attestation on each of our images that records the locked configuration from which the image was built.

Cosign provides the following simple command for downloading this attestation (without verification):

-- CODE language-bash --
cosign download attestation \
  --predicate-type https://apko.dev/image-configuration \
  "${IMAGE_NAME}"

However, to confirm that this attestation was published by Chainguard’s GitHub Actions-based release process, you can use the following command:

-- CODE language-bash --
cosign verify-attestation \
  --type https://apko.dev/image-configuration \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity https://github.com/chainguard-images/images/.github/workflows/release.yaml@refs/heads/main \
  "${IMAGE_NAME}"

Both of the above commands fetch the raw attestation, including the DSSE and in-toto envelopes. You can extract the DSSE “payload,” base64 decode it, and then extract the in-toto predicate by piping the above through:

-- CODE language-bash --
jq -r .payload | base64 -d | jq .predicate
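To make the two layers concrete, here is a small offline sketch that wraps a hand-made in-toto statement in a DSSE-style envelope and then unwraps it with the same pipeline. `make_envelope` and `extract_predicate` are hypothetical helper names, and the envelope is deliberately minimal: a real one also carries `payloadType` and `signatures`.

```shell
# Hypothetical helper: wrap an in-toto statement (read from stdin) in a
# minimal DSSE-style envelope. Only the "payload" field is produced.
make_envelope() {
  printf '{"payload":"%s"}\n' "$(base64 | tr -d '\n')"
}

# The pipeline from above: take the envelope's base64 "payload", decode it
# into the in-toto statement, then pull out that statement's "predicate".
extract_predicate() {
  jq -r .payload | base64 -d | jq .predicate
}

# Example with a made-up statement:
# printf '%s' '{"predicateType":"https://apko.dev/image-configuration","predicate":{"archs":["amd64"]}}' \
#   | make_envelope | extract_predicate
```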

For example, the wolfi-base image currently returns:

-- CODE language-bash --
{
  "accounts": {
    "groups": [],
    "run-as": "",
    "users": []
  },
  "annotations": {
    "org.opencontainers.image.authors": "Chainguard Team https://www.chainguard.dev/",
    "org.opencontainers.image.source": "https://github.com/chainguard-images/images/tree/main/images/wolfi-base",
    "org.opencontainers.image.url": "https://edu.chainguard.dev/chainguard/chainguard-images/reference/wolfi-base/"
  },
  "archs": [
    "amd64",
    "arm64"
  ],
  "cmd": "/bin/sh -l",
  "contents": {
    "keyring": [
      "https://packages.wolfi.dev/os/wolfi-signing.rsa.pub"
    ],
    "packages": [
      "apk-tools=2.14.0_rc1-r0",
      "busybox=1.36.1-r0",
      "ca-certificates-bundle=20230506-r0",
      "glibc=2.37-r6",
      "glibc-locale-posix=2.37-r7",
      "ld-linux=2.37-r7",
      "libcrypto3=3.1.1-r1",
      "libssl3=3.1.1-r1",
      "openssl-config=3.1.1-r1",
      "wolfi-base=1-r3",
      "wolfi-baselayout=20230201-r3",
      "wolfi-keys=1-r5",
      "zlib=1.2.13-r3"
    ],
    "repositories": [
      "https://packages.wolfi.dev/os"
    ]
  },
  "entrypoint": {
    "command": "",
    "services": {},
    "shell-fragment": "",
    "type": ""
  },
  "environment": {},
  "include": "",
  "options": {},
  "os-release": {
    "bug-report-url": "",
    "home-url": "",
    "id": "",
    "name": "",
    "pretty-name": "",
    "version-id": ""
  },
  "paths": [],
  "stop-signal": "",
  "vcs-url": "",
  "volumes": [],
  "work-dir": ""
}

Connecting the dots

Now let’s see this in action! As of 2023/06/26 the latest digest of wolfi-base is sha256:5c15a6e5c0bf02e6c0eaa939cb543c41d7725453064c920b9a4faeea7c357506, so let’s run the following and see what digest we get!

-- CODE language-bash --
IMAGE_NAME=cgr.dev/chainguard/wolfi-base@sha256:5c15a6e5c0bf02e6c0eaa939cb543c41d7725453064c920b9a4faeea7c357506
cosign verify-attestation \
  --type https://apko.dev/image-configuration \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity https://github.com/chainguard-images/images/.github/workflows/release.yaml@refs/heads/main \
  "${IMAGE_NAME}" | jq -r .payload | base64 -d | jq .predicate > latest.apko.json
apko publish latest.apko.json "ghcr.io/${USER}/${IMAGE_NAME}"

… and voilà, the published digest is the same!
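If you would rather script the comparison than eyeball two long hashes, a sketch like the following can help. `digest_of` and `same_digest` are hypothetical helpers; they accept either a bare `sha256:...` digest or a reference of the form `repo@sha256:...`, and how you obtain the rebuilt digest (for example from the output of your publish step, or from a tool such as `crane digest`) is left as an assumption.

```shell
# Hypothetical helper: print the sha256 digest carried by a bare digest or by
# a "repo@sha256:..." reference; print nothing if there is no digest.
digest_of() {
  case "$1" in
    sha256:*)   printf '%s\n' "$1" ;;
    *@sha256:*) printf '%s\n' "${1##*@}" ;;
  esac
}

# Hypothetical helper: succeed only when both arguments carry the same digest.
same_digest() (
  a=$(digest_of "$1")
  b=$(digest_of "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
)

# Example, with REBUILT holding the digest of your rebuilt image:
# same_digest "${IMAGE_NAME}" "${REBUILT}" && echo "reproduced!"
```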

As a bonus “pro tip,” when an image digest does change, the crane project has a fantastic set of recipes for diffing the manifests, configs, and filesystems of images. We use these extensively to hunt down and eliminate discrepancies as they come up.
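As a rough illustration of that kind of diffing (a simplified sketch, not the crane project's recipes verbatim), the helper below runs one crane subcommand such as `manifest` or `config` against two images and diffs the results. `crane_diff` is a hypothetical name, and it assumes `crane` is installed and can read both images.

```shell
# Hypothetical helper: diff the output of one crane subcommand for two images.
# Usage: crane_diff manifest OLD_IMAGE NEW_IMAGE
#        crane_diff config   OLD_IMAGE NEW_IMAGE
crane_diff() (
  tmp=$(mktemp -d) || exit 1
  status=0
  {
    crane "$1" "$2" > "$tmp/old" &&
    crane "$1" "$3" > "$tmp/new" &&
    diff "$tmp/old" "$tmp/new"
  } || status=$?
  rm -rf "$tmp"
  # 0 means identical, 1 means different, anything else is an error
  exit "$status"
)
```

Manifests and configs are often minified JSON, so pretty-printing each side (for example with `jq .`) before diffing makes the output far easier to read. For filesystems, listing the contents of `crane export IMAGE - | tar -tvf -` for each image and diffing the sorted listings follows the same idea.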
