Some engineers in the container community advocate for small images. These "small image" proponents favor base images such as distroless and Alpine and image optimization tools such as Docker Slim. But small images can still be complex, and complexity, not size alone, is the true enemy. This blog post demonstrates the point by analyzing an actual example. It also builds on the All About That Base Image white paper and webinar to provide broader insights into how images accrue technical debt and how that debt can be minimized.
The composition process of images
First, let’s look at how images are built in theory. In general, developers use Docker to compose images from a Dockerfile, which either installs packages or runs build commands to build an application. These packages and build artifacts can be considered components of the final image that gets deployed. Let’s take a look at a basic example of a Dockerfile that uses Alpine as its base image.
This Dockerfile takes the latest Alpine image and installs the nginx package inside it, but what is the actual technical debt of this image? We can use syft to find out:
According to syft, there are 16 packages in the image. Each of these packages is a source of technical debt, despite the image being only 9 MB!
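The same count can be reproduced programmatically. The sketch below is a minimal illustration, not the tooling used for this post. It assumes the SBOM was exported with `syft <image> -o json` and that the resulting document lists packages under a top-level `artifacts` array whose entries carry `name` and `type` fields. The `count_components` helper and the trimmed `sample_sbom` data are hypothetical stand-ins for real output.

```python
import json
from collections import Counter


def count_components(sbom: dict) -> Counter:
    """Count SBOM packages grouped by package type (apk, deb, python, ...)."""
    # Assumption: syft's JSON output keeps packages under the "artifacts" key.
    return Counter(pkg.get("type", "unknown") for pkg in sbom.get("artifacts", []))


# Hypothetical, heavily trimmed stand-in for real syft output.
sample_sbom = json.loads("""
{
  "artifacts": [
    {"name": "alpine-baselayout", "type": "apk"},
    {"name": "busybox", "type": "apk"},
    {"name": "musl", "type": "apk"},
    {"name": "nginx", "type": "apk"}
  ]
}
""")

counts = count_components(sample_sbom)
total = sum(counts.values())
print(f"{total} components by type: {dict(counts)}")
```

Tracking this number over time, rather than the image size in MB, gives a more direct view of how much there is to patch and audit.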
An analysis of eleven of the most popular base images (identified via GitHub code search in a previous white paper) points to a similar finding. While the size of an image and its number of components have a moderately strong positive correlation, some images are approximately the same size (in MB) and yet contain drastically different numbers of components.
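To make the comparison concrete, here is a rough sketch of how such an analysis could be run. It computes the Pearson correlation between image size and component count, then lists images of equal size whose component counts differ. The numbers in `images` are placeholders invented for illustration and are NOT the measurements from the analysis; the `pearson` helper and the image labels are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder data only: (image label, size in MB, component count).
images = [
    ("image-a", 5, 15),
    ("image-b", 30, 20),
    ("image-c", 30, 90),
    ("image-d", 75, 100),
    ("image-e", 120, 150),
]

sizes = [size for _, size, _ in images]
components = [count for _, _, count in images]
r = pearson(sizes, components)
print(f"correlation between size and component count: {r:.2f}")

# Images with identical size but different component counts.
same_size_pairs = [
    (a[0], b[0])
    for a, b in combinations(images, 2)
    if a[1] == b[1] and a[2] != b[2]
]
print("same size, different complexity:", same_size_pairs)
```

With real data, the second list is the interesting one: it surfaces the images for which size says nothing about how many components need to be maintained.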
To reduce debt, reduce image complexity, not size
As illustrated above, images are built out of components that are layered on top of each other. While increased image complexity typically does involve increased image size, the metric that matters is the number of underlying components, so the goal is always to reduce the number of components put into an image during the authoring process. Image size reduction tools often miss this point: scanners generally still have to deal with the same package set, and the tools introduce their own technical and legal risks. For example, a popular image optimization tool deletes license text from the images it optimizes, making those images not legally redistributable!