Designing build date epoch in Chainguard Images

Matt Moore, CTO
June 8, 2023

In this post, we are going to walk through how we make Chainguard Images reproducible without the pesky problem where the images show up as created in 1970.

A little background for the uninitiated. Container images typically carry a “created” timestamp in their “config file” reflecting when the image was built. However, this config file’s hash is part of a Merkle tree (the same structure git uses), which is what allows an image to be pulled by digest. As a result, any non-determinism in this timestamp changes the image’s digest even if everything else remains the same.
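To make the digest sensitivity concrete, here is a minimal sketch that hashes a toy config twice with timestamps one second apart. The JSON is a simplified stand-in rather than a complete image config, `config_digest` is a hypothetical helper name, and the snippet assumes the coreutils `sha256sum` tool is available.

```shell
# Hash a toy image config whose only variable part is its creation timestamp.
# The JSON is a simplified stand-in, not a complete image config.
config_digest() {
  printf '{"architecture":"amd64","os":"linux","created":"%s"}' "$1" \
    | sha256sum | cut -d' ' -f1
}

# Two builds of identical content, one second apart.
first=$(config_digest "2023-06-08T10:00:00Z")
second=$(config_digest "2023-06-08T10:00:01Z")

if [ "$first" != "$second" ]; then
  echo "digests differ"
fi
```

Everything that references the config by hash inherits the change, which is why a single drifting timestamp is enough to alter the digest users pull by.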

To address this non-determinism, it became common practice to set the timestamp to what’s called the Unix Epoch, which is midnight UTC on January 1st, 1970. As noted above, this has led to all sorts of “fun” UX problems over the years.

To address this problem, the reproducible-builds.org group devised a scheme commonly called “source date epoch”, which established a way for users to configure a deterministic timestamp for the build tool to use, which is a function of the source input.

Folks using git would typically do this via something like:

-- CODE language-bash -- export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)

What this does is direct any build tools (that adhere to “source date epoch”) to use the timestamp of the most recent git commit as the timestamp within any artifacts produced. This means that any builds done at a particular commit will be binary-equivalent and the timestamps will match.
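As a rough illustration of what a build tool might do with the value, the sketch below turns an epoch into an RFC 3339 timestamp. `epoch_to_rfc3339` is a hypothetical helper, and the `-d "@<seconds>"` syntax assumes GNU `date` (BSD and macOS `date` use `-r` instead).

```shell
# Convert an epoch value into an RFC 3339 timestamp in UTC.
# Assumes GNU date; the function name is illustrative only.
epoch_to_rfc3339() {
  date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}

# Fall back to 0 (the Unix Epoch) when SOURCE_DATE_EPOCH is unset,
# which is exactly the "created in 1970" behavior described earlier.
epoch_to_rfc3339 "${SOURCE_DATE_EPOCH:-0}"
```

Because the output depends only on the exported value, two builds at the same commit stamp their artifacts identically.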

So why don’t we just use “source date epoch”?!?!  It sounds great!

In fact, we do! When we perform our APK builds for Wolfi, and our downstream enterprise Chainguard APKs, we encode the timestamp of when we last changed our packaging configuration file as the “source date epoch”:

-- CODE language-bash -- git log -1 --pretty=%ct --follow "${pkgname}.yaml"

Since each package pins to a particular version of the software, this timestamp only changes when the source or configuration changes (similar to the Merkle tree above).

The derived “source date epoch” timestamp gets encoded into the APK’s “control data” as the builddate, and the APKINDEX’s entry for that package as the “build time” (aka t:).
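For a sense of where that value can be read back, here is a minimal sketch that looks up the `t:` field for one package in APKINDEX-style text. It assumes blank-line separated records with one `X:value` field per line, read from stdin; `apk_build_time` is a hypothetical helper and the sample records are made up.

```shell
# Print the build time (t:) recorded for one package in APKINDEX-style text.
# Reads the index from stdin; the function name is illustrative only.
apk_build_time() {
  awk -v pkg="$1" 'BEGIN { RS = ""; FS = "\n" }   # one record per paragraph
  {
    name = ""; built = ""
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^P:/) name = substr($i, 3)     # package name
      if ($i ~ /^t:/) built = substr($i, 3)    # build time in epoch seconds
    }
    if (name == pkg && built != "") { print built; exit }
  }'
}

# Made-up sample records, not real package data.
printf 'P:foo\nV:1.0-r0\nt:1686000000\n\nP:bar\nV:2.0-r1\nt:1686182400\n' \
  | apk_build_time bar
```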

So what’s a “build date epoch” then…?

Things get a bit more complicated when we produce the Chainguard Images from these APKs. When we build our Images, we generally want to pick up the latest versions of these packages, often in order to fix CVEs. This means that, unlike above, the timestamp of the image’s configuration file is not the one we should use: the configuration file itself changes very rarely, yet builds from it keep picking up packages newer than its last change. If the configuration file does change, then the image digest will change as well.

To account for this, we compute what we have dubbed the “build date epoch,” which is effectively the MAX(builddate) across the installed APKs. As established above, for Wolfi and Chainguard APKs each builddate is based on the stable “source date epoch” of its respective package.
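The MAX(builddate) idea can be sketched in a few lines of shell. This is an illustration only, not how our tooling implements it: `build_date_epoch` is a hypothetical helper, the input is assumed to be records carrying a `t:<epoch>` line per package on stdin, and the sample values are made up.

```shell
# Compute the newest build time across a set of package records.
# Prints nothing if no t: lines are present; the function name is illustrative.
build_date_epoch() {
  sed -n 's/^t://p' | sort -n | tail -n 1
}

# Made-up records for three installed packages.
BUILD_DATE_EPOCH=$(printf 'P:a\nt:1685900000\n\nP:b\nt:1686182400\n\nP:c\nt:1686000000\n' \
  | build_date_epoch)
echo "$BUILD_DATE_EPOCH"
```

Rebuilding with the same set of packages yields the same maximum, so the timestamp moves only when a newer package enters the image.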

In essence, “build date epoch” enables us to achieve a transitive form of “source date epoch,” and because of this the image digests will largely change only when one of their packages has a new version.

Want to learn more about how we build Chainguard Images? Watch our live demo with Chainguard CEO Dan Lorenc or visit Chainguard Academy.
