Intro

On an application I'm developing at work, I created some Docker "base" images. These images contain all the dependencies and libraries needed to run the application, so when we build a new version of our app, we reuse the ready-made base image instead of installing everything from scratch. The image is uploaded to an internal registry, and our Dockerfiles point to it.

As the application grew, I moved the Dockerfiles for these base images to a separate repo. Since these files are not really the core of the application, I decided it was better to keep them alongside other supporting material, like backups and housekeeping pipelines.

All good up to this point, but after a while I learned that this setup may not be the best when you need to update the base image. Here is why.

What happened?

The Dockerfile in our app points to the base image on the internal registry using the FROM instruction. Further instructions then prepare the image for the application, like copying the actual code or setting up environment variables.

After a while I needed to update the base image because the application required a new package, and since this package would always be used, it made sense to put it in the base image.

I updated the Dockerfile for the base image, built the image, and uploaded it to the internal registry. Then I went to the application repo and updated the code so we could start using the package.

The issue

All was good for a while, then a co-worker told me the application was failing in his environment. He sent me a screenshot and, sure enough, there was an error: the application could not find the new package I had added to the base image.

After looking into it for a while, and after a walk, I realized what had happened: the Docker cache was to blame.

RCA

As you know, Docker works in layers. That is the reason it is really fast, but it can also cause issues, as it did on this occasion.

If you want to see what layers look like, you can, for example, go to this link and see how the php image was built.

Or you can see them from your terminal. The example below uses pihole:

docker
╰─ docker images
REPOSITORY                   TAG       IMAGE ID       CREATED        SIZE
pihole/pihole                latest    80aaf2c8cb2f   3 months ago   300MB

╰─ docker image history 80aaf2c8cb2f
IMAGE          CREATED        CREATED BY                                      SIZE      COMMENT
80aaf2c8cb2f   3 months ago   SHELL [/bin/bash -c]                            0B        buildkit.dockerfile.v0
<missing>      3 months ago   HEALTHCHECK &{["CMD-SHELL" "dig +short +nore…   0B        buildkit.dockerfile.v0
<missing>      3 months ago   ENV PATH=/opt/pihole:/usr/local/sbin:/usr/lo…   0B        buildkit.dockerfile.v0
<missing>      3 months ago   ENV DNSMASQ_USER=root                           0B        buildkit.dockerfile.v0
<missing>      3 months ago   ENV FTL_CMD=no-daemon                           0B        buildkit.dockerfile.v0
...

Getting back to the story: since the image name in the FROM instruction never changed, Docker was not aware that the base image had been updated. It therefore used the old version cached on my co-worker's laptop, and the application failed because this old image didn't have the required package.
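
One way to confirm this kind of diagnosis is to find out which image the FROM line refers to, pull that tag again, and rebuild. The helper below is only a minimal sketch: the name `base_image_ref` is made up for this post, and it assumes a simple `FROM image[:tag]` line with no extra flags.

```shell
#!/bin/sh
# Print the image reference from the first FROM line of a Dockerfile.
# Assumption: the line looks like "FROM image[:tag]" (no --platform flag).
base_image_ref() {
    awk 'toupper($1) == "FROM" { print $2; exit }' "$1"
}

# Hypothetical usage (needs Docker and access to the registry,
# so it is left commented out here):
# ref=$(base_image_ref Dockerfile)
# docker pull "$ref"    # refresh the locally stored copy of that tag
```

If the error disappears after pulling the tag and rebuilding, the stale local copy was most likely the cause.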

Docker digest

This is where a Docker digest comes in handy: instead of using a tag to specify which image to use, you specify a hash, and Docker pulls exactly that image. Because the digest is derived from the image content, an updated base image gets a new digest, so the old cached copy no longer matches.

To use a digest, get the hash of the image you want and add it to the FROM line of the Dockerfile, as in this example with the nginx image.

docker
FROM nginx:0.0.1-alpine@sha256:829a63ad2b1389e393e5decf5df25860347d09643c335d1dc3d91d25326d3067

Digests are available on every registry, but if you want to find one from the CLI, you can run this command:

docker
╰─ docker images --digests
REPOSITORY                   TAG       DIGEST                                                                    IMAGE ID       CREATED        SIZE
pihole/pihole                latest    sha256:933651dcb71ad4d581ab6f5039c88ac4899d43283ed95ce0154f610117c3d2c8   80aaf2c8cb2f   3 months ago   300MB
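
For scripting, it can be handier to get the digest of a single image rather than reading it off a table. The sketch below assumes the value comes from the `RepoDigests` field of `docker image inspect`, which has the form `repository@sha256:...`; the helper name `digest_of` is made up for this example.

```shell
#!/bin/sh
# Strip the repository part from a "repository@sha256:..." string,
# leaving only the digest.
digest_of() {
    printf '%s\n' "${1#*@}"
}

# Hypothetical usage (needs Docker, so it is left commented out here).
# Note: RepoDigests can be empty for images that were only built locally
# and never pushed or pulled, in which case the index lookup fails.
# repo_digest=$(docker image inspect --format '{{index .RepoDigests 0}}' pihole/pihole:latest)
# digest_of "$repo_digest"
```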

Conclusion

The nice part is that once I knew why it failed, I just updated the application repo to use the new hash and the problem was gone. I also added a note in the other repo saying that the application Dockerfile has to be updated whenever the base image is updated.

Ideally this should be done via a pipeline so that no manual work is needed, but for now it is OK just to add the warning as a comment, since these changes are not frequent.
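
As a rough idea of what such a pipeline step could look like, here is a minimal sketch that rewrites the FROM line of a Dockerfile to point at a new digest. This is not what we run today: the function name `pin_base_image` and the registry name in the usage example are made up, and it assumes the FROM line has the form `FROM image[:tag][@digest]`.

```shell
#!/bin/sh
# Rewrite the FROM line that uses the given image so it is pinned
# to the given digest. Other lines are left untouched.
pin_base_image() {
    dockerfile=$1
    image=$2     # image with tag, without digest
    digest=$3    # e.g. sha256:...
    tmp=$(mktemp)
    awk -v image="$image" -v digest="$digest" '
        toupper($1) == "FROM" {
            ref = $2
            sub(/@.*/, "", ref)    # drop any previously pinned digest
            if (ref == image) $2 = image "@" digest
        }
        { print }
    ' "$dockerfile" > "$tmp" && cat "$tmp" > "$dockerfile"
    rm -f "$tmp"
}

# Hypothetical usage in the base image pipeline, after pushing the new image:
# pin_base_image app/Dockerfile registry.example.com/base:1.0 "sha256:..."
```

The pipeline would still need to commit the changed Dockerfile to the application repo, which is left out here.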