30 April, 2024 · 2 minute read

Add a base to your multistage Docker build

Most people know that it’s possible to use multistage Docker builds to minimize final image size, but multistage builds can also improve the maintainability and readability of your Dockerfiles. Introducing an additional stage is somewhat analogous to extracting a function in your codebase: it lets you define configuration in one place for use in multiple subsequent stages.

First, let’s consider one particular source of redundancy in Dockerfiles: a hardcoded base image reference which appears in multiple locations. Below is a specimen Dockerfile demonstrating this antipattern, in which the base node:20-slim image is referenced three times:

    FROM node:20-slim AS deps
    # install prod deps

    FROM node:20-slim AS build
    # install dev deps + compile app

    FROM node:20-slim AS release
    # copy build + deps content

This is a very common Dockerfile setup when building Node.js services. It can be improved with the introduction of an ARG which stores the name of the base image, like so:

    ARG IMAGE=node:20-slim

    FROM $IMAGE AS deps
    # ...

    FROM $IMAGE AS build
    # ...

    FROM $IMAGE AS release
    # ...

This both DRYs up the Dockerfile and lets us parameterize the base image through the --build-arg CLI option, which is particularly useful for matrix testing. But this still leaves us with a problem: what if we need to tweak our base image, and want those tweaks available in all of our subsequent build stages?
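As a minimal sketch of that parameterization: an ARG declared before the first FROM supplies a default that --build-arg can override at build time. The node:22-slim tag and the myapp image name in the comments are assumptions chosen purely for illustration.

```dockerfile
# Default base image; used when no --build-arg is supplied.
ARG IMAGE=node:20-slim

FROM $IMAGE AS deps
# ...

FROM $IMAGE AS build
# ...

FROM $IMAGE AS release
# ...

# Example invocations (image names are hypothetical):
#   docker build -t myapp:node20 .
#   docker build --build-arg IMAGE=node:22-slim -t myapp:node22 .
```

A CI matrix could run one such build per base image it wants to test against.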

With only an ARG to lean on, we would be forced to repeat those tweaks in every stage. That comes with the risk of making a mistake and running our setup commands in a different order across stages, resulting in layer cache misses.
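To make the repetition concrete, here is a sketch of what the ARG-only approach would look like once a shared tweak is needed. The package upgrade step is used as a stand-in tweak for illustration; any common setup command would have the same problem.

```dockerfile
# Illustrative only: the same setup line has to be copied into every stage.
ARG IMAGE=node:20-slim

FROM $IMAGE AS deps
RUN apt-get update && apt-get upgrade -y
# ...

FROM $IMAGE AS build
RUN apt-get update && apt-get upgrade -y
# ...

FROM $IMAGE AS release
RUN apt-get update && apt-get upgrade -y
# ...
```

With a single line this is merely tedious, but every additional setup command multiplies the number of places where the stages can drift apart.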

We can solve this problem by introducing another build stage. I tend to call this one base, and it’s where I pull in my base image and perform any common configuration I need to do. All other stages in my Dockerfile use this base stage as their base.

In the Dockerfile below, I’m upgrading the packages in my container to their latest versions in the hope that I won’t get caught out by missing security updates. Because I do this inside my base stage and every other stage is derived from base, I only need to write the apt-get commands a single time:

    ARG IMAGE=node:20-slim
    FROM $IMAGE AS base
    RUN apt-get update && apt-get upgrade -y

    FROM base AS deps
    # ...

    FROM base AS build
    # ...

    FROM base AS release
    # ...

My Dockerfile is now perfectly DRY. This technique is pretty handy in situations where you have complicated setup logic going on in your Dockerfile. When we tried moving to ARM EC2 instances at Crimson, we ran into problems because Prisma—our ORM—wasn’t shipping ARM binaries at the time.

We worked around this by building our own binaries, and to consume those binaries correctly we needed to do a bit of work inside our base Docker build stage. The alternative would have been to copy/paste about 9 lines of setup across all of our other build stages, which would have added significant bloat and maintenance burden to our Docker setup.
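To tie the pattern together, here is a fuller sketch of how the stages derived from base might be filled in for a Node.js service. This is not the Dockerfile we used at Crimson: the /app working directory, the package-lock.json file, the npm build script, the dist/ output directory, and the dist/index.js entry point are all assumptions made for illustration.

```dockerfile
# Hypothetical sketch: paths, scripts, and the entry point are assumptions.
ARG IMAGE=node:20-slim

FROM $IMAGE AS base
# Shared setup lives here exactly once; every later stage inherits it,
# including the working directory.
RUN apt-get update && apt-get upgrade -y
WORKDIR /app

FROM base AS deps
# Production dependencies only.
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

FROM base AS build
# Full dependency set, then compile the app.
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM base AS release
# Assemble the runtime image from the outputs of the earlier stages.
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]
```

Any extra setup, such as the binary handling described above, would slot into the base stage without touching deps, build, or release.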
