How to purge your CDN from a pipeline
When you deploy a new frontend bundle it’s a good idea to purge your CDN to ensure you won’t serve old files to clients. In theory this is simple to do as all CDN providers provide APIs for programmatically purging their cache, but in practice it can be tricky. The problem is that most CDNs only let you purge their cache using an ID rather than by name.
What winds up happening is that someone will hardcode a CF_DISTRO_ID variable in the pipeline. This isn’t the worst thing in the world, but it’s also far from best practice and carries drawbacks:
- Destroying and recreating your infrastructure no longer works transparently. You’ll need to go in and manually update your environment variable. This is the antithesis of immutable infrastructure, and high-performing DevOps teams don’t put up with chores such as this. Automation is better than manual human actions.
- You can’t purge branch-based previews. If your branch-based preview deployments are behind a CDN and you spin up fresh infrastructure for each preview, then you won’t have the opportunity to hardcode an ID in the first place as the CDN is ephemeral.
Fortunately, the scripts required to look up a CDN’s distribution or service ID are fungible and can simply be copy/pasted. In the next section I’ve included scripts for a variety of popular CDNs—feel free to use them.
Note that all of the scripts provided are written to purge everything from the CDN, which might not be what you want.
If you’re using dynamic imports, for instance, then you actually don’t want to purge old JavaScript bundle files! Someone running an old version of your app without all of your app’s chunks in their browser cache will experience network errors when they try to load a chunk that has been purged along with the rest of the cache. A lot of teams miss this detail and wind up running into chunk loading errors.
In general, limiting the purge to your index.html file is a good start. The rest of your files are usually fine to leave sitting in cache, but you’ll want to pay special attention to anything with a static file name (such as index.html!) because CDNs typically cache based on file path.
Pipeline scripts
Cloudflare
Cloudflare has one of the worst developer experiences because there is no official CLI tool. There are a couple of third-party offerings (such as cloudflare-cli) but from what I can find they’re not well maintained. We need to resort to manually crafting API requests via curl or similar.
The script below assumes you’re using an API token. If you’re using a legacy API key then instead of passing an Authorization header you’ll want to send X-Auth-Email and X-Auth-Key headers.
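# Look up the zone ID for TARGET_DOMAIN (CF_TOKEN must be set).
# The Authorization header uses double quotes so ${CF_TOKEN} is expanded,
# and the domain is passed to jq with --arg rather than interpolated.
CF_ZONE_ID=$(
  curl -s --request GET \
    --url https://api.cloudflare.com/client/v4/zones \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${CF_TOKEN}" |
    jq -r --arg domain "${TARGET_DOMAIN}" '.result[] | select(.name == $domain) | .id'
)

# Only purge if a matching zone was found
if [[ -n "${CF_ZONE_ID}" ]]; then
  curl --request POST \
    --url "https://api.cloudflare.com/client/v4/zones/${CF_ZONE_ID}/purge_cache" \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${CF_TOKEN}" \
    --data '{ "purge_everything": true }'
fi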
While writing this script isn’t pleasant, Cloudflare does offer very flexible cache purging options. The script here purges everything from the cache, but you’re able to purge based on path prefixes, host names, cache tags, and more. Check out their documentation for more information.
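If you only need to evict entry points such as index.html, the same purge_cache endpoint accepts a list of URLs in a files array instead of purge_everything. The following is a minimal sketch rather than a drop-in script: it assumes CF_TOKEN, CF_ZONE_ID, and TARGET_DOMAIN are set as in the script above, and the helper names purge_body_for and purge_urls are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: purge specific URLs instead of the whole zone.
# Assumes CF_TOKEN and CF_ZONE_ID are set as in the zone lookup script.

# Build the JSON request body for one or more fully qualified URLs.
# This simple helper assumes the URLs contain no characters that
# would need JSON escaping (quotes, backslashes).
purge_body_for() {
  local body='{"files":[' sep='' url
  for url in "$@"; do
    body="${body}${sep}\"${url}\""
    sep=','
  done
  printf '%s]}' "${body}"
}

# Send the purge request for the given URLs.
purge_urls() {
  curl --silent --request POST \
    --url "https://api.cloudflare.com/client/v4/zones/${CF_ZONE_ID}/purge_cache" \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${CF_TOKEN}" \
    --data "$(purge_body_for "$@")"
}

# Example (not run here):
#   purge_urls "https://${TARGET_DOMAIN}/" "https://${TARGET_DOMAIN}/index.html"
```

Purging both the bare root URL and /index.html is a cautious default, since the same file may be cached under either path depending on how your origin serves it.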
CloudFront
CloudFront doesn’t allow us to invalidate a distribution by name directly, so we first need to figure out the distribution ID. The AWS CLI can print out a list of distributions in JSON format, and we can use jq to extract what we’re after before proceeding with the invalidation:
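# Find the distribution that has TARGET_DOMAIN as one of its origins.
# The domain is passed to jq with --arg so the shell variable is actually used,
# and any() ensures each distribution is emitted at most once.
CF_DISTRO_ID=$(
  aws cloudfront list-distributions |
    jq -r --arg domain "${TARGET_DOMAIN}" \
      '.DistributionList.Items[] | select(any(.Origins.Items[]; .DomainName == $domain)) | .Id'
)

# Only invalidate if a matching distribution was found
if [[ -n "${CF_DISTRO_ID}" ]]; then
  aws cloudfront create-invalidation --distribution-id "${CF_DISTRO_ID}" \
    --paths "/*"
fi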
The AWS CLI automatically paginates through your distributions, which means this script will continue to work even after your organization has scaled to thousands of CloudFront distributions.
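To follow the earlier advice and leave hashed bundle files in cache, you can pass explicit paths to create-invalidation instead of "/*". A small sketch, assuming CF_DISTRO_ID has already been looked up as above; invalidate_paths is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: invalidate only the given paths rather than "/*".
# invalidate_paths is a hypothetical helper, not part of the AWS CLI.
invalidate_paths() {
  local distro_id="$1"
  shift
  # Every remaining argument is passed through as an invalidation path.
  aws cloudfront create-invalidation \
    --distribution-id "${distro_id}" \
    --paths "$@"
}

# Example (not run here), reusing CF_DISTRO_ID from the lookup:
#   invalidate_paths "${CF_DISTRO_ID}" "/index.html" "/"
```

Invalidation paths must start with a slash. Listing both "/" and "/index.html" covers the case where the same file is served as the default root object.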
The output from the list-distributions command is extremely detailed, so it’s possible to find your distribution via a number of different fields: all you have to do is swap out the .Origins.Items[].DomainName filter for something else. You can see an example of the output from list-distributions right on its documentation page.
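As one illustration of swapping the filter, the sketch below matches on a distribution's alternate domain names (the Aliases field in the list-distributions output) rather than its origins. It assumes your distributions actually have aliases configured; find_distro_by_alias is a hypothetical helper name, and it reads the JSON on stdin so it can be tried against saved output.

```shell
#!/usr/bin/env bash
# Sketch: pick a distribution by alternate domain name (CNAME).
# Reads list-distributions JSON on stdin and prints matching IDs.
# find_distro_by_alias is a hypothetical helper name.
find_distro_by_alias() {
  # The "?" suffixes skip distributions that have no Aliases.Items array
  # (and an account with no distributions at all) instead of erroring.
  jq -r --arg cname "$1" \
    '.DistributionList.Items[]?
     | select(any(.Aliases.Items[]?; . == $cname))
     | .Id'
}

# Example (not run here):
#   CF_DISTRO_ID=$(aws cloudfront list-distributions | find_distro_by_alias "${TARGET_DOMAIN}")
```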
Fastly
Fastly makes things extremely easy compared to other CDN providers as you can purge your CDN directly by service name instead of needing to deduce an ID:
# Make sure to set FASTLY_API_TOKEN
# --all purges everything from the service
fastly purge --all --service-name="${FASTLY_SERVICE_NAME}"
There’s a fastly service list command which lists out your available services, but the output is nowhere near as rich as CloudFront’s list-distributions command so the only real option here is to purge by name.
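If you would rather purge only index.html on Fastly, the purge command can also target a single URL. This is a sketch under the assumption that your installed CLI version supports the --url flag alongside --service-name (fastly purge --help will confirm); purge_fastly_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: purge a single URL from a Fastly service looked up by name.
# Assumes FASTLY_API_TOKEN is set; purge_fastly_url is a hypothetical helper.
purge_fastly_url() {
  local service_name="$1" url="$2"
  fastly purge --service-name="${service_name}" --url="${url}"
}

# Example (not run here):
#   purge_fastly_url "${FASTLY_SERVICE_NAME}" "https://${TARGET_DOMAIN}/index.html"
```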
Conclusion
Good pipelines and infrastructure management involve automating as much as humanly possible to maximize efficiency. Toil kills team velocity and reduces reliability as humans are liable to make mistakes.
While hardcoding a CDN service ID into your pipeline configuration might not be the end of the world, there are drawbacks to taking this shortcut. With the scripts provided in this post there’s now no excuse at all to cut this corner in your frontend pipelines.