24 June, 2024•6 minute read

Conventional changelogs suck

"Conventional commits" are a (pseudo)standard for authoring Git commit messages. Following this standard yields semi-structured commit messages that can be consumed by tooling. Here’s an example conventional commit message, which communicates the addition of a new feature:

feat(api): add an id to each incoming request

Conventional commits begin with a “type” and an optional scope in parentheses, followed by a colon and a short descriptive message. A longer body can be included so long as it is separated from the first line of the commit by a blank line.

There’s also some special syntax for highlighting breaking changes. Here is one such commit message:

chore!: remove support for Node 10
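To make the structure concrete, here is a minimal sketch of how tooling might pick apart the first line of a commit. The regular expression and the `Header` and `parse_header` names are illustrative assumptions rather than part of any particular tool, and the sketch only looks at the header line, ignoring bodies and footers.

```python
import re
from typing import NamedTuple, Optional

# Header shape: type, optional "(scope)", optional "!", then ": description".
# This pattern is an illustrative approximation, not an official grammar.
HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)


class Header(NamedTuple):
    type: str
    scope: Optional[str]
    breaking: bool
    description: str


def parse_header(line: str) -> Optional[Header]:
    """Return the parsed header, or None if the line does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    return Header(
        type=match["type"],
        scope=match["scope"],  # None when no "(scope)" was given
        breaking=match["breaking"] is not None,
        description=match["description"],
    )
```

Feeding it the two messages above gives a `feat` header scoped to `api`, and a `chore` header with no scope that is flagged as breaking.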

Conventional commits are pretty nice when your team is bought in on the convention and has the discipline to make lots of small atomic commits. It’s really helpful being able to immediately see where breaking changes were introduced, and there is cool tooling available to automatically bump your project’s version according to semver based on your conventional commit log.
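As a rough sketch of what that version-bumping tooling does, the function below takes already-parsed commits as (type, is_breaking) pairs and applies the usual mapping: a breaking change bumps the major version, `feat` bumps the minor version, and `fix` bumps the patch version. The `next_version` name is hypothetical, and leaving the version unchanged for other types is an assumption; real tools differ on that point.

```python
from typing import Iterable, Tuple


def next_version(current: str, commits: Iterable[Tuple[str, bool]]) -> str:
    """Compute the next "major.minor.patch" version from (type, is_breaking) pairs."""
    major, minor, patch = (int(part) for part in current.split("."))
    commits = list(commits)  # allow several passes over the input

    # The most significant change in the batch decides the bump.
    if any(breaking for _, breaking in commits):
        return f"{major + 1}.0.0"
    if any(kind == "feat" for kind, _ in commits):
        return f"{major}.{minor + 1}.0"
    if any(kind == "fix" for kind, _ in commits):
        return f"{major}.{minor}.{patch + 1}"
    # Assumption: other types (chore, docs, ...) do not trigger a release.
    return current
```

For example, a batch containing one `fix` and one `feat` on top of 1.3.7 produces 1.4.0, while a single breaking `chore` produces 2.0.0.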

Where things start to go a little off the rails, however, is when people try to generate changelogs from their conventional commits. Much like microfrontends, it’s one of those things that sounds like a good idea at first but turns out pretty badly.

Commit history shouldn't be used to generate release notes

One of these tools is called conventional-changelog. The problem with this tool and others like it is that they fundamentally misunderstand the purpose of commit history.

Commit messages are an internal-facing log of what changed. The target audience of any commit message you author is the other engineers on your team. Most commit messages are simply a short summary of what changed in the commit, and understanding these summaries generally requires the reader to have context about the rest of the codebase.

There are also forcing functions at play which push commits toward being more atomic. Tools like git bisect become more useful as commits get smaller, and rebasing around merge conflicts is also easier when commits are smaller.

The end result is that high-performing engineering teams will end up making small, frequent commits. These small commits naturally end up focused on fine-grained technical details of the codebase.

Your customers do not care about these granular details. They don’t care to know that v1.3.7 was the release that added the ProductSearchService class, or that v2.1.1 refactored how PlaylistProvider passes data down to SongCard. Customers purchase your software because they have a problem which needs solving, and you shifting around technical implementation details doesn’t do anything to solve their problems directly.

Customers are much more interested in what use cases your changes add support for, and how the product changes you’re making will better support them in future. The commit log is a poor artifact to use for communicating these things, because the commit log is intended for a fundamentally different audience ¹.

You could write your commit messages in a format that's well suited for a changelog, but then you're compromising on your team’s internal developer experience. Commit history is a valuable form of institutional knowledge about the codebase and trading that off in favor of "automating" changelogs is largely an exercise in folly.

How should you generate changelogs instead?

The key insight here is that the audiences for your commit history and your changelog are distinct. These audiences care about different things. Any changelog system based on extracting commit messages is therefore doomed to fail.

Some projects maintain a dedicated CHANGELOG.md or HISTORY.md file by hand, but resolving merge conflicts can be a bit of a chore. At Rye we bypass this by writing the month’s changelog by hand at the start of the following month; we do look at the commit history to ensure we don’t forget something, but the human in the loop is responsible for editorializing and contextualizing the changes.

Outside of my day job, I like using a little Python program called towncrier. Rather than maintaining a big CHANGELOG.md file or engaging in a heavy-handed monthly writing exercise, towncrier operates on “news fragments.” Whenever you make a change that you want to include in your changelog, you simply add a new file to a news/ directory (here’s an example).

When you're ready to cut a release, towncrier will stitch your news fragments together and throw them into your changelog file. This workflow completely avoids merge conflicts, and decouples your release notes from your commit history.
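The idea behind news fragments can be sketched in a few lines. This is not towncrier's implementation or API; it is a simplified illustration that assumes fragments are named "<id>.<type>.md" inside a news directory, and the `SECTIONS` mapping, the `render_release` function, and the Markdown-style headings it emits are all hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from fragment type to section heading.
SECTIONS = {"feature": "Features", "bugfix": "Bug fixes", "removal": "Removals"}


def render_release(news_dir: Path, version: str) -> str:
    """Stitch every fragment in news_dir into one changelog section."""
    grouped = defaultdict(list)
    for path in sorted(news_dir.iterdir()):
        parts = path.name.split(".")
        # Expect "<id>.<type>.md"; skip anything else (e.g. a .gitignore).
        if len(parts) != 3 or parts[1] not in SECTIONS:
            continue
        grouped[parts[1]].append(path.read_text(encoding="utf-8").strip())

    lines = [f"## {version}", ""]
    for kind, heading in SECTIONS.items():
        if not grouped[kind]:
            continue  # omit sections that have no entries
        lines.append(f"### {heading}")
        lines.extend(f"- {entry}" for entry in grouped[kind])
        lines.append("")
    return "\n".join(lines)
```

Because every change adds its own small file, two branches rarely edit the same lines, which is where the merge-conflict benefit comes from.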

In the course of writing a news file you may very well end up duplicating something you've written inside your commit message or pull request body. That's okay. It’s not impossible for a particular piece of information to be useful for both the engineering team and your end customers—it’s just uncommon. A little bit of duplicate work is not the end of the world; at least with this system you are being intentional when you double up, rather than letting an automated tool blindly regurgitate your commit history.

Customer-centric changelogs in action

Let’s consider a real world example from the Rye changelog. I started working at Rye in March, so we’ll consider the difference between the February 2024 and March 2024 changelogs.

Here is an example entry from the February 2024 changelog:

Added returnOrder mutation. For more details, refer to the documentation.

And here is one from March 2024:

Developers now receive PRODUCT_UPDATED webhooks whenever products they have requested tracking for have their data updated. This removes the need to poll our API for changes via cron.

There’s a big difference between these two changelog items. The first conceivably looks like something we could have pulled from our commit log, with a docs link tacked on at the end.

The February entry describes what was changed, but does very little to help a reader understand why the change was made, or what use cases it enables. The reader can make some inferences as to what returnOrder does, but the changelog doesn’t stand on its own.

In contrast, our March 2024 changelog entry clearly spells out why the change is exciting for developers building on top of Rye’s API. Prior to this change, developers that wanted to maintain a local copy of Rye’s product data needed to build out a cronjob that would periodically poll the API for updated data. This isn’t a huge amount of work, but it did add friction for people wanting to use our API.

There is no need for a developer looking at the March 2024 changelog to try and guess the benefit of the PRODUCT_UPDATED webhook. Developers can simply read the change, and immediately come away with an understanding that the Rye engineering team is working hard to make their lives easier and add value.

There are other examples here as well. For instance, this schema change we made in March 2024:

ShopifyVariant.weight field is now typed as Float instead of an Int, as it is possible for an item’s weight to be a decimal value.

This is a very simple change that doesn’t really need the final explanatory clause, but by including the last bit of the sentence we are offering some reassurance to our customers. We haven’t simply made this schema change because we felt like it, or because we wanted to make life arbitrarily more difficult for folks building with our API. We made this change because we had a legitimate bug in our schema that caused type mismatches between us and Shopify!

Note, also, that the headers used to organize the March 2024 changelog differ significantly from the ones used in January and February 2024.

The earlier changelogs preferred conventional commit-style headers like “Fixes” and “Features,” whereas the March 2024 changelog introduced more domain-oriented headers like “Developer Console.” Customers are interested in what component of the product has been changed; it makes sense to organize changelogs around this.

Conclusion

While conventional commits serve as a valuable tool for maintaining a structured and informative commit history within development teams, they fall short when repurposed for generating customer-facing changelogs. Commit messages are tailored for an internal audience, focusing on granular technical details that are meaningless and distracting to end users. Effective changelogs, on the other hand, need to be customer-centric, highlighting the impact and benefits of changes in a way that contextualizes them to the customer’s pain points.

Maintaining a clear distinction between commit logs and changelogs ensures that both developers and customers receive the information they need. Tools like towncrier employ workflows which enforce this distinction, resulting in better outputs. While curating a changelog requires more upfront investment relative to shipping the commit log as-is, the end result is generally worth it. By putting a bit of time and thought into packaging your changes, you can emphasize to customers the value you are adding for them. This approach enhances the user experience and reinforces the importance of thoughtful communication in software development.

¹ For open source developer tools the situation is a bit more nuanced, and it can be a nice convenience to include links to important commits or PRs in your changelogs. For changelogs facing consumers or non-engineering teams, however, this issue is very black and white.
