2 January, 2024 (7 minute read)

Intentionally breaking types to guide scary refactors

At my last job we sold a quoting SaaS to tradespeople, and naturally a big component of our software was the quote builder itself. Tradies would get customers to jump on their portal, and from there the customer would input all the details of their job into our quote builder. After a bit of backend magic, the customer and tradie alike would receive an accurate price for the job to be done.

There were two main inputs we needed to handle. For a job to install new tiling in a kitchen and bathroom the customer would input two “areas” (one kitchen, one bathroom), and then for each area they would select “services” such as underfloor heating or wall tiling.

Originally we’d built the quote builder such that you could only have one of each area included in your job. That didn’t work quite so well for cases where someone wanted to get two different bathrooms tiled as part of the same job, so we needed to refactor.

[Screenshot: Kwotimation's quoting software. The user is presented with a list of tiling services (e.g. floor tiling) that they can select for their bathroom area.]
A user selecting services for their bathroom area in our quoting SaaS.

To be really specific here, we need to add an id (or key) field to the below interface. Up until this point we had been using areaId to uniquely identify areas which the customer had added in the quote builder, but this breaks down when you have two distinct “Bathroom” areas and want to be able to add/remove services to each one independently.

types.ts
interface QuoteArea {
  // the ID of an area table row
  // e.g. "d2f1e61f-483d-464a-98dc-6d36d3eb0bf4"
  areaId: string;
  // list of services the customer wants
  // for this area
  services: QuoteService[];
  // e.g. "Ready", "Needs Demolition", ...
  workState: WorkState;
}
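To make the failure mode concrete, here is a minimal sketch of what goes wrong when areaId doubles as the unique key. The QuoteService and WorkState definitions, the findArea helper, and the sample service names are all assumptions invented for illustration; the real code will have looked different.

```typescript
// Hypothetical stand-ins for types the post references but doesn't show.
type WorkState = "Ready" | "Needs Demolition";
interface QuoteService {
  serviceId: string;
}

interface QuoteArea {
  areaId: string;
  services: QuoteService[];
  workState: WorkState;
}

// Two distinct bathrooms in one job point at the same area table row.
const BATHROOM = "d2f1e61f-483d-464a-98dc-6d36d3eb0bf4";
const areas: QuoteArea[] = [
  { areaId: BATHROOM, services: [{ serviceId: "floor-tiling" }], workState: "Ready" },
  { areaId: BATHROOM, services: [{ serviceId: "wall-tiling" }], workState: "Ready" },
];

// Hypothetical lookup keyed on areaId: it can only ever reach the first match.
function findArea(all: QuoteArea[], areaId: string): QuoteArea | undefined {
  return all.find((area) => area.areaId === areaId);
}

const found = findArea(areas, BATHROOM);
console.log(found === areas[0]); // true
console.log(found === areas[1]); // false: the second bathroom is unreachable
```

Any add/remove logic built on a lookup like this would always act on the first bathroom, which is why a separate per-quote id is needed.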

How would you go about performing this refactor? The challenge here is to ensure that all codepaths currently using areaId as the unique identifier are correctly updated to use our new id field.

LSPs are fantastic for mechanical tasks like renaming a function or shifting an export from one file to another, but they don’t offer much for semantic changes such as this one.

Tests can be a helpful guardrail in some cases, but there’s no guarantee they’ll exercise the semantics we are changing and we’ll likely need to add new ones which will push out our timeline.

At the time I made this refactor the company was still in its infancy, and our software was constantly changing. In this situation, tests don’t tend to stay relevant long enough for them to pay dividends. “Ship first, ask questions later” is a good mantra to live by in an early-stage startup.

Fortunately, it’s actually entirely possible to both have our cake and eat it. We can have a strong guarantee that we updated everything we need to while also not eating any additional overhead costs. We can achieve this by inverting our usage of the type system.

In other words: we typically use type systems to keep our program working. In this case, we’re going to use the type system to intentionally break our program.

Intentionally breaking the types

Here’s what this technique concretely looks like. Imagine we have the following Redux action creator:

actions.ts
function addService(
  areaId: string,
  serviceId: string,
) {
  // ...
}

Updating the callsites of this function is pretty easy on paper. All we need to do is list out all uses of this function and then swap them over from area.areaId to area.id. But in practice it’s really easy to miss one and wind up with bugs.

To guarantee we’ve correctly updated every callsite, we’ll make the following change:

actions.ts
function addService(
  // Note the type change!
  quoteAreaId: number,
  serviceId: string,
) {
  // ...
}

After we make this change, every single usage of this function will fail to typecheck. If we tried to PR the code in this state, our CI tests would fail and we’d be prevented from merging it in.
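As a rough sketch of this first pass, the snippet below shows an old callsite being flagged. The stub function body, the sample IDs, and the untyped value are assumptions for illustration only. It also shows the one kind of callsite I'd expect the compiler to stay quiet about: an argument typed as any.

```typescript
// Temporary signature from the post: the first parameter is number for now.
function addService(quoteAreaId: number, serviceId: string) {
  // Hypothetical stub body; the real action creator is elided in the post.
  return { type: "addService", quoteAreaId, serviceId };
}

// Hypothetical sample data.
const quoteArea = { id: "qa-1", areaId: "area-row-1" };

// An old callsite: string is not assignable to number, so the compiler
// reports it. @ts-expect-error stands in for that to-do list entry here
// so that the sketch as a whole still compiles.
// @ts-expect-error
const old = addService(quoteArea.areaId, "floor-tiling");

// A callsite whose argument is typed any is NOT reported, because any is
// assignable to every type. Hypothetical example of untyped data:
const untyped: any = JSON.parse('{"areaId":"area-row-1"}');
const missed = addService(untyped.areaId, "floor-tiling");
```

If parts of the codebase pass around any-typed values, those callsites are worth searching for by hand.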

Our list of compiler errors now serves as a “to do” list that we can refer to. We can work through this list one item at a time, and when we’ve fixed all our errors we will know that every callsite the compiler can see has been updated. Here’s what updating a callsite looks like when using this technique:

components/ServiceStep.tsx
dispatch(addService(
  quoteArea.id as unknown as number,
  serviceId,
));

We forcibly cast quoteArea.id to number here because this type change is temporary. We’ve only made the quoteAreaId parameter a number for the purposes of getting a nice to-do list from the compiler. When we’re finished updating all our code to use the new field we’ll revert the parameter’s type back to string and drop all the casts. There’s no possibility for human error to creep in when dropping the casts because the compiler will give us a new list of type errors when we swap the parameter back over to string.
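The second pass can be sketched like this. It assumes the parameter has already been switched back to string; the stub body and sample values are hypothetical. A cast left over from the first pass now fails to typecheck, and that is what lets the compiler find every cast for us.

```typescript
// The signature after reverting the temporary change: back to string.
function addService(quoteAreaId: string, serviceId: string) {
  // Hypothetical stub body; the real action creator is elided in the post.
  return { type: "addService", quoteAreaId, serviceId };
}

// Hypothetical sample data.
const quoteArea = { id: "qa-1", areaId: "area-row-1" };

// A cast left over from the first pass. number is not assignable to string,
// so this line is a compile error; @ts-expect-error marks it here so that
// the sketch as a whole still compiles.
// @ts-expect-error
const stale = addService(quoteArea.id as unknown as number, "floor-tiling");

// The fixed callsite: drop the cast and pass the string straight through.
const fixed = addService(quoteArea.id, "floor-tiling");
console.log(fixed.quoteAreaId); // prints qa-1
```

Because the cast is erased at build time, both calls behave identically when run. Only the typechecker can tell them apart.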

Simply work through the type errors, run the app in your browser to test everything’s working as expected, and then swap back to the correct type. After that work is done, you’re finished. And you know for sure that you haven’t missed anything.

What if you don’t have nice abstractions?

The thing about this technique is that it isn’t specific to functions. You can apply it anywhere types are involved.

Let’s say that your architecture isn’t so great and the business logic of interacting with these QuoteArea objects is littered throughout your codebase. While you won’t be able to tweak the signature of a function like addService, you absolutely can just change the definition of QuoteArea:

types.ts
interface QuoteArea {
  id: string;
  // areaId: string;
  // ...
}

If you simply comment out the field then every single location in the code which uses that field will produce a compiler error. We could also choose to stick with the approach we took in the previous example, and instead swap areaId: string over to areaId: number. Both options work just fine.

Strictly speaking there are some subtle differences between the two options. For instance, a log message which outputs areaId won’t produce an error after the type change, but will after the property is deleted. Sometimes this difference can matter!
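Here is a small sketch of that difference. The two interface names and the describe helpers are hypothetical, invented just to put both options side by side in one file.

```typescript
// Option 1: keep the field but temporarily change its type.
interface RetypedQuoteArea {
  id: string;
  areaId: number; // temporarily number instead of string
}

// Option 2: remove the field entirely.
interface TrimmedQuoteArea {
  id: string;
}

// A log helper that only interpolates the field. Template literals accept
// numbers as happily as strings, so the type change goes unnoticed here.
function describeRetyped(area: RetypedQuoteArea): string {
  return `area ${area.areaId}`;
}

// The same helper against the trimmed interface does get flagged, because
// the property no longer exists on the type.
function describeTrimmed(area: TrimmedQuoteArea): string {
  // @ts-expect-error
  return `area ${area.areaId}`;
}

console.log(describeRetyped({ id: "qa-1", areaId: 7 })); // prints: area 7
```

If you want the compiler to point at every read of the field, including ones that never care about its type, deleting the property is the stricter of the two options.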

The downside of this approach is that everything touching that field is going to cause a compiler error—even code that’s supposed to refer to the old value. Consider the case of some code responsible for creating a new area and adding it to the state of the quote. We still need to store the areaId regardless of the existence of our new id field, but because we’ve adjusted the underlying QuoteArea type we’ll get a compiler error. That’s noise.

The added noise you take on by adjusting the type is obviously going to depend on how widespread the usage of the type is. The messier and larger your codebase, the more noise you’ll have. In TypeScript it’s at least easy to quickly sift through these extraneous errors by using compiler comments:

reducer.ts
quote.areas.push({
  id: createUuid(),
  // @ts-expect-error
  areaId: area.id,
  // ...
});

The @ts-expect-error comment will silence TypeScript’s error complaining about how QuoteArea#areaId either doesn’t exist or has the wrong type while we work through our list of errors. Unlike a @ts-ignore, it will noisily cause an error when we roll back to using the correct types instead of sitting there silently.

Which brings me to my next point: this technique actually works with any language that sports a static type system, but it’s most useful when writing TypeScript. Other languages—broadly speaking—just don’t have the same degree of flexibility in their type system.

In C++, for instance, while you can change the type of QuoteArea#areaId and use casts to work through type errors, you don’t benefit from TypeScript’s separation of runtime and build time. Swapping areaId’s type in C++ results in an observable difference in runtime behavior which limits your ability to run and test the code while you work through the compiler errors. It’s just not as ergonomic.

Branded types

Another option that’s available in TypeScript, and which may or may not be available in other languages, is “branded types”. Instead of swapping out string for number, we could instead have performed a refactor like this:

types.ts
type Id<Type> = string & { __type: Type };

interface QuoteArea {
  id: Id<'QuoteArea'>;
  areaId: Id<'Area'>;
  // ...
}

The advantage of this strategy is that after you do the forward pass through the initial set of compiler errors you are finished. You don’t need to swap your Id<'Area'> type back over to a string because Id<'Area'> is, in fact, now the correct type for areaId.

The downside of this approach is that generally it will increase complexity elsewhere. A stock-standard API client generated from an OpenAPI spec is going to type your API’s Area#id field as a string, so you’ll need to manually cast the value over to the branded type somewhere anyway. Branded types make a lot of sense when you have complete control over your tooling, but outside of that they tend to add a bit of friction.
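To illustrate both sides of that trade-off, here is a minimal sketch. The asId helper, the addService stub, and the sample IDs are assumptions of mine rather than anything from the original codebase. The helper stands in for the manual cast you'd need wherever plain strings enter the system.

```typescript
type Id<Type> = string & { __type: Type };

interface QuoteArea {
  id: Id<"QuoteArea">;
  areaId: Id<"Area">;
}

// Hypothetical boundary helper: the one place where a plain string (e.g. a
// value from a generated API client) is cast over to a branded ID.
function asId<Type extends string>(raw: string): Id<Type> {
  return raw as Id<Type>;
}

// Hypothetical stub of the action creator, now asking for a branded ID.
function addService(quoteAreaId: Id<"QuoteArea">, serviceId: string) {
  return { type: "addService", quoteAreaId, serviceId };
}

const area: QuoteArea = {
  id: asId<"QuoteArea">("qa-1"),
  areaId: asId<"Area">("area-row-1"),
};

// Passing the wrong kind of ID is a compile error; @ts-expect-error marks
// it here so that the sketch as a whole still compiles.
// @ts-expect-error
addService(area.areaId, "floor-tiling");

// The right kind of ID typechecks with no cast at the callsite.
const action = addService(area.id, "floor-tiling");

// Branded IDs are still strings, so they flow into string-typed code freely.
const plain: string = action.quoteAreaId;
console.log(plain); // prints qa-1
```

The brand only exists at the type level. At runtime the values are ordinary strings, so the friction is limited to the places where you call something like asId.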

Types are a superpower

Static type systems are pretty awesome. They’re an indispensable asset for avoiding bugs, and a good type system can completely remove the need for entire classes of unit tests. Going back to dynamic languages after you’ve gotten over the initial learning curve of static typing is painful.

But those use cases only scratch the surface of what’s possible with a static type system. Static types give us the ability to confidently break our program, knowing that once we’ve solved all the compiler errors it’ll work again.

The areaId refactor I’ve talked about in this post would have been a massive undertaking in a dynamically typed language and likely would have taken close to a week, if not longer.

But in this case I was able to complete the refactor in a little over an afternoon. Breaking my program and fully leaning in to the type system saved me a huge amount of time.

When I was first onboarded to that company, one of the very first things I did was institute a rule that all new code had to be written in TypeScript for this very reason. Static type systems remove risk from the business and very rapidly pay for themselves.

Big scary refactors are a lot less scary when you have the right tools and know how to use them correctly.
