1 August, 2023•5 minute read

Centralized resource ID generation with Drizzle

I’m a huge fan of type-prefixed IDs because they make development so much more pleasant. On a macro level it’s nice to immediately identify the type of an object from its ID, and on a micro level being able to double-click-copy an ID is really nice. UUIDs are a pretty sucky choice.

One of the problems you may run into with type-prefixed IDs is that it can be hard to stay consistent with your ID prefixes. Encoding the prefixes into your database schema helps a bit, but isn’t always an option as providers like PlanetScale have no support for stored procedures and might not have built-in support for generating your preferred ID format.

The only truly portable method for generating these IDs is to do them in application code. This avoids any platform limitations and sets you up nicely for a future where you need to minimize database round trips, but it opens you up to human error. How do you guarantee that everyone calls ResourceId.generate('evt') when generating an ID for an event object?

In Drizzle, it’s pretty easy to make that guarantee.

Step 1: Brand your ID fields

All of Drizzle’s column builders have a $type method which allows you to refine the inferred TypeScript type of that column. This allows you to easily use branded types with your Drizzle tables, like so:

Click to copy

import { generate, type ResourceId } from 'resource-id'; export const subscription = mysqlTable('subscription', {  id: varchar('id', { length: 80 })    .notNull()    .primaryKey()    .$type<ResourceId<'sub'>>(),    // => `id` is a `ResourceId` with the prefix "sub"  // ...}); // ... await drizzle.insert(subscription).values({  id: 'foo',  // Error: Type 'string' is not assignable to type 'ResourceId<"evt">' ...}); await drizzle.insert(subscription).values({  id: generate('evt'), // ✅});

At this point we’ve largely solved the problem, and we even have the added benefit of having branded all of our id column types. It’s now extremely ergonomic to write functions leveraging those branded types for additional type-safety! This is impossible to do out of the box with Prisma due to the constraints of their codegen.

Note that Drizzle keeps you safe when using $type. You can only narrow the column’s type, which means the type you pass it needs to be assignable to the column’s default type. Calling $type<number>() here won’t work because you can’t assign a number to a string (the default type inferred for a varchar column).

Passing a number here would result in funky behavior at runtime anyway, because the generic type parameter here doesn’t event exist at runtime—and therefore can’t modify the type that Drizzle gives you when you read a subscription row.

One way to level this up further is to write a centralized newId function that takes in a Drizzle table and spits out an appropriate ID value for that table. We’ll talk about that further in the next step.

Step 2: Write a function for generating IDs

A central newId function allows us to move all of our ID generation code into one place, making it easy to swap out our ID generator down the line or add instrumentation. It’s pretty straightforward to write this function, and it’s even possible to write it in such a way that it’s guaranteed to be exhaustive at compile-time.

If you’re following my preferred Drizzle project structure and using MySQL for your database then you can use the following function as-is:

Click to copy

import type { MySqlTableWithColumns } from 'drizzle-orm/mysql-core';import * as ResourceId from 'resource-id'; import * as schema from '../schema'; type Schema = typeof schema;type Tables<T> = T extends MySqlTableWithColumns<infer C>  ? MySqlTableWithColumns<{ name: C['name']; schema: any; columns: any }>  : never; function isTable<Name extends string>(  x: MySqlTableWithColumns<any>,  y: MySqlTableWithColumns<{ name: Name; schema: any; columns: any }>,): x is typeof y {  return x === y;} export function newId(table: Tables<Schema[keyof Schema]>): string {  if (isTable(table, schema.subscription)) {    return ResourceId.generate('sub');  } else if (isTable(table, schema.name)) {    return ResourceId.generate('user');  } else {    assertUnreachable(table);  }} function assertUnreachable(x: never): never {  throw new Error(`Table ${x} does not have a case statement in newId()!`);}

If you’re using a different database then you’ll need to swap out the MySqlTableWithColumns type import for your database’s equivalent. If you’re on postgres, for instance, you’d use the PgTableWithColumns type from drizzle-orm/pg-core instead.

The code works like so:

We import our schemas/ barrel file.
We extract the table definitions from that barrel import. In my preferred project structure, I export Zod types created via createSelectSchema as part of my schemas/ barrel which means I need to exclude values like schema.User. This is done with a distributive conditional type.
We define a function newId that’s typed as taking in one of our table objects.
We match the table parameter to one of our schema’s tables. We have an isTable helper function that type narrows our table parameter. You can’t just check if table === schema.subscription because that won’t narrow down the Tables union type.
- Note that it looks like Drizzle’s tables have an _ property which in turn has a name property. In theory you’d be able to use this to type narrow with a switch statement, but the _ property doesn’t actually exist at runtime. The _ property is essentially a type brand.
We if/else through all of our tables, returning an appropriately-prefixed ID when we find a match. I’m using my own resource-id package (the latest prerelease now works in the Edge runtime!), but you can use any package. typeid-js looks nice.
After we’ve gone through all of our tables, we add a final else that calls a helper function called assertUnreachable. This function is immensely useful and follows me around to almost every project I work on—it uses the never type to verify at compile-time that a switch statement (or in this case, an if/else) chain is exhaustive. If we missed checking a table, we’ll get an error at compile-time.
- Ryan Cavanaugh from the TypeScript team has a solid Stack Overflow answer about this function here.

You can import newId anywhere in your code and are guaranteed to get the correct ID format out for your particular table. Here’s some usage in my most recent side project:

Click to copy

import { event } from '@gethook/db/schema'; // ... await drizzle.insert(event).values({  id: newId(event),  data,  webhookId,}); // ...

The ability to write this function isn’t unique to Drizzle, although Drizzle does make it easy. Writing newId for Prisma is a pain because the generated Prisma client doesn’t emit nice reusable generic types like MySqlTableWithColumns, and instead emits all of your models as their own concrete TypeScript interface. Because there’s no shared base type, you end up needing to write your own base type instead.

Prisma’s codegen behavior is great for IntelliSense speed as it limits the amount of inference tsc needs to perform (which is a pain point for Drizzle)), but makes writing abstractions like newId harder.

FAQ

My database supports stored procedures, do I need this?

Not necessarily. You could instead set the default value of your column to something like CONCAT('prefix_', generate_an_id()) and not have to worry about solving this problem in app code, and that’s a perfectly fine solution. It will last you a long time, and it’s actually quite a bit simpler than writing a newId function (although I still think you should leverage $type to benefit from branded types!)

At truly large scale it becomes beneficial to generate IDs inside your app code rather than relying on the DB because doing so allows you to leverage things like Postgres’ pipeline mode. If you’re performing lots of small writes on complex object graphs, then pipeline mode can make your database scale a lot more than you’d otherwise think.

It’s also generally easier to scale your application servers than it is to scale your database. Generating the ID inside your application code saves you a tiny bit of database compute, which might be significant when operating at scale.

Odds are good that your business won’t actually hit the point where this matters, though. If you’re working with a database that supports stored procedures (or that has a built-in function for generating your preferred ID format) then I recommend using default column values at the schema level. It keeps things simple.

« Newer post

Shadowing isn’t so bad

Older post »

Managing your AWS organization in Terraform

Step 1: Brand your ID fields

Step 2: Write a function for generating IDs

FAQ

My database supports stored procedures, do I need this?

« Newer post

Older post »

Get in touch 👋