4 November, 2024 · 5 minute read

GraphQL collection lookups

Collection lookups are a powerful design pattern for GraphQL schemas. A singular field lets API consumers retrieve specific elements from a collection, which improves both usability and performance.

To demonstrate this pattern, let’s imagine we’re building a blogging platform (perhaps for sophiabits.com!). The core data type we’ll focus on is a Post which, among other things, contains a list of arbitrary key-value pairs (its attributes) and a list of images. Bloggers use attributes to store little tidbits of information that could be consumed by a plugin system.

Below is a potential schema. Note the paired attribute/attributes and image/images fields:

    scalar URL

    type Attribute {
      key: String!
      value: String!
    }

    type Image {
      src: URL!
    }

    type Post {
      attribute(key: String!): Attribute
      "Arbitrary key-value metadata for this Post"
      attributes: [Attribute!]!

      image: Image
      "List of feature images to display in a carousel"
      images: [Image!]!

      # ...
    }

When the singular field of the pair can be used to retrieve individual pieces of data from its corresponding list field, I call it a collection lookup. Collection lookups are a surprisingly versatile design pattern, and they feature in many GraphQL schemas I’ve built. Let’s explore the different “levels” of collection lookup, and think about how we might apply them to our blogging platform.

Types of collection lookup

“First element” lookups

This is the simplest form of collection lookup, and it usually happens accidentally as a result of changing requirements. Say that we originally only supported a single image per post, to be used as a hero image.

If our requirements change and we decide to support displaying a carousel of images for our hero, then we must change our schema. Introducing a new images field is simple enough, but we must also figure out what we want to do with the pre-existing image field. The path of least resistance is to keep it around, and have it function as an alias for images[0].
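As a rough illustration, the resolver behind the singular field could be as small as the sketch below. The `Post` and `Image` dataclasses and the `resolve_image` name are hypothetical stand-ins for whatever backing model and GraphQL library you use.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Image:
    src: str


@dataclass
class Post:
    # Hypothetical backing model for the Post type in the schema.
    images: list[Image] = field(default_factory=list)


def resolve_image(post: Post) -> Optional[Image]:
    """Resolve Post.image as an alias for images[0].

    Returns None for a post without images, which matches the nullable
    `image: Image` field in the schema.
    """
    return post.images[0] if post.images else None
```

Existing consumers that only select `image` keep working unchanged, while new consumers can opt in to `images`.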

“By key” lookups

Sometimes a resource contains a collection of items, but as an API consumer you only care about one or two particular elements. The attributes field is a classic example of this. API consumers can stash arbitrary key-value pairs inside this field, and these attributes may be useful for a variety of different purposes. For example, a post might carry a discussOnHnUrl attribute and a featured attribute alongside attributes that exist purely for analytics.

If we want to render a post for display, then we don’t care about attributes that are only used for analytical purposes. While we could filter these out ourselves in code, it would be much better if we were able to request only those attributes we require. GraphQL does not give us this capability natively, but we can implement it ourselves with an attribute(key: String!): Attribute field.
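A minimal sketch of what the resolver for this field might do is shown below. It assumes attributes are already loaded as a list of key/value records; the `resolve_attribute` name and the sample data are illustrative only.

```python
from typing import Optional, TypedDict


class Attribute(TypedDict):
    key: str
    value: str


def resolve_attribute(attributes: list[Attribute], key: str) -> Optional[Attribute]:
    """Resolve Post.attribute(key:) by scanning the post's attributes.

    Returns None when no attribute has the requested key, matching the
    nullable return type in the schema.
    """
    return next((attr for attr in attributes if attr["key"] == key), None)


# Hypothetical stored attributes for one post.
attributes: list[Attribute] = [
    {"key": "featured", "value": "false"},
    {"key": "discussOnHnUrl", "value": "https://news.ycombinator.com/..."},
]
```

The scan is still linear here, but it now happens once on the server, where it could be swapped for a keyed lookup against the data store without any change to the schema.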

On its own this field doesn’t look so useful, but aliasing enables consumers to fetch all the attributes they care about with a single API request and even give them helpful names:

    query GetPostAndSomeAttributes {
      post(id: "post_xyz") {
        id
        title
        # fetch 2x attributes we care about
        discussOnHnUrl: attribute(key: "discussOnHnUrl") {
          value
        }
        isFeatured: attribute(key: "featured") {
          value
        }
      }
    }

What I love about this is we are essentially turning these custom attributes into first-class fields on our API object. Instead of the consumer needing to perform a potentially costly linear-time search over the attributes list, they’ll get back an object which looks like the following. It’s really elegant.

    {
      "data": {
        "post": {
          "id": "post_xyz",
          "title": "My cool post!",
          "discussOnHnUrl": {
            "value": "https://news.ycombinator.com/..."
          },
          "isFeatured": {
            "value": "false"
          }
        }
      }
    }
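To make the difference concrete, here is a hedged sketch of the consumer code for both response shapes. The payloads are hand-written for illustration, and the `analyticsBucket` key is invented.

```python
from typing import Optional

# Shape returned when the client selects the whole `attributes` list.
post_with_list = {
    "id": "post_xyz",
    "attributes": [
        {"key": "analyticsBucket", "value": "b"},  # hypothetical attribute
        {"key": "featured", "value": "false"},
    ],
}

# Shape returned when the client uses aliased attribute(key:) lookups.
post_with_aliases = {
    "id": "post_xyz",
    "discussOnHnUrl": {"value": "https://news.ycombinator.com/..."},
    "isFeatured": {"value": "false"},
}


def featured_from_list(post: dict) -> Optional[str]:
    """Linear scan: the client has to search for the key itself."""
    for attr in post["attributes"]:
        if attr["key"] == "featured":
            return attr["value"]
    return None


def featured_from_alias(post: dict) -> Optional[str]:
    """Direct access: the alias behaves like a first-class field."""
    attr = post["isFeatured"]  # None when the attribute is not set
    return attr["value"] if attr is not None else None
```

Note that the value is still a String in the schema, so the consumer receives "false" as text and has to parse it.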

“Search” lookups

As an extension of key lookups, why not support looking up collection elements by fields other than the primary key? We could turn our image field into a miniature search engine by allowing consumers to select images by their aspect ratio:

    enum ImageLayoutType {
      LANDSCAPE
      PORTRAIT
    }

    type Post {
      image(layout: ImageLayoutType): Image
      images: [Image!]!
      # ...
    }

You can add as many filters as you’d like. Perhaps you want to support looking up images based on OCRed text content. You can find a real-world example of this pattern in action inside Shopify’s Storefront API. A Shopify Product contains many different Variants, and the Product.variantBySelectedOptions field allows API consumers to find product variants based on their options like size, color, etc.
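One way the layout filter might be resolved is sketched below. It assumes each stored image records its pixel dimensions, which the schema does not show, and it treats square images as portrait purely for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ImageLayoutType(Enum):
    LANDSCAPE = "LANDSCAPE"
    PORTRAIT = "PORTRAIT"


@dataclass
class StoredImage:
    src: str
    width: int   # assumed to be stored alongside the image
    height: int


def layout_of(image: StoredImage) -> ImageLayoutType:
    # Assumption: square images count as portrait.
    if image.width > image.height:
        return ImageLayoutType.LANDSCAPE
    return ImageLayoutType.PORTRAIT


def resolve_image(
    images: list[StoredImage],
    layout: Optional[ImageLayoutType] = None,
) -> Optional[StoredImage]:
    """Resolve Post.image(layout:) as the first image matching the filter."""
    for image in images:
        if layout is None or layout_of(image) is layout:
            return image
    return None
```

When no layout is passed, the resolver degrades to the "first element" lookup, so the argument can be added without breaking existing queries.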

Search lookups are a great example of how we can push business logic into our GraphQL schema to keep consumer code lean. Once you start incorporating this into your schemas, you’ll find opportunities to apply it almost everywhere. In a previous article, I considered the possibility of a User.inGroup(id: ID!) field, for instance.

The true value of this pattern reveals itself when you begin working with large lists. Imagine if users were able to join thousands of groups: fetching a list of thousands of IDs would be disastrously inefficient if all you cared about was whether the user belonged to one group in particular. In REST we might have been tempted to add another endpoint—and take the hit of an extra network round trip—but in GraphQL we’re able to encode this functionality into the object type itself. As you only pay for what the client explicitly requests, there is no performance hit for your other API consumers.
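A sketch of how such a membership field could be resolved without loading the whole list follows. `GroupMembershipStore` is a hypothetical in-memory stand-in for a real data store.

```python
class GroupMembershipStore:
    """Hypothetical in-memory stand-in for a membership table."""

    def __init__(self, memberships: set[tuple[str, str]]) -> None:
        self._memberships = memberships  # (user_id, group_id) pairs

    def is_member(self, user_id: str, group_id: str) -> bool:
        # A real store would run an indexed existence check here instead
        # of loading every group the user belongs to.
        return (user_id, group_id) in self._memberships


def resolve_in_group(user_id: str, group_id: str, store: GroupMembershipStore) -> bool:
    """Resolve User.inGroup(id:) without materialising the group list."""
    return store.is_member(user_id, group_id)
```

The response carries a single boolean per lookup rather than thousands of IDs, and consumers that never select the field never trigger the check.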

“Transform” lookups

Now for the final example of this pattern, we’ll make a small tweak to the previous schema:

    scalar Percent

    enum ImageFilterType {
      BLUR
      GRAYSCALE
    }

    # Field arguments must be input types, so this is an input rather than a type
    input ImageFilter {
      type: ImageFilterType!
      intensity: Percent
    }

    type Post {
      image(layout: ImageLayoutType, filter: ImageFilter): Image
      images: [Image!]!
      # ...
    }

The schema still supports looking up an image by its aspect ratio, but I’ve also added support for the consumer to apply some kind of transformation to the image they looked up.
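As one possible shape for the resolver, the sketch below rewrites the image URL so that a separate image service applies the filter. The `filter` and `intensity` query parameters, the 0 to 100 range for Percent, and the example CDN host are all assumptions rather than anything the schema mandates.

```python
from typing import Optional
from urllib.parse import urlencode


def resolve_transformed_src(
    src: str,
    filter_type: Optional[str] = None,
    intensity: Optional[float] = None,
) -> str:
    """Return a URL for the image with the requested filter applied.

    Assumes a hypothetical image service that reads `filter` and
    `intensity` query parameters, and a `src` without a query string.
    """
    if filter_type is None:
        return src
    params = {"filter": filter_type.lower()}
    if intensity is not None:
        if not 0 <= intensity <= 100:
            raise ValueError("intensity must be a percentage between 0 and 100")
        params["intensity"] = f"{intensity:g}"
    return f"{src}?{urlencode(params)}"
```

For example, `resolve_transformed_src("https://cdn.example.com/hero.png", "BLUR", 40)` yields a URL ending in `?filter=blur&intensity=40`, and omitting the filter returns the original URL untouched.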

Supporting an image transformation on the images list itself is difficult, because the compute required could be very large if we support large page sizes. It’s far more realistic to support this functionality when it’s scoped to a single image at a time. A consumer can still retrieve multiple transformed images in a single round trip through aliasing, but it’s easier for us to guard against expensive queries through query cost limits, because every aliased lookup is visible in the query text. It’s hard to apply cost limits to the list field because you can’t know how many elements are inside the list until your server actually starts resolving the field.
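The cost argument can be sketched as a static check over the selected fields. The cost weights, the limit, and the `(field_name, arguments)` representation are invented for illustration; a real server would walk the parsed query document instead.

```python
# Hypothetical weights; real limits depend on your infrastructure.
TRANSFORMED_IMAGE_COST = 10
MAX_QUERY_COST = 50


def estimate_cost(selections: list[tuple[str, dict]]) -> int:
    """Estimate the cost of a Post selection set before resolving anything.

    Each entry is (field_name, arguments). Aliased `image` selections show
    up as separate entries, so their count is known from the query alone.
    """
    cost = 0
    for field_name, arguments in selections:
        if field_name == "image" and arguments.get("filter") is not None:
            cost += TRANSFORMED_IMAGE_COST
        else:
            cost += 1
    return cost


def check_query(selections: list[tuple[str, dict]]) -> None:
    """Reject a query whose estimated cost exceeds the limit."""
    cost = estimate_cost(selections)
    if cost > MAX_QUERY_COST:
        raise ValueError(f"query cost {cost} exceeds limit {MAX_QUERY_COST}")
```

Because each transformed image is its own selection, the server can reject an over-budget query before doing any image work, which is not possible when the number of list elements is only known at resolve time.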

Again, this pattern is not limited to the specific example demonstrated here. In addition to image filters, it might make sense to let API consumers resize the image per their requirements, or to let them request images with different quality levels.

The advantage here is that you can shift expensive operations from low-powered clients onto your backend. Moving the compute requirements to your own system enables you to power higher-fidelity experiences that would otherwise be impossible to run smoothly on the user’s own device.

Naming considerations

The collection lookup pattern informs our naming decisions elsewhere in the schema. When naming list fields, prefer plural names that have a straightforward singular form, and when naming singular fields, prefer names with an obvious plural. The attribute/attributes and image/images pairs feel good to use because going from the plural to the singular form is very intuitive.

Contrast the term attributes with metadata, which is the field name Stripe uses in its REST API. Because there is no capability for fetching only a subset of metadata fields via REST, this works well enough, but in a context where we want to apply the collection lookup pattern the utility of the name metadata is limited. The singular form of “metadata” is “metadatum,” which isn’t as obvious as simply dropping an “s.”

Avoid terms like metadata which don’t have obvious singular/plural pairs. Names like attributes, fields, properties, or tags are all vastly superior choices.
