25 April, 2024•5 minute read

Why does GraphQL have separate input and output types?

There are a few languages out there which come with multiple “types” of object, but GraphQL may just have the most interesting implementation of this idea. The only difference between a struct and a class in C++ is the default access level of their members, but in GraphQL there are significant semantic differences between objects defined using the type and input keywords. This aspect of the type system can be surprising for newcomers to the tech.

Object types defined using the input keyword may only be used on the input side of a schema: to type arguments, operation variables, and the fields of other input objects. Objects defined with type are used for everything the server returns. In this article, an “input object type” will mean an object defined with input while an “output object type” will mean an object defined with type.
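As a minimal sketch of the distinction, assume a hypothetical blogging schema. The names Post, CreatePostInput, and createPost are invented for illustration and do not come from any real API.

```graphql
# Output object type: may appear as the return type of a field.
type Post {
  id: ID!
  title: String!
  body: String
}

# Input object type: may appear as the type of an argument.
input CreatePostInput {
  title: String!
  body: String
}

type Query {
  post(id: ID!): Post
}

type Mutation {
  # The argument is typed by an input object; the field returns an output object.
  createPost(input: CreatePostInput!): Post!
}
```

Swapping the two, for example declaring createPost(input: Post!), is rejected when the schema itself is validated, because argument types must be input types.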

Why does GraphQL have two different kinds of objects, complicating what is an otherwise simple type system?

Input and output objects break differently

…and that’s really all there is to it. GraphQL has a hard split between input and output type definitions because adding and removing fields causes different kinds of breaking changes in each circumstance. The number one guiding principle for GraphQL’s design is to make forward evolution as easy as possible, and one of the most powerful tools available to the RFC authors for enabling this is to design the language so that breaking changes to a schema are harder to introduce.

In fact, there are only two situations where type changes behave identically in each kind of object type:

Adding a nullable field to either an input or output type in GraphQL is safe, and won’t break existing consumers.
Deleting a field—nullable or not—is always dangerous, and will always break consumers which use or provide that field.

But in every single other situation, input and output objects have divergent behavior. Let’s go through each possibility.

Adding a non-nullable field

This always breaks consumers when the non-nullable field is added to an input type without a default value. GraphQL doesn’t allow consumers to provide unknown input fields or arguments (there is no equivalent to OpenAPI’s additionalProperties field), which means it is impossible for a request that populates the input type to pass type checks both before and after the addition: before the change the new field is unknown, and after it the field is mandatory.

On the other hand, consumers of an output type are fine. GraphQL requires that consumers explicitly request all fields they want to be returned from the server, so adding a new field to an output type is basically an invisible change as far as existing consumers are concerned.
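To make both halves concrete, here is a sketch assuming a hypothetical CreatePostInput type, Post type, and createPost mutation (all invented for illustration). The schema shown is the version after the change; the operation was written against the version before it.

```graphql
# Schema after the change.
input CreatePostInput {
  title: String!
  authorId: ID!   # new: non-nullable, no default value
}

type Post {
  id: ID!
  title: String!
  slug: String!   # new: non-nullable output field
}

type Query {
  post(id: ID!): Post
}

type Mutation {
  createPost(input: CreatePostInput!): Post!
}

# Operation written against the old schema (a separate document in practice).
mutation CreateOldStyle {
  # Now invalid: the required input field `authorId` is missing.
  createPost(input: { title: "Hello" }) {
    id
    title   # Unaffected: `slug` was never selected, so it is never returned.
  }
}
```

Sending authorId ahead of the schema change would not have helped either, since the old schema would have rejected it as an unknown field.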

Changing a non-nullable field to nullable

Here input types are fine; relaxing the non-null constraint is effectively just “widening” the type. All values being provided by consumers today already passed type checks with the stricter non-null type, so they will continue to type check under the more forgiving nullable type.

Output types are a different story. At the schema level things look OK, and query documents that are valid today will still be valid after we deploy this schema change. But in this situation we are vulnerable to a different failure mode: consumers written to target the old schema have likely not been written in a null-safe manner, because they didn’t need to handle null values. Now that it is legitimate for our server to return nulls, we risk the consumer dereferencing a null pointer and running into trouble that way.
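A sketch of the output-side hazard, assuming a hypothetical User type with a displayName field (names invented for illustration):

```graphql
# Schema after the change.
type User {
  id: ID!
  displayName: String   # was String! before the change
}

type Query {
  user(id: ID!): User
}

# This operation is valid against both versions of the schema.
query GetUser {
  user(id: "42") {
    displayName
  }
}

# But the server may now legitimately respond with:
#   { "data": { "user": { "displayName": null } } }
# A client written to assume a string is always present can fail on that null.
```

Nothing in schema validation flags this; the breakage only shows up in client code at runtime.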

Making a nullable field required

This is similar to the case of adding a non-nullable field: input type consumers can break, while output type consumers are fine.

The difference is in the details. Where adding a new non-nullable field to an input type will always break consumers, tightening up a nullable field usually, but not always, breaks them. A consumer that already passes non-null values, and that already declares any operation variables feeding the field as non-null, will continue to type check and work as expected. Where either of these prerequisites does not hold, things will break.

Consumers of output types are fine; they will have been written to include null checks, and are now free to drop them as they are unnecessary.
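The input-side cases can be sketched as follows, assuming a hypothetical UpdatePostInput type and updatePost mutation (names invented for illustration). The schema is the version after the change; each operation is a consumer written before it.

```graphql
# Schema after the change.
input UpdatePostInput {
  id: ID!
  title: String!   # was String before the change
}

type Post {
  id: ID!
  title: String!
}

type Query {
  post(id: ID!): Post
}

type Mutation {
  updatePost(input: UpdatePostInput!): Post
}

# Still valid: the variable was already declared non-null.
mutation RenameStrict($id: ID!, $title: String!) {
  updatePost(input: { id: $id, title: $title }) { id }
}

# Now invalid: a nullable variable cannot flow into a String! position.
mutation RenameLoose($id: ID!, $title: String) {
  updatePost(input: { id: $id, title: $title }) { id }
}

# Now invalid: the field is omitted entirely.
mutation TouchOnly($id: ID!) {
  updatePost(input: { id: $id }) { id }
}
```

RenameLoose breaks even for callers that always supply a title, because the variable’s declared type no longer matches the position it is used in.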

GraphQL’s guiding philosophy

GraphQL is designed from the ground up around a vision of APIs that evolve forever with no breaking changes. This is an ambitious goal, and it bleeds through to almost every part of the language. If you ever find yourself wondering why GraphQL works in a certain manner, it can almost always be explained in terms of this central premise.

The input and output object type split is a particularly good example of this design philosophy in action. If it were possible to use a single object type for both input and output purposes it would be extremely easy to back yourself into a corner!

When you are down in the weeds trying to push a feature out on time, it is easy to make a bad decision in the moment that goes on to cause problems. Reusing a DTO for both a request and response body is a very common footgun when building RESTful APIs and gRPC services, but GraphQL makes doing this completely impossible.

Here are some other design decisions GraphQL has which enable forwards evolution and make introducing backwards-incompatible changes harder:

No positional arguments. You can safely reorder the arguments of any field in GraphQL and not break any clients, because arguments are always referred to by name. Candidly, I wish this feature were more commonplace in programming languages because positional arguments tend to be quite frustrating.
No extraneous arguments. Passing an unrecognized parameter to a REST API will generally work, depending on who implemented the API. In GraphQL this never works and always results in a type error, because otherwise it would be impossible to add new input fields for fear of breaking old clients sending that field and relying on it being a no-op.
Nullable types by default. Adding a nullable field is backwards-compatible whether it’s part of an input or output type, whereas adding a non-nullable field breaks old clients when added to an input object. The more “dangerous” option in GraphQL is the one that requires the extra keystroke. This is in contrast to other contemporary languages like TypeScript and Kotlin, where the default is inverted and nullability is opt-in.
Explicit selection sets. Arguably the defining feature of GraphQL. By enabling the client to define its own data requirements rather than having the server offer fixed response payloads, you gain a lot of reliability. Adding fields to REST APIs can be tricky when those fields are expensive to compute and also mostly unused, but in GraphQL it is possible to add fields with abandon as you only need to pay for what was requested by the client. You also get fine-grained usage data if your telemetry is set up properly, which means you can confirm that a deprecated field is no longer requested before you remove it.
No recursion. GraphQL fragments cannot recurse on themselves, which means it’s impossible to accidentally request a thousand levels of a tree in one request. You can only request a fixed number of levels in a single request because you need to write each level out by hand, and this helps cap query complexity in a number of cases. Because there’s no recursion, a query which requests a tree of comments from a social media site doesn’t need to worry about the depth of the comment tree and you can rest easy knowing that if your query works today it will probably still work a year from now.
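The no-recursion rule from the last item can be sketched like this, assuming a hypothetical Comment type with a replies field (names invented for illustration):

```graphql
type Comment {
  id: ID!
  body: String!
  replies: [Comment!]!
}

type Query {
  comment(id: ID!): Comment
}

# Invalid: a fragment may not spread itself, directly or indirectly.
# fragment CommentTree on Comment {
#   id
#   body
#   replies { ...CommentTree }
# }

# Valid: each level is written out by hand, so the depth is fixed at three.
query CommentThread($id: ID!) {
  comment(id: $id) {
    ...CommentFields
    replies {
      ...CommentFields
      replies {
        ...CommentFields
      }
    }
  }
}

fragment CommentFields on Comment {
  id
  body
}
```

However deep the stored comment tree grows, this query only ever returns three levels of it.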

GraphQL is a remarkably well-considered language. At all levels of its design, it has been deliberately architected to help developers fall headfirst into the pit of success.
