GraphQL: From Zero to Schema

What is GraphQL? Why would I want to use it?

At its core, GraphQL is a query language for APIs, just as SQL is a query language for databases. In practice, GraphQL is a specification that lets our clients query for exactly the data that they need. This avoids the overhead of other approaches, such as calling multiple RESTful endpoints in series.

When building and consuming APIs, we traditionally think of our data in terms of endpoints. When REST is your API standard, this means that GET /songs will return a list of songs, GET /songs/{id} will return an individual song by ID, POST /songs will create a new song, and so on. GraphQL turns this on its head: everything lives at one endpoint, conventionally POST /graphql.

Assume the following endpoints exist in our fake RESTful Songs API:

GET /songs/{id}:

{
    "id": "52151471-6D6F-4C21-91F7-2AAB7A22A751",
    "title": "Feel Good Inc.",
    "duration": 223,
    "artistId": "55844E36-3AA9-4F82-955D-087E97B473DC",
    "albumId": "A027B52B-FC07-4C18-907D-58AD462EFE05"
}

GET /artists/{id}:

{
    "id": "55844E36-3AA9-4F82-955D-087E97B473DC",
    "name": "Gorillaz"
}
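
Stitching these two responses together on the client might look like the following sketch. The in-memory records and the getSong, getArtist, and loadSongCard helpers are hypothetical stand-ins for real network calls, not part of any actual API; the point is only the shape of the work the client has to do.

```typescript
// Shapes of the two REST responses shown above.
interface SongResource {
  id: string;
  title: string;
  duration: number;
  artistId: string;
  albumId: string;
}

interface ArtistResource {
  id: string;
  name: string;
}

// Hypothetical in-memory data standing in for the server.
const songResources: Record<string, SongResource> = {
  "52151471-6D6F-4C21-91F7-2AAB7A22A751": {
    id: "52151471-6D6F-4C21-91F7-2AAB7A22A751",
    title: "Feel Good Inc.",
    duration: 223,
    artistId: "55844E36-3AA9-4F82-955D-087E97B473DC",
    albumId: "A027B52B-FC07-4C18-907D-58AD462EFE05",
  },
};

const artistResources: Record<string, ArtistResource> = {
  "55844E36-3AA9-4F82-955D-087E97B473DC": {
    id: "55844E36-3AA9-4F82-955D-087E97B473DC",
    name: "Gorillaz",
  },
};

// Counts how many "requests" the client had to make.
let requestCount = 0;

// Stands in for GET /songs/{id}.
function getSong(id: string): SongResource {
  requestCount += 1;
  return songResources[id];
}

// Stands in for GET /artists/{id}.
function getArtist(id: string): ArtistResource {
  requestCount += 1;
  return artistResources[id];
}

// The only data the page actually displays.
interface SongCard {
  title: string;
  artistName: string;
}

function loadSongCard(songId: string): SongCard {
  const song = getSong(songId); // request 1
  const artist = getArtist(song.artistId); // request 2 needs the artistId from request 1
  // Every other field returned by the two endpoints is discarded here.
  return { title: song.title, artistName: artist.name };
}

const card = loadSongCard("52151471-6D6F-4C21-91F7-2AAB7A22A751");
```

Two dependent requests and seven returned fields are needed to produce an object with just two values.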

When a client of this API wants to host a website showcasing songs along with the artist name, the client needs to call both of these endpoints separately and combine the data on the client side. This back-and-forth is time-consuming and error-prone. In distilling the responses into the composite object that we need, we throw away more than half of the data that the server returned and are left with only the song title and artist name.

This approach can lead to a poor user experience for two reasons. First, we are over-fetching data, potentially cutting into our user’s data limits. Second, we are stuck performing these requests in order, one at a time: we cannot query for the artist until we have retrieved the song, because we do not know the artist ID until then.

GraphQL solves these problems by letting the client query for exactly what it needs in one request with no wasted network traffic. A potential GraphQL query for the above example could be:

query SongQuery($songId: ID!) {
    song(id: $songId) {
        title
        artist {
            name
        }
    }
}

This would result in a response that looks like this, with the requested fields nested under the top-level data key:

{
    "data": {
        "song": {
            "title": "Feel Good Inc.",
            "artist": {
                "name": "Gorillaz"
            }
        }
    }
}
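
When a query like this is sent over HTTP, the common convention is a JSON body carrying the query text, an optional operation name, and the variable values. The sketch below only builds that body; the buildSongRequest helper and the GraphQLRequestBody interface are names made up for this illustration, and actually sending the request to POST /graphql is left out.

```typescript
// Conventional shape of a GraphQL request body sent over HTTP.
interface GraphQLRequestBody {
  query: string;
  operationName?: string;
  variables?: Record<string, unknown>;
}

// The same query as above, kept as a plain string.
const songQuery = `
  query SongQuery($songId: ID!) {
    song(id: $songId) {
      title
      artist {
        name
      }
    }
  }
`;

// Hypothetical helper: pairs the query text with a value for $songId.
function buildSongRequest(songId: string): GraphQLRequestBody {
  return {
    query: songQuery,
    operationName: "SongQuery",
    variables: { songId: songId },
  };
}

// The serialized payload a client would send as the request body.
const requestBody = JSON.stringify(
  buildSongRequest("52151471-6D6F-4C21-91F7-2AAB7A22A751")
);
```

Keeping the ID in variables rather than splicing it into the query string means the query text stays constant and the server can check the value against the declared ID! type.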

As we can see, GraphQL allowed our client to define their query, name it, parameterize it, and ask for exactly the data that they are interested in! If we look at the parameters of this query, we notice that songId is declared with the type ID!. This states that songId must be an identifier, which may be supplied as either a String or an Integer; when ID fields are returned, they are always serialized into the response as a string. The exclamation point at the end of the type marks it as non-nullable.

Because this is defined as part of our schema and, by extension, our query, tooling can perform these checks on the client side in the user’s browser or application. If the data type is incorrect, we don’t need to call out to the server to provide useful error messages to the user.

As a bonus, this typing does not stop at parameters! Type safety exists at every level within GraphQL. The client always knows the exact data types that the server will return, which reduces the number of edge cases that come with not having robust type-checking on data going into or coming out of our API.
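
The rules for an ID! variable can be sketched as a small check that a client could run before sending a request. The coerceNonNullId function is a hypothetical helper written for this post, not something provided by a GraphQL library; it only mirrors the behavior described above.

```typescript
// Hypothetical helper mirroring how a value for an ID! variable is treated:
// null is rejected, strings and integers are accepted, and the result is a string.
function coerceNonNullId(value: unknown): string {
  if (value === null || value === undefined) {
    // The "!" in ID! makes the value non-nullable.
    throw new TypeError("A variable of type ID! must not be null");
  }
  if (typeof value === "string") {
    return value;
  }
  if (typeof value === "number" && isFinite(value) && Math.floor(value) === value) {
    // Integers are accepted and represented as strings.
    return String(value);
  }
  // Floats, booleans, objects, and so on are not valid IDs.
  throw new TypeError("An ID must be a string or an integer");
}

const idFromString = coerceNonNullId("52151471-6D6F-4C21-91F7-2AAB7A22A751");
const idFromInteger = coerceNonNullId(42);
```

Here idFromInteger is the string "42", while passing null, 1.5, or true would throw before any request is made.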

I’ve decided to adopt GraphQL, how do I start?

The first step to adopting GraphQL into your development workflow is planning and defining your schema, which consists of types that can be surfaced through your API via queries and actions that can be taken through mutations. GraphQL is most successful when the schema is modeled around business processes rather than sticking strictly to the standard CRUD operations. In practice, this means embracing the modification of multiple resources in a single API call. For example, the publishSong mutation below could be extended to let users create artists and albums if they do not already exist.

For the example above, the following schema could exist:

type Query {
    song(id: ID!): Song
}

type Mutation {
    publishSong(song: SongInput!): Song!
}

type Song {
    id: ID!
    title: String!
    duration: Int!
    artist: Artist
    album: Album
}

input SongInput {
    title: String!
    duration: Int!
    artistId: ID
    albumId: ID
}

type Artist {
    id: ID!
    name: String!
}

type Album {
    id: ID!
    name: String!
    releaseDate: String!
}

As we can see in this schema example, GraphQL has its own syntax; let’s break it down. A type is defined with the type keyword followed by its name and a collection of fields that can be resolved on the type. As we can see in the Query type, where the data-fetching entry points to our graph live, fields may also take arguments, which are written in parentheses as if they were function parameters. Note that the $ prefix is used only for variables inside a query, not for argument names in the schema. Finally, every field is given a type, which leads to the inherent type safety mentioned earlier. A field type must be either another type in your schema or a scalar. The scalars included by default in GraphQL are String, Int, Float, ID, and Boolean. Because fields must be declared in the schema with strict types, objects whose shape is not known in advance cannot easily be modelled in GraphQL.

To use an object as an argument to a field, it must be defined as an input type. This is much like a standard type with a few extra limitations:

Fields of an input type cannot take arguments
Fields of an input type must be scalars, enums, or other input types
You cannot return an input type from your graph

Outside of these limitations, input types are defined and can be treated like any other type.
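
The split between SongInput and Song can be sketched in TypeScript as two separate shapes, one accepted by the mutation and one returned by it. Everything below is a hypothetical illustration: the in-memory tables, the placeholder album, the generated "song-1" style IDs, and the publishSong function body are assumptions, since the schema says nothing about how the mutation is implemented.

```typescript
// Output types, mirroring the Artist, Album, and Song types in the schema.
interface Artist { id: string; name: string; }
interface Album { id: string; name: string; releaseDate: string; }
interface Song {
  id: string;
  title: string;
  duration: number;
  artist: Artist | null;
  album: Album | null;
}

// Input type, mirroring SongInput: only plain values, no nested output types.
interface SongInput {
  title: string;
  duration: number;
  artistId?: string | null;
  albumId?: string | null;
}

// Hypothetical in-memory data; the album entry is a placeholder.
const knownArtists: Record<string, Artist> = {
  "artist-1": { id: "artist-1", name: "Gorillaz" },
};
const knownAlbums: Record<string, Album> = {
  "album-1": { id: "album-1", name: "Placeholder Album", releaseDate: "2000-01-01" },
};
const publishedSongs: Song[] = [];

// Sketch of the publishSong mutation: takes the input shape, returns the output shape.
function publishSong(song: SongInput): Song {
  const published: Song = {
    id: "song-" + (publishedSongs.length + 1),
    title: song.title,
    duration: song.duration,
    // IDs from the input are turned into full objects on the way out.
    artist: song.artistId ? (knownArtists[song.artistId] || null) : null,
    album: song.albumId ? (knownAlbums[song.albumId] || null) : null,
  };
  publishedSongs.push(published);
  return published;
}

const newSong = publishSong({
  title: "Feel Good Inc.",
  duration: 223,
  artistId: "artist-1",
});
```

The input carries IDs because the caller is pointing at existing records, while the returned Song exposes the related Artist and Album as objects.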

Relationships

While creating your GraphQL schema, prefer relationships between types over surfacing foreign keys in your types. The power of GraphQL comes into play when the client combines and builds out requests for its specific use case. Surfacing foreign keys tempts those new to GraphQL to query for the key and then make a second query for the related entity by ID. Instead, form a relationship so that we can ask for the Artist that the song belongs to directly in our query.
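
One way to picture a relationship is as a function attached to a field: the stored song still holds an artistId, but the API exposes an artist field that is resolved from it. The sketch below is hand-rolled under that assumption. The record types, tables, and fieldResolvers map are hypothetical, and real GraphQL servers pass more arguments to their resolver functions than shown here.

```typescript
// Hypothetical storage shapes: the foreign key lives here, not in the API.
interface ArtistRecord { id: string; name: string; }
interface SongRecord { id: string; title: string; duration: number; artistId: string; }

const artistTable: Record<string, ArtistRecord> = {
  "55844E36-3AA9-4F82-955D-087E97B473DC": {
    id: "55844E36-3AA9-4F82-955D-087E97B473DC",
    name: "Gorillaz",
  },
};

const songTable: Record<string, SongRecord> = {
  "52151471-6D6F-4C21-91F7-2AAB7A22A751": {
    id: "52151471-6D6F-4C21-91F7-2AAB7A22A751",
    title: "Feel Good Inc.",
    duration: 223,
    artistId: "55844E36-3AA9-4F82-955D-087E97B473DC",
  },
};

// One function per field that needs more than a property lookup.
const fieldResolvers = {
  Query: {
    song: function (id: string): SongRecord | null {
      return songTable[id] || null;
    },
  },
  Song: {
    // The relationship: follow artistId on the server so the client never has to.
    artist: function (parent: SongRecord): ArtistRecord | null {
      return artistTable[parent.artistId] || null;
    },
  },
};

interface SongSelection { title: string; artist: { name: string } | null; }

// Hand-resolves the selection: song(id) { title artist { name } }
function resolveSongSelection(id: string): SongSelection | null {
  const song = fieldResolvers.Query.song(id);
  if (song === null) {
    return null;
  }
  const artist = fieldResolvers.Song.artist(song);
  return {
    title: song.title,
    artist: artist === null ? null : { name: artist.name },
  };
}

const selection = resolveSongSelection("52151471-6D6F-4C21-91F7-2AAB7A22A751");
```

The artistId never appears in the result; the client sees only the nested artist it asked for.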

Naming Conventions

Type, input, and enum names should use UpperCamelCase, also referred to as PascalCase. UpperCamelCase means that the name begins with a capital letter and every word within it is also capitalized, as in SongInput.

Within types, we have fields, which are named using lowerCamelCase, also referred to as simply camelCase. The difference is that lower camel case starts with a lowercase letter, while every subsequent word is still capitalized, as in releaseDate.

In Closing

In the next post in this series, we will explore what it takes to write an implementation of this GraphQL API in JavaScript to surface songs and their artists. We will be using Apollo Server to serve the API. More GraphQL resources, such as client and server libraries for other languages, can be found in the awesome-graphql repository on GitHub.
