A Unified Approach to APIs with GraphQL Federation

By Brennan Seymour, Charlie Hall, Colton Sherwood, Sayantani Ghosh | May 16, 2024 | Development

Introduction

At Source Allies, we have a variety of internal services for managing Teammates. Much of the data in these services is related, and each consumer needs a different subset of it. Consumers need to contact several services and process the responses before the information is usable.

At one point, we switched one backend service out for another. When making the switch, our API structure proved to be brittle. We spent a lot of time retrofitting systems and altering logic to work with this new data source whose contract was only slightly different. There had to be some better development pattern for our distributed architecture.

What is GraphQL?

GraphQL is a query language for APIs and an alternative to REST. It allows clients to call a single endpoint for a graph, using the query language to specify exactly what data is needed from that endpoint. This means that GraphQL clients can request only the data they need, reducing over-fetching and under-fetching. This granular approach to data fetching enhances network efficiency and speeds up client-side development.

Example case

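A couple of REST requests:

// fetch resolves to a Response, so each body still has to be parsed as JSON.
const teammate = await (await fetch(`https://restful.com/teammates/${id}`)).json();
const teammatePhoneNumbers = await (await fetch(`https://restful.com/teammates/${id}/phone-numbers`)).json();
teammate.phoneNumbers = teammatePhoneNumbers.Items;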

And an equivalent GraphQL request:

const query = `
query ($id: String) {
    teammate(id: $id) {
        name
        email
        phoneNumbers {
            number
        }
    }
}
`;
const variables = { id: id };
// `post` stands in for any HTTP helper that sends a JSON body and returns the parsed result.
const teammate = await post('graph.com', {
    query: query,
    variables: variables,
});

One of the key benefits of GraphQL is its strong typing system and introspection capabilities, which enable clients to discover and understand the API schema dynamically. This self-documenting nature simplifies API exploration and encourages better collaboration between frontend and backend teams.
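
As a rough illustration of introspection, the sketch below builds a request body that asks a GraphQL server for the names and kinds of its types. The `__schema`, `types`, `name`, and `kind` fields are part of GraphQL's standard introspection system; the `buildIntrospectionRequest` helper is hypothetical and only exists for this example.

```javascript
// Standard introspection fields: __schema, types, name, kind.
const introspectionQuery = `
query {
    __schema {
        types {
            name
            kind
        }
    }
}
`;

// Hypothetical helper: builds the JSON body that a GraphQL-over-HTTP POST would carry.
function buildIntrospectionRequest() {
    return JSON.stringify({ query: introspectionQuery });
}
```

Sending this body to a GraphQL endpoint that has introspection enabled returns the schema's types, which is what tools use to offer autocompletion and generated documentation.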

Moreover, GraphQL promotes versionless APIs. Because clients specify the exact shape of the data they require, fields can be added to the server-side schema without affecting existing clients. This flexibility reduces the need to maintain multiple API versions, which simplifies API lifecycle management.

The ability for clients to specify exactly what they need can also reduce excess backend processing. In the above example, if each teammate had a field, expensiveField, that was expensive to look up, the REST call would fetch it anyway, even though we never use it. Building switching mechanisms into a REST API to conditionally fetch data like this can increase calling complexity significantly.
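
To make this concrete, here is a simplified, library-free sketch of field-level resolution. `teammateResolvers` and `resolveSelection` are hypothetical stand-ins for a real GraphQL server's resolvers and executor; the point is only that a resolver runs when its field is requested and not otherwise.

```javascript
// Counts how many times the expensive lookup actually runs.
let expensiveLookups = 0;

// Hypothetical resolver map: one function per field on a teammate.
const teammateResolvers = {
    name: (row) => row.name,
    email: (row) => row.email,
    expensiveField: (row) => {
        expensiveLookups += 1; // stands in for a slow database or network call
        return `expensive value for ${row.name}`;
    },
};

// Simplified stand-in for a GraphQL executor: resolve only the requested fields.
function resolveSelection(row, requestedFields) {
    const result = {};
    for (const field of requestedFields) {
        result[field] = teammateResolvers[field](row);
    }
    return result;
}

const row = { name: "Ada", email: "ada@example.com" };
const cheapResult = resolveSelection(row, ["name", "email"]);
// The expensive resolver was never called, so expensiveLookups is unchanged.
```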

Overall, GraphQL empowers developers to build more efficient, flexible, and scalable APIs that better accommodate modern applications, resulting in improved productivity, enhanced user experiences, and faster time-to-market for software products.

What is Federation?

GraphQL Federation is an innovative approach to building scalable and efficient APIs. While monolithic GraphQL schemas can seamlessly replace an individual REST API, Federation allows for the creation of a distributed graph architecture where different parts of the schema are owned and maintained independently by various teams or services.

Requesting data from an individual REST API is straightforward. However, if we want to combine information from several REST APIs, we need custom logic for stitching together their responses. At Source Allies, we had similar logic in a number of consumer systems; a change in one system could have a drastic ripple effect. GraphQL Federation eliminates the need for complex consumer-side stitching, duplicate logic, and brittleness. Refactors to APIs within the federation do not require downtime and are seamlessly reflected in the graph.

At its core, GraphQL Federation enables composing a unified GraphQL “supergraph” from multiple smaller schemas, each representing a domain or service. These schemas, or “subgraphs,” can be developed, deployed, and scaled independently, promoting autonomy and agility within development teams.

Example Case

Imagine a client that needs to know about a Teammate and all associated projects. Assume these pieces of information are tracked by two different services: hr_api.com and projects_api.com.

The client could simply call both services to get information about a Teammate, and stitch together the results.

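// fetch resolves to a Response, so each body still has to be parsed as JSON.
const hr_response = await (await fetch(`https://hr_api.com/teammate/${id}`)).json();
const project_tracker_response = await (await fetch(`https://projects_api.com/projects_for_teammate/${id}`)).json();

const teammate = {
    name: hr_response.name,
    email: hr_response.email,
    projects: project_tracker_response.items,
};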

This is all fine and dandy, but as soon as another service pops up (or several) with a more complex response, the work involved to do this begins to increase dramatically:

const hr_response = await (await fetch(`https://hr_api.com/teammate/${id}`)).json();
const project_tracker_response = await (await fetch(`https://projects_api.com/projects_for_teammate/${id}`)).json();
const birthday_response = await (await fetch(
    `https://birthday_api.com/teammate/${id}/categories?categories=zodiac,new_year_animal,birthstone`
)).json();
const teammate = {
    name: hr_response.name,
    email: hr_response.email,
    projects: project_tracker_response.items,
    zodiac: birthday_response.results[0]["zodiac"],
    new_year_animal: birthday_response.results[0]["new_year_animal"],
    birthstone: birthday_response.results[0]["birthstone"],
};

Unfortunately, this approach leaves tons of edge cases unhandled, such as a service being unavailable or returning an empty result. A really decent implementation could end up being quite large and complex, and this is work that needs to be done for every different client that needs these fields for a teammate.
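
As a rough sketch of what handling even a few of those edge cases looks like, the function below tolerates failures in the two secondary services. The `loadTeammate` function and the injected `fetchJson(url)` helper (assumed to resolve to a parsed JSON body or reject on failure) are hypothetical, as is the choice to treat projects and birthday data as optional.

```javascript
// Hypothetical helper signature: fetchJson(url) resolves to parsed JSON or rejects.
async function loadTeammate(id, fetchJson) {
    // allSettled lets one failing service not abort the other two requests.
    const [hr, projects, birthday] = await Promise.allSettled([
        fetchJson(`https://hr_api.com/teammate/${id}`),
        fetchJson(`https://projects_api.com/projects_for_teammate/${id}`),
        fetchJson(`https://birthday_api.com/teammate/${id}/categories?categories=zodiac,new_year_animal,birthstone`),
    ]);

    // Without the HR record there is no teammate to return at all.
    if (hr.status !== "fulfilled") {
        throw new Error(`HR lookup failed for teammate ${id}`);
    }

    // Assumption: the other services are optional, so fall back to empty values.
    const projectItems = projects.status === "fulfilled" ? (projects.value.items ?? []) : [];
    const categories = birthday.status === "fulfilled" ? (birthday.value.results?.[0] ?? {}) : {};

    return {
        name: hr.value.name,
        email: hr.value.email,
        projects: projectItems,
        zodiac: categories.zodiac ?? null,
        new_year_animal: categories.new_year_animal ?? null,
        birthstone: categories.birthstone ?? null,
    };
}
```

Even this version ignores timeouts, retries, and partial responses, and every client that needs these fields would have to carry its own copy.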

In comes Federation. In front of each service, a GraphQL endpoint could be exposed that feeds into a set of federated schemas:

## schema.graphql file in the hr api
type Query {
    teammate(id: ID!): Teammate
}

# "id" is the key other subgraphs use to refer to a Teammate.
type Teammate @key(fields: "id") {
    id: ID!
    name: String!
    email: String!
}

## schema.graphql file in the project tracker api
# the @key() tells the gateway that "id" can be used as a join key between subgraphs.
type Teammate @key(fields: "id") {
    id: ID!
    projects: [Project!]!
}

type Project {
    name: String!
    description: String!
}

## schema.graphql file in the birthday categorizer api
type Teammate @key(fields: "id") {
    id: ID!
    zodiac: String!
    newYearAnimal: String!
    birthstone: String!
}
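
What implementing one of these subgraphs involves depends on the server library. As a minimal sketch, assuming Apollo Federation's subgraph conventions (this article does not name the library in use), the project tracker could resolve its part of Teammate with a resolver map like the one below. The `projectsByTeammateId` data is hypothetical.

```javascript
// Hypothetical in-memory data standing in for the project tracker's database.
const projectsByTeammateId = new Map([
    ["1", [{ name: "Federation rollout", description: "Move internal APIs behind the gateway" }]],
]);

// Resolver map in the shape Apollo's subgraph library expects (an assumption here).
const resolvers = {
    Teammate: {
        // Receives the entity representation sent by the gateway,
        // e.g. { __typename: "Teammate", id: "1" }, and returns the object to resolve fields on.
        __resolveReference(reference) {
            return { id: reference.id };
        },
        // Resolves the one field this subgraph contributes to Teammate.
        projects(teammate) {
            return projectsByTeammateId.get(teammate.id) ?? [];
        },
    },
};
```

The gateway uses the "id" key to ask this subgraph about a specific Teammate, so this service never needs to know about names, emails, or birthdays.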

Assuming that GraphQL endpoints are implemented for these three schemata and all three endpoints are published to a gateway, the client code could now perform a relatively simple graph query for all its requirements:

const query = `
query ($id: ID!) {
    teammate(id: $id) {
        name
        email
        projects {
            name
            description
        }
        zodiac
        newYearAnimal
        birthstone
    }
}
`;

// Of course, a framework or tool could make this easier still.
// We're keeping things simple for the example.
const variables = { id: id };
const teammate = await post('gateway.com', {
    query: query,
    variables: variables,
});

Here, the federated graph is providing a few benefits. First, you can look at it as a replacement for calling three APIs, and stitching the responses together. This in and of itself is a pretty nice thing to offload, as it makes the calling code far simpler and more expressive.

Additionally, as long as some subgraph serves the desired fields, the client no longer needs to know where those fields come from, which decouples frontend and backend API contracts.

Type Safety

This unified schema also provides centralized type safety. If one of the backends tries to populate an Int field with a non-numeric String, for example, GraphQL returns an error instead of passing the bad value along. You can rest assured that whatever types are laid out in the schema are exactly the types your clients will receive.

This was especially useful for us, as we had one enum, employmentType, which was specified as a slightly different set of values in a couple of different systems. By specifying it as a GraphQL enum type, we were able to enforce consistency throughout all our systems.

enum employmentType {
    FULL_TIME
    PART_TIME
    CONTRACTOR
    VOLUNTEER
    FURLOUGHED
}
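
A subgraph sitting in front of an older system then has to map that system's values onto the shared enum before returning them. The sketch below shows one way this might look; the legacy spellings in `LEGACY_ALIASES` are hypothetical, since the actual values used by the original systems are not listed here, and `toEmploymentType` is an illustrative helper.

```javascript
// Values copied from the employmentType enum above.
const EMPLOYMENT_TYPES = new Set(["FULL_TIME", "PART_TIME", "CONTRACTOR", "VOLUNTEER", "FURLOUGHED"]);

// Hypothetical legacy spellings from older systems.
const LEGACY_ALIASES = new Map([
    ["full-time", "FULL_TIME"],
    ["Part Time", "PART_TIME"],
]);

// Maps a raw backend value onto the shared enum, rejecting anything unknown.
function toEmploymentType(raw) {
    const candidate = LEGACY_ALIASES.get(raw) ?? raw;
    if (!EMPLOYMENT_TYPES.has(candidate)) {
        throw new Error(`Unknown employment type: ${raw}`);
    }
    return candidate;
}
```

Failing loudly on an unknown value mirrors what the schema does: a value outside the enum never reaches a client.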

What is the business benefit?

Oftentimes, business decisions lie hidden within complex applications. For Source Allies, it was the question of “What is a Teammate?” We had different definitions of a Teammate across many applications: in some, a Teammate was anyone with an account in the HR system; in others, a Teammate was someone who had logged hours in our time-tracking system. This led to data inconsistencies that were incredibly difficult to inspect and reason about.

Finding errors of this nature any other way would have required a lengthy audit with no long-term guarantees: new errors could arise, and we would be no closer to resolving the root problem. Adopting GraphQL Federation required Source Allies to step back and handle these inconsistencies from a bird’s-eye view. We solidified the meaning of a Teammate to the point that no deviations were possible within the purview of the federation.

When we federated more systems into our supergraph, we uncovered many faulty assumptions and data consistency errors that would have otherwise lain hidden. We can rest assured now, knowing firmly what makes a Teammate, and that all systems must play by those rules.

Conclusion

Through the adoption of GraphQL Federation, we have transformed the way our services communicate and collaborate. It helps teams work independently while seamlessly integrating their services. By leveraging a unified schema and standardized data representations, we have unlocked new levels of efficiency and scalability within our ecosystem. As our architecture continues to evolve, GraphQL Federation remains a cornerstone of our strategy, enabling innovation with confidence and agility.