wip: rest response sizes #1

Open
wants to merge 2 commits into
base: master
138 changes: 138 additions & 0 deletions _posts/2019-03-09-rest-response-size.md
@@ -0,0 +1,138 @@
---
layout: post
title: "Myth: REST Response Sizes Are Massive!"
date: 2019-03-04 19:08:05 +0000
categories: jekyll update
---

One of the biggest complaints people have about REST is that response sizes are too big. In reality, that is not a REST problem; it is a problem with how developers design and maintain the endpoint. Let's look at some of the reasons why and how we can mitigate them.
matthewtrask marked this conversation as resolved.

Oh shit, very first thing: maybe you could say "This argument is a huge misnomer because OData, JSON-API, etc. have sparse fieldsets (you can request only the fields you want)", which is almost certainly the inspiration for the GraphQL feature.


## Sending too much data

Developing an API endpoint thoughtfully means deciding what data is, and is not, sent over the wire. While it takes little effort to add a new field to your response object, it does take some thought to decide _if_ the field should be added in the first place. One of the best ways to keep responses small is to embrace Hypermedia in your APIs. Let's use an example of a users response object before we add Hypermedia.

```json
{
  "data": [
    {
      "type": "User",
      "id": 1,
      "attributes": {
        "userId": "11122",
        "name": "Crashy McCrashface",
        "position": "Lead Bike Engineer",
        "yearsAtPosition": 2,
        "previousPosition": "Bike Engineer",
        "managerId": "10109"
      }
    },
    {
      "type": "User",
      "id": 2,
      "attributes": {
        "userId": "11123",
        "name": "Turtle McTurtleface",
        "position": "Lead Swamp Designer",
        "yearsAtPosition": 5,
        "previousPosition": "Junior Swamp Designer",
        "managerId": "10002"
      }
    }
  ]
}
```

Looking at the data above, there is a lot included that doesn't need to be in the response to an HTTP/2 `GET /users` request. Let's simplify the whole thing.

```json
{
  "data": [
    {
      "type": "User",
      "id": 1,
      "attributes": {
        "userId": "11122",
        "name": "Crashy McCrashface"
      },
      "relationships": {
        "user": {
          "links": {
            "self": "https://apisyouwonthate.com/api/users/11122/relationships/user",

I wonder if this is just begging for confusion. JSON:API does not always appear to be the most concise or obvious thing in the world, so trying to use this as an example here might just complicate the explanation. I'm not sure there.

Are we trying to move the manager info out?

Perhaps a better example would be invoices. People often mix invoices with payments and payment attempts, and the status of those payments as they trace through the bank, which is confusing as fuck.

Contributor Author

Yea, I agree.

            "related": "https://apisyouwonthate.com/api/users/11122"
          },
          "data": {
            "type": "User",
            "id": "11122"
          }
        }
      }
    },
    {
      "type": "User",
      "id": 2,
      "attributes": {
        "userId": "11123",
        "name": "Turtle McTurtleface"
      },
      "relationships": {
        "user": {
          "links": {
            "self": "https://apisyouwonthate.com/api/users/11123/relationships/user",
            "related": "https://apisyouwonthate.com/api/users/11123"
          },
          "data": {
            "type": "User",
            "id": "11123"
          }
        }
      }
    }
  ]
}
```

While this response has more structure, the data being sent back has been cut down, and we give the client instructions on where to find the data it may want next. This is called "discoverability". Now, if we make an HTTP/2 `GET /users/11122` request, we would want to return something similar to the first example, like so:

```json
{
  "data": [
    {
      "type": "User",
      "id": 1,
      "attributes": {

If you keep this example then maybe split it out into "positions" or "roles" and explain that the extra split is actually a useful feature: you can have multiple roles at the company, you may switch roles, slowly transitioning from one to the other with overlap, etc. Usually the extra resources provide extra functionality. :)

Contributor Author

I'm going to go with the invoice example, seems more straightforward :)

        "userId": "11122",
        "name": "Crashy McCrashface",
        "position": "Lead Bike Engineer",
        "yearsAtPosition": 2,
        "previousPosition": "Bike Engineer",
        "managerId": "10109"
      }
    }
  ]
}
```
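
To make the discoverability idea a bit more concrete, here is a minimal Python sketch of a client that reads the slim list response and follows the `related` link only when it actually needs the full record. The helper names `related_links` and `fetch_details`, and the injected `fetch` callable, are hypothetical and exist only for illustration; a real client would pass in whatever function performs the HTTP request.

```python
from typing import Callable, Dict

# A trimmed copy of the slim list response shown above (one user only).
USERS_LIST = {
    "data": [
        {
            "type": "User",
            "id": 1,
            "attributes": {"userId": "11122", "name": "Crashy McCrashface"},
            "relationships": {
                "user": {
                    "links": {
                        "self": "https://apisyouwonthate.com/api/users/11122/relationships/user",
                        "related": "https://apisyouwonthate.com/api/users/11122",
                    },
                    "data": {"type": "User", "id": "11122"},
                }
            },
        }
    ]
}


def related_links(list_response: dict) -> Dict[str, str]:
    """Map each userId to the 'related' URL the server advertised for it."""
    return {
        resource["attributes"]["userId"]: resource["relationships"]["user"]["links"]["related"]
        for resource in list_response["data"]
    }


def fetch_details(list_response: dict, user_id: str, fetch: Callable[[str], dict]) -> dict:
    """Follow the advertised link for one user instead of hard-coding a URL.

    `fetch` is a hypothetical stand-in for whatever performs the GET request.
    """
    return fetch(related_links(list_response)[user_id])
```

The point of the design is that the client never builds the detail URL itself; it only uses what the list response handed it, so the extra data costs a request only when someone wants it.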

## Wrongly modeled data

In conjunction with the above, another reason people complain about REST response sizes being so large is that developers have not modeled the data for the requests, meaning they have not decided, endpoint by endpoint, which fields a client actually needs. Let's consider the example above. In it, we are making two requests, `/api/users` and `/api/users/11122`. The first request returns all the resources for the given endpoint, so model your data to return the bare minimum for each one. In this case, that is a userId and a name.

have not modeled the data for the requests

not sure what this means


The idea here is that instead of racing to code, you take a step back and see what data is truly needed for the endpoint. If you have a users page, say an organization chart, is it useful to have data about each person's time at the company? Or their previous titles? That data is not useful in this case, so instead let's model our data to return only the following:

```json
{
  "data": [
    {
      "type": "User",
      "id": 1,
      "attributes": {
        "userId": "11122",
        "name": "Crashy McCrashface",
        "position": "Lead Bike Engineer",
        "manager": "Stowford Turtle"
      }
    }
  ]
}
```

And that's all you need! With hypermedia, the other data is still close by, but now our response is more concise and easier to use.
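
One way to picture "modeling the data per endpoint" is a pair of field lists, one for the collection view and one for the single-resource view. The Python sketch below is only an illustration under assumptions: `USER_RECORD`, `LIST_FIELDS`, `DETAIL_FIELDS` and the helper functions are hypothetical names, and the field choices should be whatever your own views really need.

```python
import json

# Hypothetical full record, as it might come out of a data store.
USER_RECORD = {
    "id": 1,
    "userId": "11122",
    "name": "Crashy McCrashface",
    "position": "Lead Bike Engineer",
    "yearsAtPosition": 2,
    "previousPosition": "Bike Engineer",
    "managerId": "10109",
}

# Assumed per-endpoint field choices; adjust to what each view really needs.
LIST_FIELDS = ("userId", "name")
DETAIL_FIELDS = (
    "userId", "name", "position",
    "yearsAtPosition", "previousPosition", "managerId",
)


def to_resource(record: dict, fields: tuple) -> dict:
    """Build a resource object containing only the requested attributes."""
    return {
        "type": "User",
        "id": record["id"],
        "attributes": {name: record[name] for name in fields},
    }


def list_response(records: list) -> dict:
    """Collection view: the bare minimum for every user."""
    return {"data": [to_resource(r, LIST_FIELDS) for r in records]}


def detail_response(record: dict) -> dict:
    """Single-resource view: the fuller set of attributes."""
    return {"data": [to_resource(record, DETAIL_FIELDS)]}


if __name__ == "__main__":
    slim = json.dumps(list_response([USER_RECORD]))
    full = json.dumps(detail_response(USER_RECORD))
    print(len(slim), "characters for the list view,", len(full), "for the detail view")
```

Keeping the field lists next to each other makes the decision explicit: adding a field to the collection view becomes a deliberate change rather than a side effect of serializing the whole record.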

Maybe we could add that the other benefit of having smaller payloads, with links to grab more data if the client wants it, is that these smaller, more targeted resources might be HTTP cached, meaning when your client goes to fetch more data about Crashy's positions, it's right there in the HTTP client cache. =Insert link to article=.