I am building a REST API where clients can query user-sent messages, like this:
GET http://example.com/api/v1/messages?from=0&to=100
Response:
[
  {
    "id": 12345,
    "text": "Hello, world!"
  },
  {
    "id": 12346,
    "text": "Testing, testing"
  },
  ...
]
Now I need to include the name of the user who sent each message. I can think of four ways to represent this in the response, but I can’t decide which is best:
Option 1:
[
  {
    "id": 12345,
    "sender_id": 16,
    "text": "Hello, world!"
  }
]
This method is the most efficient at large scale: if the client queries the API many times, it can cache a map of user ID to name and reuse it. However, for one-off queries, it at least doubles the number of API calls the client has to make (one for the message list, and at least one more to look up the name for each sender ID).
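To make the caching idea concrete, here is a minimal client-side sketch. It assumes a second lookup call exists; `fetch_user_name` is a hypothetical stand-in for it (the question does not define a user endpoint), and the second message's sender is made-up sample data.

```python
from typing import Callable, Dict, List


def attach_sender_names(
    messages: List[dict],
    fetch_user_name: Callable[[int], str],
    cache: Dict[int, str],
) -> List[dict]:
    """Return copies of the messages with a sender_name field added.

    fetch_user_name is a hypothetical stand-in for the extra API call;
    it is only invoked for sender IDs that are not yet in the cache.
    """
    enriched = []
    for message in messages:
        sender_id = message["sender_id"]
        if sender_id not in cache:
            cache[sender_id] = fetch_user_name(sender_id)
        enriched.append({**message, "sender_name": cache[sender_id]})
    return enriched


# Fake lookup that records every call, so the saving is visible.
lookup_calls: List[int] = []


def fake_fetch_user_name(user_id: int) -> str:
    lookup_calls.append(user_id)
    return {16: "John Smith"}[user_id]


cache: Dict[int, str] = {}
page = [
    {"id": 12345, "sender_id": 16, "text": "Hello, world!"},
    {"id": 12346, "sender_id": 16, "text": "Testing, testing"},  # sample sender
]
first = attach_sender_names(page, fake_fetch_user_name, cache)
second = attach_sender_names(page, fake_fetch_user_name, cache)
```

With a warm cache, the second call performs no lookups at all; a one-off client still pays for one lookup per distinct sender.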
Option 2:
[
  {
    "id": 12345,
    "sender_name": "John Smith",
    "text": "Hello, world!"
  }
]
This method is simplest for client consumption, but if the API ever needs to be changed to include the sender ID (e.g. for linking from message to user), I would need to have two “sender” fields in the message object, sender_id and sender_name, which is essentially a worse version of option 3.
Option 3:
[
  {
    "id": 12345,
    "sender": {
      "id": 16,
      "name": "John Smith"
    },
    "text": "Hello, world!"
  }
]
This approach embeds the sender object into each message, which makes it future-proof and only requires a single API call per query. However, it adds a lot of redundant information if there are many messages and few users (for example, querying all the messages sent by a single user).
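As a rough way to gauge that redundancy, the sketch below serializes the same 100 messages from a single sender in two shapes, embedded sender object versus bare `sender_id`, and compares the sizes. The message count and the repeated text are arbitrary sample values, and the sizes are uncompressed.

```python
import json

sender = {"id": 16, "name": "John Smith"}
count = 100  # arbitrary sample size

# Option 3 shape: the full sender object is repeated in every message.
embedded = [
    {"id": 12345 + i, "sender": sender, "text": "Hello, world!"}
    for i in range(count)
]

# Same messages carrying only the sender ID.
by_id = [
    {"id": 12345 + i, "sender_id": sender["id"], "text": "Hello, world!"}
    for i in range(count)
]

embedded_size = len(json.dumps(embedded))
by_id_size = len(json.dumps(by_id))
overhead_per_message = (embedded_size - by_id_size) / count

print(f"embedded: {embedded_size} characters")
print(f"by id:    {by_id_size} characters")
print(f"extra per message: {overhead_per_message:.0f} characters")
```

The overhead grows with every field added to the sender object, so the gap matters more for richer user records than for a bare ID and name.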
Option 4:
{
  "users": [
    {
      "id": 16,
      "name": "John Smith"
    },
    ...
  ],
  "messages": [
    {
      "id": 12345,
      "sender_id": 16,
      "text": "Hello, world!"
    }
  ]
}
This solves the redundancy problem of option 3, but it adds a lot of complexity to the structure of the response object.
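For a sense of how much work that extra structure pushes onto the client, here is a minimal sketch of the join a consumer would have to perform. The function name `join_senders` is hypothetical, and it assumes every `sender_id` in `messages` has a matching entry in `users`.

```python
from typing import Dict, List


def join_senders(response: dict) -> List[dict]:
    """Rebuild option-3-style messages from an option-4-style response."""
    # Index the users once so each message lookup is a dict access.
    users_by_id: Dict[int, dict] = {
        user["id"]: user for user in response["users"]
    }
    return [
        {
            "id": message["id"],
            # Raises KeyError if the server omitted a referenced user.
            "sender": users_by_id[message["sender_id"]],
            "text": message["text"],
        }
        for message in response["messages"]
    ]


response = {
    "users": [{"id": 16, "name": "John Smith"}],
    "messages": [{"id": 12345, "sender_id": 16, "text": "Hello, world!"}],
}
joined = join_senders(response)
```

The join itself is short, but every client has to implement it, and the server must guarantee that the `users` list covers every sender referenced in the same response.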