Jérôme

Reputation: 14664

Computing an ETag for a REST API

We're building REST APIs in which we use ETags for two purposes:

  1. Save bandwidth by allowing the client to avoid reloading a resource (not that important to us)
  2. Address concurrency issues (lost update problem)

From a practical perspective, I'm wondering what to use to compute the ETag.

The first approach (hashing the item itself) seems appropriate for case 2 (concurrency issues). The second approach (hashing the response payload, including metadata and headers) would be appropriate for case 1 (saving bandwidth).
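
To make the difference between the two approaches concrete, here is a minimal sketch. The function names, the `extra` argument and the `data`/`meta` envelope are assumptions made up for illustration, not an actual API:

```python
import hashlib
import json


def item_etag(item: dict) -> str:
    """Approach 1: hash only the stored item."""
    canonical = json.dumps(item, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(canonical).hexdigest() + '"'


def response_etag(item: dict, extra: dict) -> str:
    """Approach 2: hash the whole payload (item plus hypothetical extra data)."""
    payload = {"data": item, "meta": extra}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return '"' + hashlib.sha256(canonical).hexdigest() + '"'


item = {"id": 1, "name": "widget"}

# The item hash ignores everything outside the item.
print(item_etag(item) == item_etag({"name": "widget", "id": 1}))  # True

# The response hash changes as soon as the surrounding data changes.
print(response_etag(item, {"page": 1}) == response_etag(item, {"page": 2}))  # False
```

With the item hash, a change in the extra data goes unnoticed by the ETag; with the response hash, the server needs that extra data again whenever it wants to verify an ETag.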

Putting every bit of the response (including headers) in the ETag seems right, as every change there may be relevant and require the client to refresh its cache. But I don't know how to manage concurrency on PUT or DELETE requests with such an ETag.

So, in practice, should we use the item hash or the response hash, and how can we handle both cases with just one of them?

Upvotes: 2

Views: 2988

Answers (1)

Kevin Christopher Henry

Reputation: 48902

Given your description I think the response hash is the only one that makes sense here.

First, in order to use conditional requests to avoid the lost update problem, the validators need to be strong.

From RFC 7232, Section 3.1: "An origin server MUST use the strong comparison function when comparing entity-tags for If-Match (Section 2.3.2), since the client intends this precondition to prevent the method from being applied if there have been any changes to the representation data."
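
As a rough illustration of the strong comparison required above, here is a minimal sketch. The helper names are my own, and the header parsing is simplified: it splits on commas, which would mis-parse opaque tags that themselves contain a comma.

```python
from typing import Optional, Tuple


def parse_etag(value: str) -> Tuple[bool, str]:
    """Split an entity-tag into (is_weak, opaque_tag)."""
    value = value.strip()
    if value.startswith("W/"):
        return True, value[2:]
    return False, value


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags are strong and identical character by character."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def if_match_passes(header: str, current_etag: Optional[str]) -> bool:
    """Evaluate If-Match; current_etag is None when no representation exists."""
    if current_etag is None:
        return False
    if header.strip() == "*":
        return True
    # Simplification: assumes no commas inside the opaque tags.
    candidates = [part.strip() for part in header.split(",")]
    return any(strong_match(c, current_etag) for c in candidates)


print(strong_match('"abc"', '"abc"'))        # True
print(strong_match('W/"abc"', 'W/"abc"'))    # False: weak tags never match strongly
print(if_match_passes('"x", "abc"', '"abc"'))  # True
```

A weak tag never satisfies If-Match here, which is why the ETag used for lost-update protection has to be a strong one.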

Strong validators can only have the same value when the representations are bit-for-bit identical. But if, as you say, "additional data may matter" beyond the item hash, then you are not in a position to decide on a strong ETag at that time. So you simply could not do an item hash and be consistent with the specification in that case.

Of course, you could decide that additional data does not matter, in which case you could still do the item hash and be consistent with the specification. But that obviates the one downside you gave for the response hash idea ("we can't just pull the item from DB to check the ETag as we don't have the extra data").

Put differently: you need a strong ETag to avoid lost updates, and strong validators must change "whenever a change occurs to the representation data that would be observable in the payload body of a 200 (OK) response to GET." So to construct the ETag you have to know everything you would know to respond to a GET in any case, so there's no downside to doing it in the response layer.
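
A minimal sketch of that idea, assuming a hypothetical in-memory `STORE`, a made-up `render` function standing in for your response layer, and a single strong tag in If-Match (no lists, no `*`):

```python
import hashlib
import json
from typing import Optional, Tuple

# Hypothetical in-memory store standing in for the database.
STORE = {1: {"name": "widget", "price": 10}}


def render(item_id: int) -> bytes:
    """Build the exact bytes a GET would return (hypothetical response layer)."""
    payload = {"data": STORE[item_id], "links": {"self": f"/items/{item_id}"}}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def etag_for(body: bytes) -> str:
    """Strong ETag: a quoted SHA-256 digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def get(item_id: int) -> Tuple[int, str, bytes]:
    body = render(item_id)
    return 200, etag_for(body), body


def put(item_id: int, if_match: Optional[str], new_item: dict) -> Tuple[int, Optional[str]]:
    """Apply the update only if the client's ETag still matches the current GET body."""
    if if_match is None:
        return 428, None  # Precondition Required (RFC 6585)
    current = etag_for(render(item_id))  # re-render, exactly as GET would
    if if_match != current:
        return 412, None  # Precondition Failed: someone else changed it
    STORE[item_id] = new_item
    return 204, etag_for(render(item_id))


status, etag, _ = get(1)
print(put(1, etag, {"name": "widget", "price": 12})[0])  # 204
print(put(1, etag, {"name": "widget", "price": 15})[0])  # 412, the ETag is now stale
```

The PUT handler re-renders the current representation before comparing, so anything that would change the GET body also changes the ETag it checks against.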

Upvotes: 2
