Reputation: 357
The GitHub API specifies two headers that can be used in Conditional Requests, Last-Modified and ETag. Which is the more reliable when querying the API?
For context: when using the API endpoint GET /repos/:owner/:repo/git/trees/:sha on each subdir of a large repo, every response contains the same Last-Modified value (even though the repo on GitHub shows different authored dates), while the ETag value for each is different. I'm wondering if the ETag is a more granular representation of repo content state change (for caching purposes).
Upvotes: 6
Views: 3062
Reputation: 1327224
Reading "ETags: a pretty sweet feature of HTTP 1.1", it says:
"ETags allow dynamic content to be cached using an app-specific "opaque token""
An ETag, or entity tag, is an opaque token that identifies a version of the component served by a particular URL. The token can be anything enclosed in quotes; often it's an md5 hash of the content, or the content's VCS version number.
If the content of the response is the same, the ETag should be identical every time.
I just tested it with https://api.github.com/repos/VonC/gopanic/git/trees/master, and indeed its ETag remains W/"34a03ea1d4dc0b5d533ecf8d36492879" even when called repeatedly.
But if I requested the tree for each subfolder, the ETag would vary, because it represents a signature of each different response body.
The advantage of ETag is that it doesn't depend on a date (whose clock might vary for various reasons), but on the content of the response: if the content is unchanged, the ETag stays the same.
Warning: Brice notes in the comments:
The etag value is specific to the server; for example, GitHub might use the hash of the blob, but maybe not always.
Other providers may not even do that, e.g. Apache was using the inode for etags in the past.
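Since the ETag is opaque, a client does not need to know how the server derives it; it only has to store the value and send it back in If-None-Match. The following is a minimal sketch of that round trip using only Python's standard library. The helper names build_etag_request and fetch_if_changed are hypothetical, and the sketch assumes an unauthenticated GET with no retry or rate-limit handling.

```python
import urllib.error
import urllib.request


def build_etag_request(url, etag=None):
    """Build a GET request that carries If-None-Match when an ETag is cached."""
    req = urllib.request.Request(url)
    if etag:
        # Send the token back verbatim, including the quotes and any W/ prefix.
        req.add_header("If-None-Match", etag)
    return req


def fetch_if_changed(url, etag=None):
    """Return (status, etag, body); body is None when the server answers 304."""
    try:
        with urllib.request.urlopen(build_etag_request(url, etag)) as resp:
            return resp.status, resp.headers.get("ETag"), resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: the cached copy is still valid, keep the old ETag.
            return 304, etag, None
        raise
```

A caller would keep the returned ETag next to the cached body and pass it in on the next call; a 304 status means the cached body can be reused.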
Upvotes: 7
Reputation: 9875
Unfortunately, at least as of Aug 1, 2019, the ETag for the GitHub API's /releases/latest endpoint is not giving consistent values.
Most of the time it will give you a consistent, non-changing ETag value, but randomly (sometimes often) the ETag will be different. See my examples; I ran the same API call a couple of times:
curl -IsL -H "Accept-Encoding: gzip" https://api.github.com/repos/mautic/mautic/releases/latest | grep -P '^(HTTP/|ETag:|X-RateLimit-Remaining|Last-Mod)'
First result:
HTTP/1.1 200 OK
X-RateLimit-Remaining: 56
ETag: W/"b86f015c353e7c1d773f1f781d4cf822"
Last-Modified: Mon, 25 Mar 2019 23:14:15 GMT
Some time later:
HTTP/1.1 200 OK
X-RateLimit-Remaining: 59
ETag: W/"9f670edf97e04c5c23cce74457be61a3"
Last-Modified: Mon, 25 Mar 2019 23:14:15 GMT
Note how Last-Modified stays intact, so doing a conditional GET using only that header will result in better API cacheability than using ETag.
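A conditional GET that relies only on Last-Modified could look roughly like the sketch below, which sends the stored date back in If-Modified-Since. The helper names last_modified_headers and fetch_if_modified_since are hypothetical, and the sketch assumes the server honors If-Modified-Since with a 304 response; it uses only Python's standard library.

```python
import urllib.error
import urllib.request


def last_modified_headers(last_modified=None):
    """Return the conditional header for a cached Last-Modified value, if any."""
    # The date string is sent back exactly as the server originally provided it.
    return {"If-Modified-Since": last_modified} if last_modified else {}


def fetch_if_modified_since(url, last_modified=None):
    """Return (status, last_modified, body); body is None on a 304 response."""
    req = urllib.request.Request(url, headers=last_modified_headers(last_modified))
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.headers.get("Last-Modified"), resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: reuse the cached body and keep the stored date.
            return 304, last_modified, None
        raise
```

With the value from the output above, the second call would pass "Mon, 25 Mar 2019 23:14:15 GMT" as last_modified and reuse the cached body whenever the status is 304.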
Upvotes: 5