Reputation: 3475
I've recently been debugging a problem with an application that parses RSS feeds, and encountered some behaviour in a third-party RSS feed which seems rather odd to me. I would be interested in knowing whether this behaviour is legitimate (per the HTTP RFCs), or even widespread, and also whether it's regarded as good practice.
When I make the following HTTP request to the RSS feed URL, I always get an RSS 2.0 feed with 20 items.
GET /news.rss HTTP/1.1
Host: thesite.com
However, if I make the following conditional GET request using the If-Modified-Since header, I get a smaller RSS file with fewer than 20 items: just those with an RSS <pubDate> equal to or later than the date in the If-Modified-Since header.
GET /news.rss HTTP/1.1
Host: thesite.com
If-Modified-Since: Mon, 10 Feb 2025 00:00:00 GMT
The parser I'm debugging makes the assumption that the conditional GET request will either return the full set of 20 items, or else will return a 304 Not Modified. At some level, I need to fix the parser as it's an unnecessary assumption and the RSS feed is not under my control, but it would be useful to know whether this behaviour is legal.
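For the parser fix itself, one option is to stop treating a 200 response as the complete feed and instead merge whatever items come back into the items already held. The following is only a minimal sketch under assumptions: `merge_response` and `item_key` are hypothetical names, the cache is a plain dict, and items are identified by `<guid>` with `<link>` as a fallback.

```python
import xml.etree.ElementTree as ET


def item_key(item):
    """Identify an item by its <guid>, falling back to <link> (an assumption)."""
    return item.findtext("guid") or item.findtext("link")


def merge_response(status, body, cached):
    """Return an updated copy of the cache (dict of key -> item XML string).

    Makes no assumption about whether a 200 body holds the full feed
    or only the items newer than the If-Modified-Since date.
    """
    if status == 304:
        return dict(cached)  # nothing changed, keep what we have
    if status != 200:
        raise ValueError(f"unexpected status {status}")
    merged = dict(cached)
    for item in ET.fromstring(body).iter("item"):
        key = item_key(item)
        if key is not None:
            # New keys are added; existing keys are overwritten with the fresh copy.
            merged[key] = ET.tostring(item, encoding="unicode")
    return merged
```

With this shape, a full feed, a partial feed, and a 304 all leave the cache in a consistent state, so the parser no longer depends on which of the three the server chooses to send.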
The nearest I can find is §13.1.3 of RFC 9110 which says:
An origin server that evaluates an If-Modified-Since condition SHOULD NOT perform the requested method if the condition evaluates to false; instead, the origin server SHOULD generate a 304 (Not Modified) response, including only those metadata that are useful for identifying or updating a previously cached response.
That's not as clear as I might like – perhaps intentionally – but seems to imply the request will either return the same data as an unconditional request, or else return a 304. Unsurprisingly, the RSS 2.0 spec has nothing to say on the subject.
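To make that reading concrete, here is a rough sketch of how a server following the quoted text might pick a status code for a GET. It is an illustration rather than anyone's actual implementation: `status_for_conditional_get` is a hypothetical name, `last_modified` is assumed to be a timezone-aware datetime, and other preconditions such as If-None-Match are left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_conditional_get(if_modified_since, last_modified):
    """Return 304 if the resource is unchanged since the header date, else 200.

    if_modified_since: raw header value, or None if the header is absent.
    last_modified: timezone-aware datetime of the resource's last change.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # an invalid date means the header is ignored
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
    # HTTP dates have one-second resolution, so drop sub-second precision.
    last_modified = last_modified.replace(microsecond=0)
    return 304 if last_modified <= since else 200
```

In this reading the header only selects between "send the representation" and "send nothing"; it never changes which items the 200 body contains.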
Outside of RSS, I've seen it once or twice in the REST APIs for a piece of accounting software, but not more widely. Is it good practice for an HTTP server to handle conditional requests in this way?
Upvotes: 2
Views: 24
Reputation: 49072
I agree with you: this is surprising, though it doesn't appear to be explicitly forbidden.
This seems to be a clever attempt to let you fetch all entries after a given date, a capability that, as far as I know, the RSS spec does not provide. You'd normally implement that kind of filter through your API (using a query parameter in the URL, perhaps), but here they're reusing the If-Modified-Since header for that purpose.
The specification does allow you to use headers to select a representation of a resource, but I think it's a bad idea to reuse this specific header for this, because it already has a defined, completely different purpose. For one thing, it violates client expectations, as you discovered.
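For comparison, the query-based alternative could look something like the sketch below. Everything in it is hypothetical: the `since` parameter name, the ISO 8601 format for its value, and the `items_for_url` helper are assumptions for illustration, not something this feed or the RSS spec defines.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qs, urlsplit


def items_for_url(url, items):
    """Filter (pub_date, title) pairs by an optional ?since=... parameter.

    pub_date is an RSS-style date string such as
    "Tue, 11 Feb 2025 08:00:00 GMT".
    """
    query = parse_qs(urlsplit(url).query)
    if "since" not in query:
        return list(items)  # no filter requested: return the full feed
    since = datetime.fromisoformat(query["since"][0])
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # assumption: treat as UTC
    return [(d, t) for d, t in items if parsedate_to_datetime(d) >= since]
```

A request for `/news.rss?since=2025-02-10T00:00:00` would then return only the newer items, while a plain `/news.rss` with or without If-Modified-Since keeps its usual meaning.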
Upvotes: 2