Reputation: 2485
Do all RSS feeds support HTTP ETags/last-modified headers for indicating whether the feed has been updated?
For feeds which do not include last-modified headers, what is the best way of determining how often the feed updates?
I'm hoping to tailor the number of requests I send to each feed based on its update frequency to cut down on bandwidth (following ~2k feeds...)
Upvotes: 3
Views: 2172
Reputation: 2075
To find new items, compare the items of the retrieved feed with those found earlier. If the items have a GUID, use that for the comparison; otherwise you could combine fields like link + title, or keep an MD5 hash of the complete item.
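A minimal sketch of that comparison, assuming each parsed item is a plain dict with optional `guid`, `link` and `title` keys. The names `item_key` and `new_items` are made up for illustration and do not come from any feed library:

```python
import hashlib


def item_key(item: dict) -> str:
    """Return a stable identity string for one feed item (a plain dict here)."""
    guid = item.get("guid")
    if guid:
        # Preferred: the feed supplies its own unique identifier.
        return "guid:" + guid
    link, title = item.get("link"), item.get("title")
    if link or title:
        # Fallback: combine link + title.
        return "lt:" + (link or "") + "\n" + (title or "")
    # Last resort: MD5 over every field of the item, in a fixed key order.
    blob = "\n".join(f"{k}={item[k]}" for k in sorted(item))
    return "md5:" + hashlib.md5(blob.encode("utf-8")).hexdigest()


def new_items(items, seen: set) -> list:
    """Return the items whose key is not yet in `seen`, recording their keys."""
    fresh = []
    for item in items:
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            fresh.append(item)
    return fresh
```

The `seen` set would need to be persisted per feed between polls; how that is stored is left open here.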
Use this knowledge to adapt the polling interval, like I described in this answer.
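One possible way to adapt the interval, shown only as a sketch: poll sooner after a fetch that found new items and back off after one that did not. The halving and 1.5x factors and the `MIN_INTERVAL`/`MAX_INTERVAL` bounds are arbitrary assumptions, not values taken from the linked answer:

```python
MIN_INTERVAL = 5 * 60        # assumed floor: 5 minutes, in seconds
MAX_INTERVAL = 24 * 60 * 60  # assumed ceiling: 1 day, in seconds


def next_interval(current: float, found_new: bool) -> float:
    """Return the next polling interval in seconds for one feed."""
    if found_new:
        # The feed was active: check again sooner.
        proposed = current / 2
    else:
        # Nothing new: wait longer before the next request.
        proposed = current * 1.5
    # Keep the result inside the assumed bounds.
    return max(MIN_INTERVAL, min(MAX_INTERVAL, proposed))
```

For example, a feed polled hourly that keeps returning nothing new would drift toward one request per day, while an active feed would settle near the floor.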
Upvotes: 2
Reputation: 33012
No, not all feeds support ETag/Last-Modified headers (and unfortunately those headers do not tell you when to fetch a resource, only whether it has changed since your last request).
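For feeds that do send these headers, a conditional GET saves bandwidth: you send back the stored validators and the server answers 304 Not Modified with no body if nothing changed. A rough sketch using only Python's standard library; `conditional_headers` and `fetch_if_changed` are illustrative names, and the 30-second timeout is an assumption:

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None) -> dict:
    """Build request headers from the validators saved on the previous fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None on 304 Not Modified."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # urllib reports 304 as an HTTPError; the feed is unchanged.
            return None, etag, last_modified
        raise
```

A feed that sends neither header simply returns the full body every time, so the item comparison is still needed as a fallback.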
There is no general way of determining when a feed has been updated. However, one of the most popular methods is the PubSubHubbub protocol, which was designed for exactly that. (It actually goes further and sends you the new content in the feed, so you don't even have to fetch it.) The problem is that it's not supported by all feeds out there (up to 30%, depending on the types of feeds you're dealing with: blogs, news sites, e-commerce, etc.).
Another solution is to check http://superfeedr.com (disclaimer: I created that beast :p), because we'll do all the dirty work for you and you can just sit and wait for us to send you the data (using open protocols).
Upvotes: 2