Reputation: 73
There are a number of different websites that let you monitor specific web pages for changes, such as watchthatpage.com or page2rss.com.
I'm interested in how those sites work, that is, how they determine whether a web page has been updated. Do they just copy all the text from the page, store it, and later compare it to the current content of the page? Or do they look for specific HTML elements and compare their values?
Please help me to find the answer.
Upvotes: 7
Views: 217
Reputation: 37819
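There are two ways this can be done, just off the top of my head.
The first is to pull the HTML and do a simple string comparison against the copy you stored on the previous check.
The second way would be to do a HEAD request, which returns only the response headers, and check whether headers such as Last-Modified have changed. See section 9.4 here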
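As a rough sketch of the HEAD approach in Python: the names `fetch_validators` and `validators_changed` are made up for this example, and the choice of `ETag` and `Last-Modified` as the headers to watch is my assumption, not something I know these sites do.

```python
import urllib.request

# Headers a server may send to identify a particular version of a page.
# Which headers to watch is an assumption made for this sketch.
VALIDATORS = ("ETag", "Last-Modified")


def fetch_validators(url, timeout=10.0):
    """Send a HEAD request and return the validator headers (None if absent)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return {name: response.headers.get(name) for name in VALIDATORS}


def validators_changed(previous, current):
    """Return True if any validator present in both snapshots differs."""
    for name in VALIDATORS:
        old, new = previous.get(name), current.get(name)
        if old is not None and new is not None and old != new:
            return True
    # No differing validator found: no evidence of a change.
    return False
```

Many servers omit these headers for dynamically generated pages, so when both snapshots come back empty you would probably have to fall back to downloading the HTML and comparing it.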
Upvotes: 0
Reputation: 30636
I suspect that they store the entire contents and compare against it every time they check. If the contents differ, they send an alert; otherwise they don't.
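A minimal sketch of that store-and-compare loop in Python is below. Note that it swaps in a variation: it keeps a SHA-256 digest of each page rather than the entire contents, which is enough to detect a change but not to show what changed. The `PageWatcher` class and its `check` method are hypothetical names, and fetching the page is left to the caller.

```python
import hashlib


class PageWatcher:
    """Remembers a digest of each page and reports when it changes."""

    def __init__(self):
        # Maps URL -> hex digest of the content seen on the last check.
        self._digests = {}

    def check(self, url, content):
        """Record the content for url; return True if it differs from last time."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self._digests.get(url)
        self._digests[url] = digest
        # The first sighting of a page is not reported as a change.
        return previous is not None and previous != digest
```

You would call `check(url, html)` on a schedule and send the alert whenever it returns True. In practice a raw comparison like this also fires on things like rotating ads or timestamps, so some filtering of the HTML before hashing would likely be needed.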
Upvotes: 0