Reputation: 26565
I need to create a web tool like Google Reader for my college project.
I have 2 questions about it:
1) How does Google Reader track read and unread posts?
2) Does Google Reader save every post in the database, or does it load the feeds at request time?
Upvotes: 1
Views: 834
Reputation: 693
You can use Selfoss, the new multipurpose RSS reader, live stream, mashup, and aggregation web application.
Features:
Web site: http://selfoss.aditu.de/
GitHub: https://github.com/SSilence/selfoss
Upvotes: 0
Reputation: 2406
Not sure whether this still helps, but for others who drop by, I wrote up my thoughts as a detailed design:
Designing a Scalable Google Reader Clone
Upvotes: 2
Reputation: 92792
Re #2: Google has a special RSS crawler bot called FeedFetcher. When you request an RSS feed, the crawler is dispatched to retrieve it and stores the feed in a global (all-user) cache, keyed by URL. The next time the feed is requested (even by a different user, as long as the URL matches), it is loaded from the cache.
I'm not sure what the cache invalidation mechanisms are, but the crawler definitely doesn't revisit the feeds as often as the response's Cache-Control headers would indicate (that's probably a good thing, as many generated RSS feeds send no-cache although they don't change very often). This internal cache doesn't seem to persist for longer than a few hours, though.
(These are hypotheses I formulated some time ago from my RSS feed access logs; I still think they're valid, as I haven't seen any major change in the crawler's behavior since.)
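For a college project, the behavior hypothesized above could be approximated with something like the sketch below. It is not Google's implementation, only an illustration of a cache shared by all users, keyed by feed URL, with a fixed lifetime that ignores the feed's Cache-Control headers. The class name `SharedFeedCache`, the `fetch` callable, and the three-hour default are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class SharedFeedCache:
    """Feed cache shared by all users and keyed by feed URL (hypothetical sketch)."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3 * 3600,  # assumed "a few hours" lifetime
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # function that actually downloads a feed
        self._ttl = ttl_seconds  # fixed lifetime, independent of Cache-Control
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        """Return the feed body, downloading it only if missing or expired."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: any user asking for this URL gets it
        body = self._fetch(url)  # cache miss or stale entry: fetch again
        self._entries[url] = (now, body)
        return body
```

You would pass in your own download function as `fetch` (for example, one built on `urllib.request`) and call `cache.get(url)` from every user's request handler, so that two users subscribed to the same URL trigger only one download per lifetime window.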
Upvotes: 2
Reputation: 61497
Upvotes: 3