Reputation: 26565

How to create a Google Reader?

I need to create a web tool like Google Reader for my college project.

I have two questions about it:

1) How does Google Reader track read and unread posts?

2) Does Google Reader save every post in the database, or does it load the feeds on demand?

Upvotes: 1

Answers (4)

Ahmad Azimi

Reputation: 693

You can use selfoss, the new multipurpose RSS reader, live stream, mashup, and aggregation web application.

Features:

web-based RSS reader
universal aggregator
open source and free
easily extensible with an open plugin system (write your own data connectors)
mobile support (Android, iOS, iPad)
use selfoss to live stream and collect all your posts, tweets, feeds in one place
lightweight PHP application with less than 2 MB
supports MySQL, PostgreSQL, and SQLite databases
OPML import
easy installation: upload and run
RESTful JSON API

Web site: http://selfoss.aditu.de/

GitHub: https://github.com/SSilence/selfoss

Upvotes: 0

sangupta

Reputation: 2406

Not sure if this helps now, but for others who drop by, I jotted down my thoughts as a detailed design:

Designing a Scalable Google Reader Clone

Upvotes: 2

Piskvor left the building

Reputation: 92792

Re #2: Google has a special RSS crawler bot called FeedFetcher. When you request an RSS feed, the crawler is dispatched to retrieve it and stores the feed in a global (all-user) cache, keyed by URL. The next time the feed is requested (even by a different user, as long as the URL matches), it is loaded from the cache.

I'm not sure what the cache invalidation mechanisms are, but the crawler definitely doesn't revisit feeds as often as the response's Cache-Control headers would indicate (that's probably a good thing, as many generated RSS feeds send no-cache even though they rarely change). This internal cache doesn't seem to persist for longer than a few hours, though.

(These are hypotheses I formulated some time ago from my RSS feed access logs; I still think they're valid, as I haven't seen any major change in the crawler's behavior since.)
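
For a college project, the behavior described above could be approximated with a small shared cache keyed by feed URL. This is only a sketch of the idea, not how Google actually does it: the `FeedCache` class, the `fetch` callable, and the 3-hour default lifetime are all assumptions made for illustration (the lifetime is a guess based on the "few hours" observation).

```python
import time
from typing import Callable, Dict, Tuple


class FeedCache:
    """Shared (all-user) cache of raw feed bodies, keyed by feed URL.

    Hypothetical helper: `fetch` is whatever function downloads a feed,
    and `ttl_seconds` is an assumed lifetime, not a known Google value.
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3 * 3600,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # url -> (time the feed was fetched, feed body)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        """Return the cached feed for `url`, refetching if missing or expired."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: same URL, any user
        body = self._fetch(url)
        self._entries[url] = (now, body)
        return body
```

Because the key is the URL alone, two users subscribed to the same feed share one cached copy, and the feed's own Cache-Control headers are deliberately ignored in favor of a fixed lifetime.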

Upvotes: 2

Femaref

Reputation: 61497

1) Assign a hash to each feed post (i.e. date + url + ??? = hash that identifies a single post).
2) It loads them on the fly, would be my guess; maybe it caches a limited number per user.
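
A rough sketch of the hash idea, assuming the post's date and URL are enough to identify it (whatever else should go into the hash is left open here). The names `post_id` and `ReadTracker` are made up for this example, and the in-memory dictionary stands in for what would be a database table in a real project.

```python
import hashlib
from typing import Dict, List, Set


def post_id(post_url: str, published: str) -> str:
    """Derive a stable identifier for one post from its URL and date."""
    raw = "\n".join((post_url, published))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ReadTracker:
    """Per-user set of post hashes that have been marked as read."""

    def __init__(self) -> None:
        # user name -> set of post ids the user has read
        self._read: Dict[str, Set[str]] = {}

    def mark_read(self, user: str, pid: str) -> None:
        self._read.setdefault(user, set()).add(pid)

    def is_read(self, user: str, pid: str) -> bool:
        return pid in self._read.get(user, set())

    def unread(self, user: str, pids: List[str]) -> List[str]:
        """Filter a freshly loaded feed down to the posts not yet read."""
        return [p for p in pids if not self.is_read(user, p)]
```

With this split, the posts themselves can be loaded on the fly from the feed, and only the small per-user set of hashes needs to be stored.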

Upvotes: 3
