Reputation: 1452
I have a list of 200 RSS feeds that I have to download. It's a continuous process: I have to download every post, nothing can be missed, but there must be no duplicates either. So is the best practice to remember the last update of each feed and check it for changes every x hours? And how do I handle the downloader being restarted? The downloader should remember what it has already downloaded and not download it again...
Is this already implemented somewhere? Or can anyone point me to an article? Thanks
Upvotes: 1
Views: 1388
Reputation: 6043
Typically this is what you'd want to do:
Upvotes: 4
Reputation: 13860
You can use feedparser to parse the feeds and store the maximum published time per feed in a database.
For a simple database you can use shelve, which keeps that state on disk so it survives a restart.
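A minimal sketch of that idea, not a complete downloader. The helper names `fetch_entries` and `select_new` and the tuple layout are made up for illustration; it assumes every entry carries a `published_parsed` or `updated_parsed` date, which not all feeds provide.

```python
import calendar
import shelve


def fetch_entries(url):
    """Return (entry_id, published_timestamp, entry) tuples for one feed."""
    import feedparser  # third-party package: pip install feedparser

    result = []
    for entry in feedparser.parse(url).entries:
        parsed = entry.get("published_parsed") or entry.get("updated_parsed")
        if parsed is None:
            continue  # assumption: undated entries are skipped in this sketch
        entry_id = entry.get("id") or entry.get("link")
        # feedparser gives dates as UTC struct_time; convert to epoch seconds
        result.append((entry_id, calendar.timegm(parsed), entry))
    return result


def select_new(db, feed_url, entries):
    """Return entries newer than the stored maximum and update that maximum.

    `db` is any dict-like store, for example an open shelve database.
    """
    last_seen = db.get(feed_url, 0)
    fresh = [e for e in entries if e[1] > last_seen]
    if fresh:
        db[feed_url] = max(e[1] for e in fresh)
    return fresh
```

Each polling round would open the database with `shelve.open("feed_state")`, loop over the feed URLs, and pass `fetch_entries(url)` to `select_new`. Two caveats: a post published with exactly the same timestamp as the stored maximum would be skipped, so if nothing may be missed it is probably safer to also store the IDs of seen entries; and a real downloader would likely save the posts first and update the stored time only afterwards, so a crash in between cannot lose anything.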
Upvotes: 2