Reputation: 15371
Having great luck working with single-source feed parsing in Universal Feed Parser, but now I need to run multiple feeds through it and generate chronologically interleaved output (not RSS). Seems like I'll need to iterate through URLs and stuff every entry into a list of dictionaries, then sort that by the entry timestamps and take a slice off the top. That seems do-able, but pretty expensive resource-wise (I'll cache it aggressively for that reason).
Just wondering if there's an easier way - an existing library that works with feedparser to do simple aggregation, for example. Sample code? Gotchas or warnings? Thanks.
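The collect-sort-slice plan you describe is straightforward to sketch. In real code each feed's entries would come from `feedparser.parse(url).entries`; here sample dicts stand in for fetched entries so the merging logic is self-contained, and the assumption that every entry carries a `published_parsed` timestamp (feedparser's parsed `time.struct_time`, which not all feeds provide) is baked in:

```python
import time

def interleave(feeds, limit=10):
    """Merge entry lists from several feeds, newest first."""
    all_entries = []
    for entries in feeds:
        all_entries.extend(entries)
    # struct_time compares like a tuple, so sorting on it orders
    # entries chronologically; reverse=True puts the newest first.
    all_entries.sort(key=lambda e: e["published_parsed"], reverse=True)
    # Take a slice off the top, as described in the question.
    return all_entries[:limit]

# Stand-ins for feedparser.parse(url).entries from two feeds.
feed_a = [{"title": "A1", "published_parsed": time.gmtime(1000)},
          {"title": "A2", "published_parsed": time.gmtime(3000)}]
feed_b = [{"title": "B1", "published_parsed": time.gmtime(2000)}]

latest = interleave([feed_a, feed_b], limit=2)
print([e["title"] for e in latest])  # ['A2', 'B1']
```

One gotcha: entries missing `published_parsed` (it can be `None`) will break the sort, so in practice you'd filter them out or fall back to `updated_parsed` first.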
Upvotes: 0
Views: 1337
Reputation: 33200
There is already a suggestion here to store the data in a database, e.g. via bsddb.btopen()
or any RDBMS.
Take a look at heapq.merge()
and bisect.insort(),
or use one of the B-tree implementations, if you'd like to merge the data in memory.
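A minimal sketch of both stdlib tools, assuming each feed's entries are already sorted newest-first (feeds normally arrive in roughly that order; the `(timestamp, title)` tuples are placeholders for real entries):

```python
import heapq
import bisect

# Two pre-sorted (newest-first) entry lists from different feeds.
feed_a = [(3000, "A2"), (1000, "A1")]
feed_b = [(2000, "B1")]

# heapq.merge lazily interleaves pre-sorted iterables without
# building one big list first; reverse=True keeps newest-first order.
merged = list(heapq.merge(feed_a, feed_b, key=lambda e: e[0], reverse=True))
print(merged)  # [(3000, 'A2'), (2000, 'B1'), (1000, 'A1')]

# bisect.insort is the alternative when entries trickle in one at a
# time: it keeps a single ascending-sorted list as you insert.
timeline = [1000, 2000, 3000]
bisect.insort(timeline, 2500)
print(timeline)  # [1000, 2000, 2500, 3000]
```

heapq.merge avoids the full concatenate-and-sort pass from the question, which helps once the combined entry count gets large.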
Upvotes: 1
Reputation: 15824
You could throw the feeds into a database and then generate a new feed from this database.
Consider looking into two feedparser-based RSS aggregators, Planet Feed Aggregator and FeedJack (Django-based), or at least at how they solve this problem.
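A hypothetical sketch of the database route with stdlib sqlite3 (the `entries` table layout and the integer `published` column are assumptions, not how Planet or FeedJack actually model feeds): store every fetched entry, then query them back in timestamp order to render the merged feed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE entries (
    feed TEXT, title TEXT, published INTEGER,
    PRIMARY KEY (feed, title))""")

# Rows would come from feedparser on each polling run; the primary
# key plus INSERT OR REPLACE makes repeated polls idempotent.
rows = [("a", "A1", 1000), ("a", "A2", 3000), ("b", "B1", 2000)]
conn.executemany("INSERT OR REPLACE INTO entries VALUES (?, ?, ?)", rows)

# The newest ten entries across all feeds, ready to template into
# whatever output format you need.
latest = conn.execute(
    "SELECT title FROM entries ORDER BY published DESC LIMIT 10").fetchall()
print([t for (t,) in latest])  # ['A2', 'B1', 'A1']
```

This also gives you the aggressive caching the question mentions for free: the database is the cache, and each poll only adds new rows.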
Upvotes: 2