Simon

Reputation: 25983

How do I turn an RSS feed back into RSS?

According to the feedparser documentation, I can turn an RSS feed into a parsed object like this:

import feedparser
d = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')

but I can't find anything showing how to go the other way. I'd like to be able to manipulate 'd' and then output the result as XML:

print(d.toXML())

but there doesn't seem to be anything in feedparser for going in that direction. Am I going to have to loop through d's various elements, or is there a quicker way?

Upvotes: 5

Views: 925

Answers (4)

dbr

Reputation: 169573

Below is a not hugely elegant, but working, solution. It uses feedparser to parse the feed; you can then modify the entries, and it passes the data to PyRSS2Gen. It preserves most of the feed info (the important bits, anyway); some things will need extra conversion, the parsed_feed['feed']['image'] element for example.

I put this together as part of a little feed-processing framework I'm fiddling about with. It may be of some use (it's pretty short, and should be less than 100 lines of code in total when done).

#!/usr/bin/env python
import datetime

# http://www.feedparser.org/
import feedparser
# http://www.dalkescientific.com/Python/PyRSS2Gen.html
import PyRSS2Gen

# Get the data
parsed_feed = feedparser.parse('http://reddit.com/.rss')

# Modify the parsed_feed data here

items = [
    PyRSS2Gen.RSSItem(
        title = x.title,
        link = x.link,
        description = x.summary,
        guid = x.link,
        # modified_parsed is a time.struct_time; its first six fields are
        # year, month, day, hour, minute, second
        pubDate = datetime.datetime(*x.modified_parsed[:6])
        )

    for x in parsed_feed.entries
]

# make the RSS2 object
# Try to grab the title, link, language etc from the orig feed

rss = PyRSS2Gen.RSS2(
    title = parsed_feed['feed'].get("title"),
    link = parsed_feed['feed'].get("link"),
    description = parsed_feed['feed'].get("description"),

    language = parsed_feed['feed'].get("language"),
    copyright = parsed_feed['feed'].get("copyright"),
    managingEditor = parsed_feed['feed'].get("managingEditor"),
    webMaster = parsed_feed['feed'].get("webMaster"),
    pubDate = parsed_feed['feed'].get("pubDate"),
    lastBuildDate = parsed_feed['feed'].get("lastBuildDate"),

    categories = parsed_feed['feed'].get("categories"),
    generator = parsed_feed['feed'].get("generator"),
    docs = parsed_feed['feed'].get("docs"),

    items = items
)


print(rss.to_xml())
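
One of the conversions mentioned above is the date handling: the entry dates arrive as time.struct_time values, while the code passes datetime objects to PyRSS2Gen. A minimal sketch of a helper for that step follows. The name struct_time_to_datetime and the `default` parameter are made up for illustration, and it assumes a missing date shows up as None, which may not hold for every feed.

```python
import datetime
import time


def struct_time_to_datetime(parsed, default=None):
    """Convert a time.struct_time (or 9-tuple) to a naive datetime.

    Hypothetical helper: returns `default` when `parsed` is None, which is
    assumed here to be how a missing date appears.
    """
    if parsed is None:
        return default
    # The first six fields are year, month, day, hour, minute, second.
    return datetime.datetime(*parsed[:6])


# Usage with a struct_time built from the standard library.
print(struct_time_to_datetime(time.gmtime(0)))
```

With something like this in place, the pubDate argument in the list comprehension could call the helper instead of building the datetime inline, so an entry without a date would not raise an exception.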

Upvotes: 7

Jon Cage

Reputation: 37458

If you're looking to read in an XML feed, modify it and then output it again, there's a page on the main Python wiki indicating that the RSS.py library might support what you're after (it reads most RSS and is able to output RSS 1.0). I've not looked at it in much detail, though.

Upvotes: 1

Jon Cage

Reputation: 37458

As a method of making a feed, how about PyRSS2Gen? :)

I've not played with FeedParser, but have you tried just doing str(yourFeedParserObject)? I've often been surprised by various modules that have __str__ methods that just output the object as text.

[Edit] Just tried the str() method and it doesn't work on this one. Worth a shot though ;-)

Upvotes: 0

Andrea Ambu

Reputation: 39516

from xml.dom import minidom

# Parse a local XML file and write it back out as a string
doc = minidom.parse('./your/file.xml')
print(doc.toxml())

The only problem is that it does not download feeds from the internet.
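
One way around that limitation might be to fetch the feed with the standard library and hand the bytes to minidom.parseString. The sketch below assumes Python 3 (urllib.request); the function names fetch_feed_xml and prefix_titles are made up for illustration, and the edit it performs (prefixing every title) is just a stand-in for whatever manipulation is needed.

```python
from urllib.request import urlopen
from xml.dom import minidom


def fetch_feed_xml(url):
    """Download the raw feed bytes from `url` (not called in the demo below)."""
    with urlopen(url) as response:
        return response.read()


def prefix_titles(xml_data, prefix):
    """Prepend `prefix` to the text of every <title> element and return XML."""
    doc = minidom.parseString(xml_data)
    for title in doc.getElementsByTagName("title"):
        node = title.firstChild
        # Only touch titles whose first child is plain text.
        if node is not None and node.nodeType == node.TEXT_NODE:
            node.data = prefix + node.data
    return doc.toxml()


# Demo on an inline document, so no network access is needed.
sample = '<rss version="2.0"><channel><title>News</title></channel></rss>'
print(prefix_titles(sample, "[edited] "))
```

For a live feed, the output of fetch_feed_xml(url) could be passed straight to prefix_titles. Note that minidom works on the raw XML tree, so unlike feedparser it does not normalise the differences between RSS and Atom.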

Upvotes: 0
