Indradhanush Gupta
Indradhanush Gupta

Reputation: 4237

How to dynamically generate XML file for RSS Feed?

I want to create a RSS Feed generator. The generator is supposed to read data from my database and generate RSS feeds for them. I want to write a script that would automatically generate the xml file for validation.

So if a typical xml file for RSS feed looks like this:

  <item>
  <title>Entry Title</title>
  <link>Link to the entry</link>
  <guid>http://example.com/item/123</guid>
  <pubDate>Sat, 9 Jan 2010 16:23:41 GMT</pubDate>
  <description>[CDATA[ This is the description. ]]</description>
  </item>

I want the script to replace automatically the field between the <item> and the </item> tag. Similarly for all the tags. The value for the tags will be fetched from the database. So I will query for it. My app has been designed in Django.

I am looking for suggestions as to how to do this in python. I am also open to any other alternatives if my idea is vague.

Upvotes: 7

Views: 17408

Answers (4)

creme332
creme332

Reputation: 1935

A more modern option which has not been mentioned yet is the python-feedgen library. It can be used to generate web feeds in both ATOM and RSS format and has support for extensions (eg. Podcasts).

The latest documentation is available on their website and an extract for v0.9 (which may be outdated at the time you are reading this) has been included here for reference:

Create a feed

from feedgen.feed import FeedGenerator

fg = FeedGenerator()
fg.id('http://lernfunk.de/media/654321')
fg.title('Some Testfeed')
fg.author( {'name':'John Doe','email':'[email protected]'} )
fg.link( href='http://example.com', rel='alternate' )
fg.logo('http://ex.com/logo.jpg')
fg.subtitle('This is a cool feed!')
fg.link( href='http://larskiesow.de/test.atom', rel='self' )
fg.language('en')

Add feed entries

fe = fg.add_entry()
fe.id('http://lernfunk.de/media/654321/1')
fe.title('The First Episode')
fe.link(href="http://lernfunk.de/feed")

The FeedGenerator's method add_entry() will generate a new FeedEntry object, automatically append it to the feeds internal list of entries and return it, so that additional data can be added.

Generate feed

After that you can generate both RSS or ATOM by calling the respective method:

atomfeed = fg.atom_str(pretty=True) # Get the ATOM feed as string
rssfeed  = fg.rss_str(pretty=True) # Get the RSS feed as string
fg.atom_file('atom.xml') # Write the ATOM feed to a file
fg.rss_file('rss.xml') # Write the RSS feed to a file

Upvotes: 1

Indradhanush Gupta
Indradhanush Gupta

Reputation: 4237

Since this is python. Its good to be using PyRSS2Gen. Its easy to use and generates the xml really nicely. https://pypi.python.org/pypi/PyRSS2Gen

Upvotes: 10

dilbert
dilbert

Reputation: 3098

This rough code uses a very useful library called BeautifulSoup. It serves as a parser for HTML/XML markup languages providing a python object model. It can be used to extract/change information from XML or create XML by building up the object model.

This code below works by changing XML tags from a the template on pastebin. It does this by copying the template "item" tag, clearing its child tags and filling them with the suitable dictionary entries which will be presumably sourced from a database.

# Import System libraries
from copy import copy
from copy import deepcopy

# Import Custom libraries
from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup, Tag, NavigableString, CData

def gen_xml(record):

    description_blank_str = \
    '''
    <item>
    <title>Entry Title</title>
    <link>Link to the entry</link>
    <guid>http://example.com/item/123</guid>
    <pubDate>Sat, 9 Jan 2010 16:23:41 GMT</pubDate>
    <description>[CDATA[ This is the description. ]]</description>
    </item>
    '''
    description_xml_tag = BeautifulStoneSoup(description_blank_str)

    key_pair_locations = \
    [
        ("title", lambda x: x.name == u"title"),
        ("link", lambda x: x.name == u"link"),
        ("guid", lambda x: x.name == u"guid"),
        ("pubDate", lambda x: x.name == u"pubdate"),
        ("description", lambda x: x.name == u"description")
    ]

    tmp_description_tag_handle = deepcopy(description_xml_tag)

    for (key, location) in key_pair_locations:
        search_list = tmp_description_tag_handle.findAll(location)
        if(not search_list):
            continue
        tag_handle = search_list[0]
        tag_handle.clear()
        if(key == "description"):
            tag_handle.insert(0, CData(record[key]))
        else:
            tag_handle.insert(0, record[key])

    return tmp_description_tag_handle

test_dict = \
{
    "title" : "TEST",
    "link" : "TEST",
    "guid" : "TEST",
    "pubDate" : "TEST",
    "description" : "TEST"
}
print gen_xml(test_dict)

Upvotes: 1

Sylvain Leroux
Sylvain Leroux

Reputation: 52000

What have you tried so far?

A very basic approach would be to use the Python DB API to query the database and perform some simple formatting using str.format (I type directly in SO -- so beware of typos):

>>> RSS_HEADER = """
<whatever><you><need>
"""
>>> RSS_FOOTER = """
</need></you></whatever>
"""

>>> RSS_ITEM_TEMPLATE = """
<item>
    <title>{0}</title>
    <link>{1}</link>
</item>
"""


>>> print RSS_HEADER

>>> for row in c.execute('SELECT title, link FROM MyThings ORDER BY TheDate'):
        print RSS_ITEM_TEMPLATE.format(*row)

>>> print RSS_FOOTER

Upvotes: 3

Related Questions