Reputation: 4237
I want to create an RSS feed generator. The generator is supposed to read data from my database and generate RSS feeds from it. I want to write a script that automatically generates the XML file for validation.
So if a typical XML file for an RSS feed looks like this:
<item>
<title>Entry Title</title>
<link>Link to the entry</link>
<guid>http://example.com/item/123</guid>
<pubDate>Sat, 9 Jan 2010 16:23:41 GMT</pubDate>
<description><![CDATA[ This is the description. ]]></description>
</item>
I want the script to automatically replace the fields between the <item> and </item> tags.
Similarly for all the tags. The values for the tags will be fetched from the database, so I will query for them. My app is built in Django.
I am looking for suggestions on how to do this in Python. I am also open to alternatives if my idea is too vague.
Upvotes: 7
Views: 17408
Reputation: 1935
A more modern option which has not been mentioned yet is the python-feedgen library. It can generate web feeds in both ATOM and RSS format and supports extensions (e.g. podcasts).
The latest documentation is available on their website and an extract for v0.9 (which may be outdated at the time you are reading this) has been included here for reference:
from feedgen.feed import FeedGenerator
fg = FeedGenerator()
fg.id('http://lernfunk.de/media/654321')
fg.title('Some Testfeed')
fg.author( {'name':'John Doe','email':'john@example.de'} )
fg.link( href='http://example.com', rel='alternate' )
fg.logo('http://ex.com/logo.jpg')
fg.subtitle('This is a cool feed!')
fg.link( href='http://larskiesow.de/test.atom', rel='self' )
fg.language('en')
fe = fg.add_entry()
fe.id('http://lernfunk.de/media/654321/1')
fe.title('The First Episode')
fe.link(href="http://lernfunk.de/feed")
The FeedGenerator's add_entry() method creates a new FeedEntry object, automatically appends it to the feed's internal list of entries, and returns it so that additional data can be added.
After that you can generate either RSS or ATOM by calling the respective method:
atomfeed = fg.atom_str(pretty=True) # Get the ATOM feed as string
rssfeed = fg.rss_str(pretty=True) # Get the RSS feed as string
fg.atom_file('atom.xml') # Write the ATOM feed to a file
fg.rss_file('rss.xml') # Write the RSS feed to a file
Upvotes: 1
Reputation: 4237
Since this is Python, PyRSS2Gen is a good choice. It's easy to use and generates the XML nicely. https://pypi.python.org/pypi/PyRSS2Gen
Upvotes: 10
Reputation: 3098
This rough code uses a very useful library called BeautifulSoup. It is a parser for HTML/XML markup that provides a Python object model, which can be used to extract or change information in XML, or to create XML by building up the object model.
The code below works by changing the XML tags of the template on pastebin. It copies the template "item" tag, clears its child tags, and fills them with the matching dictionary entries, which would presumably come from a database.
# Import System libraries
from copy import copy
from copy import deepcopy

# Import Custom libraries
from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup, Tag, NavigableString, CData

def gen_xml(record):
    # Template <item>; its child tags are cleared and refilled below
    description_blank_str = \
'''
<item>
<title>Entry Title</title>
<link>Link to the entry</link>
<guid>http://example.com/item/123</guid>
<pubDate>Sat, 9 Jan 2010 16:23:41 GMT</pubDate>
<description><![CDATA[ This is the description. ]]></description>
</item>
'''
    description_xml_tag = BeautifulStoneSoup(description_blank_str)
    # Pairs of (record key, predicate that finds the matching tag)
    key_pair_locations = \
    [
        ("title", lambda x: x.name == u"title"),
        ("link", lambda x: x.name == u"link"),
        ("guid", lambda x: x.name == u"guid"),
        ("pubDate", lambda x: x.name == u"pubdate"),
        ("description", lambda x: x.name == u"description")
    ]
    tmp_description_tag_handle = deepcopy(description_xml_tag)
    for (key, location) in key_pair_locations:
        search_list = tmp_description_tag_handle.findAll(location)
        if(not search_list):
            continue
        tag_handle = search_list[0]
        tag_handle.clear()
        if(key == "description"):
            # The description is wrapped in a CDATA section
            tag_handle.insert(0, CData(record[key]))
        else:
            tag_handle.insert(0, record[key])
    return tmp_description_tag_handle

test_dict = \
{
    "title" : "TEST",
    "link" : "TEST",
    "guid" : "TEST",
    "pubDate" : "TEST",
    "description" : "TEST"
}
print gen_xml(test_dict)
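For readers on Python 3, where the BeautifulSoup 3 package imported above is not available, roughly the same item can be built with the standard library's xml.etree.ElementTree. This is only a sketch under a few assumptions: the build_item helper and the sample record are hypothetical, and ElementTree escapes special characters in the description instead of emitting a CDATA section, which should be equivalent for feed readers.

```python
import xml.etree.ElementTree as ET

# Child tags of <item>, in the order they should appear.
ITEM_FIELDS = ("title", "link", "guid", "pubDate", "description")

def build_item(record):
    """Build an <item> element from a dict (hypothetical helper)."""
    item = ET.Element("item")
    for field in ITEM_FIELDS:
        child = ET.SubElement(item, field)
        # ElementTree escapes &, < and > when the tree is serialized.
        child.text = str(record[field])
    return item

# Made-up record standing in for a database row.
record = {
    "title": "Tom & Jerry",
    "link": "http://example.com/item/123",
    "guid": "http://example.com/item/123",
    "pubDate": "Sat, 9 Jan 2010 16:23:41 GMT",
    "description": "A <b>bold</b> description.",
}

item_xml = ET.tostring(build_item(record), encoding="unicode")
print(item_xml)
```

Building the tree from scratch avoids copying and clearing a template, at the cost of losing the literal CDATA wrapper.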
Upvotes: 1
Reputation: 52000
What have you tried so far?
A very basic approach would be to use the Python DB API to query the database and do some simple formatting with str.format (I'm typing directly into SO, so beware of typos):
>>> RSS_HEADER = """
<whatever><you><need>
"""
>>> RSS_FOOTER = """
</need></you></whatever>
"""
>>> RSS_ITEM_TEMPLATE = """
<item>
<title>{0}</title>
<link>{1}</link>
</item>
"""
>>> print RSS_HEADER
>>> for row in c.execute('SELECT title, link FROM MyThings ORDER BY TheDate'):
...     print RSS_ITEM_TEMPLATE.format(*row)
>>> print RSS_FOOTER
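One caveat with plain string formatting is that titles or links containing &, < or > would produce invalid XML. Below is a Python 3 sketch of the same idea with escaping added. It assumes an SQLite database; the render_items helper, the in-memory table, and the sample rows are made up for illustration, while the table and column names follow the query above.

```python
import sqlite3
from xml.sax.saxutils import escape

RSS_ITEM_TEMPLATE = """<item>
  <title>{title}</title>
  <link>{link}</link>
</item>"""

def render_items(conn):
    """Return one escaped <item> block per row (hypothetical helper)."""
    rows = conn.execute("SELECT title, link FROM MyThings ORDER BY TheDate")
    return [
        RSS_ITEM_TEMPLATE.format(title=escape(title), link=escape(link))
        for title, link in rows
    ]

# In-memory database with made-up rows, standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyThings (title TEXT, link TEXT, TheDate TEXT)")
conn.executemany(
    "INSERT INTO MyThings VALUES (?, ?, ?)",
    [
        ("Q&A <part 2>", "http://example.com/2?a=1&b=2", "2010-01-10"),
        ("First post", "http://example.com/1", "2010-01-09"),
    ],
)

items = render_items(conn)
print("\n".join(items))
```

The header and footer can still be printed around the joined items exactly as in the snippet above.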
Upvotes: 3