Reputation: 22975
The RSS feed URL is available in a site's metadata (if the site has a feed). Is there a way to extract the feed URL(s) of a page using the urllib2 or HTMLParser modules? Or is there a better module available?
Thanks.
Upvotes: 1
Views: 214
Reputation: 53819
I prefer lxml. It has a very nice API, and its XPath support makes this fairly simple to accomplish:
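import lxml.html
# Parse the page; url_to_site can be a URL, a filename, or a file-like object
doc = lxml.html.parse(url_to_site)
# Select the href attribute of every <link> element that advertises an RSS feed
feeds = doc.xpath('//link[@type="application/rss+xml"]/@href')  # list of feed URLs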
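Since the question also asks about HTMLParser, here is a rough standard-library sketch of the same idea for comparison, using `html.parser` (the Python 3 name of the HTMLParser module). The names `FeedLinkParser`, `find_feeds`, and `FEED_TYPES` are made up for illustration, and including the Atom MIME type is an assumption that goes beyond the RSS-only XPath above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds (an assumption; adjust to taste).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects the href of every <link> tag whose type marks it as a feed."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        link_type = (attr.get("type") or "").strip().lower()
        href = attr.get("href")
        if link_type in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL, if one was given.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html_text, base_url=""):
    """Return the feed URLs advertised in an HTML document string."""
    parser = FeedLinkParser(base_url)
    parser.feed(html_text)
    return parser.feeds
```

The page text itself could be fetched with `urllib.request.urlopen` (the Python 3 successor to urllib2) and decoded before being passed to `find_feeds`. The lxml version is shorter, so this is probably only worth it when adding a dependency is not an option.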
Upvotes: 2