anirudha Gupta


Parsing application/atom+xml in an HTML page

Most blogs advertise their RSS feed with a tag like this:

<link rel="alternate" type="application/rss+xml" title="MyBlog RSS Feed" href="http://feeds.feedburner.com/MyBlog" />

Do you know of a regex that extracts the feed URL from this tag?

<link rel="alternate" type="application/rss+xml" title="MyBlog RSS Feed" href="http://feeds.feedburner.com/MyBlog" />

Upvotes: 0

Views: 625

Answers (1)

Welbog

Reputation: 60418

Use an XPath query like this one:

//link[@type='application/rss+xml']/@href

It'll pull out every RSS feed URL on the page. Never parse XML or HTML with regular expressions, ever. XPath is designed specifically for querying XML and HTML documents, and it's available in nearly every technology stack, including .NET.

XML is not a regular language, so regular expressions are the wrong tool for parsing it.
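As a rough illustration of the XPath approach, here is a minimal Python sketch using only the standard library. It rests on a few assumptions that are not from the question: the page is well-formed XHTML (real-world HTML often is not, in which case an HTML-aware parser would be needed), the elements carry no XML namespace, and both `application/rss+xml` and `application/atom+xml` count as feed types. `xml.etree.ElementTree` supports only a subset of XPath and cannot select `@href` directly, so the sketch matches the `link` elements and then reads the attribute. The names `find_feed_urls`, `FEED_TYPES`, and `SAMPLE_PAGE` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed set of feed MIME types; the answer's query only covers RSS.
FEED_TYPES = ("application/rss+xml", "application/atom+xml")


def find_feed_urls(xhtml):
    """Return the href of every feed <link> in a well-formed XHTML string."""
    root = ET.fromstring(xhtml)
    urls = []
    for feed_type in FEED_TYPES:
        # ElementTree's XPath subset supports [@attr='value'] predicates,
        # but not attribute selection, so href is read from each element.
        for link in root.findall(f".//link[@type='{feed_type}']"):
            href = link.get("href")
            if href:
                urls.append(href)
    return urls


# Hypothetical sample page built around the tag from the question.
SAMPLE_PAGE = """<html><head>
<title>MyBlog</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<link rel="alternate" type="application/rss+xml" title="MyBlog RSS Feed" href="http://feeds.feedburner.com/MyBlog" />
</head><body><p>Hello</p></body></html>"""

print(find_feed_urls(SAMPLE_PAGE))
# prints ['http://feeds.feedburner.com/MyBlog']
```

With a library that implements full XPath, the query from the answer, `//link[@type='application/rss+xml']/@href`, could be used as written instead of the two-step lookup.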

Upvotes: 6
