Markus Krogh

Reputation: 73

Extracting text values from atom feed with Ruby RSS

I'm trying to use RSS::Parser from the Ruby standard library to parse an Atom feed, and it only sort of works.

When I access an extracted field such as .title, it returns <title>The title</title> rather than just The title. If you parse an RSS feed instead, .channel.title returns The title.

Is there any way to use the standard RSS::Parser for Atom feeds, or is this a bug?

I know there are alternatives like Feedzirra, but I would rather use the standard lib.

A quick test to see the problem in Ruby 1.9.3 and 2.0:

require "rss"
require "open-uri" # needed so that open accepts a URL (on Ruby 3.0+ use URI.open instead)
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s #=> "<title>CasaDelKrogh</title>"

Upvotes: 0

Views: 750

Answers (2)

humbroll

Reputation: 347

It's not a bug.

Calling to_s on an RSS::Atom::Feed::Title is close to inspecting it: you get the element serialized back to XML, tag included.

Use feed.title.content if you want the title text without the tag.
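To make the difference concrete, here is a minimal sketch that parses one Atom feed and one RSS 2.0 feed from inline strings and reads the title of each as plain text. The sample XML and the plain_title helper are made up for illustration and are not part of the rss library. It assumes require "rss" works on your Ruby; on Ruby 3.0 and later that means the rss gem must be installed.

```ruby
require "rss"

# Inline sample feeds (invented for this sketch) so nothing is fetched over the network.
ATOM_XML = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <feed xmlns="http://www.w3.org/2005/Atom">
    <id>urn:example:feed</id>
    <title>CasaDelKrogh</title>
    <updated>2013-01-01T00:00:00Z</updated>
    <author><name>Example Author</name></author>
  </feed>
XML

RSS_XML = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <rss version="2.0">
    <channel>
      <title>The title</title>
      <link>http://example.com/</link>
      <description>Example channel</description>
    </channel>
  </rss>
XML

# Hypothetical helper: return the feed title as a plain String for either format.
def plain_title(feed)
  case feed
  when RSS::Atom::Feed then feed.title.content # Atom: title is an element object
  when RSS::Rss        then feed.channel.title # RSS 2.0: title is already a String
  end
end

# Passing false as the second argument skips the library's strict validation,
# so the sketch does not depend on the sample feeds being complete.
atom_feed = RSS::Parser.parse(ATOM_XML, false)
rss_feed  = RSS::Parser.parse(RSS_XML, false)

puts plain_title(atom_feed) # prints CasaDelKrogh
puts plain_title(rss_feed)  # prints The title
```

Branching on the parsed class keeps the calling code independent of the feed format, which may be handy if you have to handle both kinds of feed.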

Upvotes: 2

Arup Rakshit

Reputation: 118271

To get the content of the title, use content instead of to_s:

require "rss"
require "open-uri" # needed so that open accepts a URL (on Ruby 3.0+ use URI.open instead)
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s
# => "<title>CasaDelKrogh</title>"
feed.title.content
# => "CasaDelKrogh"
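The same accessor should apply to the entries of an Atom feed, since each entry title is also an element object rather than a String. The following is only a sketch under that assumption; the feed content is invented sample data, parsed from a string so the example runs offline.

```ruby
require "rss"

# Invented sample feed with two entries.
SAMPLE_ATOM = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <feed xmlns="http://www.w3.org/2005/Atom">
    <id>urn:example:feed</id>
    <title>CasaDelKrogh</title>
    <updated>2013-01-02T00:00:00Z</updated>
    <author><name>Example Author</name></author>
    <entry>
      <id>urn:example:entry-1</id>
      <title>First post</title>
      <updated>2013-01-01T00:00:00Z</updated>
    </entry>
    <entry>
      <id>urn:example:entry-2</id>
      <title>Second post</title>
      <updated>2013-01-02T00:00:00Z</updated>
    </entry>
  </feed>
XML

# false skips the library's strict validation of the sample feed.
feed = RSS::Parser.parse(SAMPLE_ATOM, false)

# Each entry's title is an element object, so ask it for its text content.
entry_titles = feed.entries.map { |entry| entry.title.content }

puts entry_titles.inspect # prints ["First post", "Second post"]
```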

Upvotes: 3
