Horse Voice

Reputation: 8338

How to parse a Nokogiri XML Element?

I'm able to narrow in on the relevant area of an HTML document using Nokogiri. I need to extract the href from the Nokogiri object, but I can't figure out how to do this for the life of me. Calling row.css('td > b').to_html gives me the HTML as a string, but I want to get at the link through Nokogiri itself rather than by parsing that string.

"<b><a href=\"/ShowTopic-g293766-i9284-k10224928-Tour_companies_for_botswana-Botswana.html\" onclick=\"setPID(34603)\">\ntour companies for botswana</a></b>"

The Nokogiri equivalent, which I'm unable to extract the URL from, is below:

[#<Nokogiri::XML::Element:0x3fe972a9deb8 name="b" children=[#<Nokogiri::XML::Element:0x3fe972ad90a8 name="a" attributes=[#<Nokogiri::XML::Attr:0x3fe972ad8ff4 name="href" value="/ShowTopic-g317055-i11941-k10224606-United_Expeditions_tour_company_Maun-Maun_North_West_District.html">, #<Nokogiri::XML::Attr:0x3fe972ad8fe0 name="onclick" value="setPID(34603)">] children=[#<Nokogiri::XML::Text:0x3fe972ad8900 "\nUnited Expeditions tour company, Maun">]>]>]

The snippet above is the inspect output of a Nokogiri node set, which I find confusing. I just want to get the href. How the heck do I do this?

Upvotes: 2

Views: 2069

Answers (1)

XYZ

Reputation: 27387

row.css('td > b a').attr('href')

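This should do the job. Note that calling attr('href') on a node set returns a Nokogiri::XML::Attr for the first matching node, so call .value (or .to_s) on it if you need the URL as a plain string. Read more in "How to access attributes using Nokogiri".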

Upvotes: 3
