Reputation: 11
From the following, I'd like to grab the text from the node containing the xml:lang="en"
attribute.
<li><span class="literal"><span property="dbpedia-owl:abstract" xmlns:dbpedia-owl="http://dbpedia.org/ontology/" xml:lang="en">text</span></span></li>
Currently I'm using:
ns = {"xmlns" => "http://www.w3.org/1999/xhtml"}
ns = {"xml" => "http://www.w3.org/XML/1998/namespace"}
array << doc.xpath("//span[@property='dbpedia-owl:abstract' and xmlns:dbpedia-owl='http://dbpedia.org/ontology/' and @xml:lang='en']").text
I'm not sure if it's my XPath array or namespace declaration that is wrong, but either way I'm not doing something right.
Apologies if this has been asked before, but I couldn't find anything that combines namespaces with multiple attributes, so I may just be combining the solutions I found for those separate problems incorrectly. It might also be an issue with the xmlns:dbpedia-owl value being a URL, but again, I'm not sure.
Upvotes: 0
Views: 391
Reputation: 160631
I'm not at my computer, so I can't test this, but I'd start with something like:
doc.at('span.literal').text
Namespaces are useful, but according to your sample you should be able to grab the text easily.
Upvotes: 1