Reputation: 4187
I'm parsing a docx file and got the error "Undefined namespace prefix". To solve this, I decided to declare the namespace, which doesn't exist in the root tag.
To do that, I need to insert an "xmlns:wp" attribute with a "(url)" value into the root tag.
How can I do this using the Nokogiri gem?
If it's easier with another gem, just show me how. I'm adding the attribute to the XML element using this code:
doc = Nokogiri::XML(File.open(path_to_file))
doc.xpath('w:document').each do |document|
document.set_attribute('xmlns:wp', 'http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing')
end
This gives me the element with the added attribute. I could then rewrite the whole file, but maybe there is another way to solve my problem?
Upvotes: 1
Views: 1242
Reputation: 303244
You need to provide more details about what you're doing that causes that error. Merely parsing an XML document won't make Nokogiri raise it; undeclared namespace prefixes are ignored and discarded instead:
require 'nokogiri'
# Here's a valid namespace used on an element:
doc = Nokogiri.XML("<root xmlns:a='hi'><a:foo/></root>")
puts doc.root
#=> <root xmlns:a="hi">
#=> <a:foo/>
#=> </root>
# Here's a namespace that gets ignored
doc = Nokogiri.XML("<root xmlns:a='hi'><zzz:foo/></root>")
puts doc.root
#=> <root xmlns:a="hi">
#=> <foo/>
#=> </root>
p doc.at('foo').namespace
#=> nil
# It's OK to declare namespaces later on
doc = Nokogiri.XML("<root><kid xmlns:zzz='yo'><zzz:foo/></kid></root>")
puts doc.root
#=> <root>
#=> <kid xmlns:zzz="yo">
#=> <zzz:foo/>
#=> </kid>
#=> </root>
Parsing an XML doc that uses namespaces that are never declared irrevocably loses them. So even though you can set an attribute on any node like so…
mynode["xmlns:yay"]="someurl"
…it won't help with nodes you already parsed that referenced that namespace name.
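Now, perhaps your problem is that you're searching for nodes using a namespace that is declared on a descendant element rather than on the root? By default, Nokogiri only registers the namespaces declared on the root element for XPath queries: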
p doc.at_xpath('//zzz:foo')
#=> in `evaluate': Undefined namespace prefix: //zzz:foo (Nokogiri::XML::XPath::SyntaxError)
If so, you have to tell Nokogiri about the namespace:
p doc.at_xpath('//zzz:foo','zzz'=>'yo')
#=> #<Nokogiri::XML::Element:0x80691894 name="foo" namespace=#<Nokogiri::XML::Namespace:0x806917cc prefix="zzz" href="yo">>
Alternatively, if you're only parsing a document (not to emit it later as XML) and you don't have any name conflicts, you can cheat and just throw out all namespaces for simpler queries:
p doc.at_xpath('//foo')
#=> nil
doc.remove_namespaces!
p doc.at_xpath('//foo')
#=> #<Nokogiri::XML::Element:0x805fa2dc name="foo">
Upvotes: 1