Hroft

Reputation: 4187

Add an attribute to an XML tag in an existing XML file

I'm parsing a docx file and got an error: "Undefined namespace prefix". To solve this problem I decided to declare the namespace, which doesn't exist in the root tag.

To do that I need to insert an "xmlns:wp" attribute with a "(url)" value in the root tag.

How can I do this using the Nokogiri gem?

Or, if it is easier with another gem, just show me how. I'm adding the attribute to the XML element using this code:

doc = Nokogiri::XML(File.open(path_to_file))
doc.xpath('w:document').each do |document|
  document.set_attribute('xmlns:wp', 'http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing')
end

and I get the element with the added attribute. I could then rewrite the whole file, but is there another way to solve my problem?

Upvotes: 1

Views: 1242

Answers (1)

Phrogz

Reputation: 303244

You need to provide more details about what you are doing that causes that error. Merely parsing an XML document will not make Nokogiri raise it; Nokogiri instead ignores and discards namespace prefixes that are not declared:

require 'nokogiri'

# Here's a valid namespace used on an element:
doc = Nokogiri.XML("<root xmlns:a='hi'><a:foo/></root>")
puts doc.root
#=> <root xmlns:a="hi">
#=>   <a:foo/>
#=> </root>

# Here's a namespace that gets ignored
doc = Nokogiri.XML("<root xmlns:a='hi'><zzz:foo/></root>")
puts doc.root
#=> <root xmlns:a="hi">
#=>   <foo/>
#=> </root>

p doc.at('foo').namespace
#=> nil

# It's OK to declare namespaces later on
doc = Nokogiri.XML("<root><kid xmlns:zzz='yo'><zzz:foo/></kid></root>")
puts doc.root
#=> <root>
#=>   <kid xmlns:zzz="yo">
#=>     <zzz:foo/>
#=>   </kid>
#=> </root>

Parsing an XML doc that uses namespaces that are never declared irrevocably loses them. So even though you can set an attribute on any node like so…

mynode["xmlns:yay"]="someurl"

…it won't help with nodes you already parsed that referenced that namespace name.

Now, perhaps your problem is that you're searching for nodes by namespaces declared later on?

p doc.at_xpath('//zzz:foo')
#=> in `evaluate': Undefined namespace prefix: //zzz:foo (Nokogiri::XML::XPath::SyntaxError)

If so, you have to tell Nokogiri about the namespace:

p doc.at_xpath('//zzz:foo', 'zzz' => 'yo')
#=> #<Nokogiri::XML::Element:0x80691894 name="foo" namespace=#<Nokogiri::XML::Namespace:0x806917cc prefix="zzz" href="yo">>

Alternatively, if you're only parsing a document (not to emit it later as XML) and you don't have any name conflicts, you can cheat and just throw out all namespaces for simpler queries:

p doc.at_xpath('//foo')
#=> nil

doc.remove_namespaces!
p doc.at_xpath('//foo')
#=> #<Nokogiri::XML::Element:0x805fa2dc name="foo">

Upvotes: 1
