Reputation: 139
Based on the example XML file employees.xml below, and using the Ruby Nokogiri gem, I want to open this file, change the building number to 320 and the room number to 99 for Sandra Defoe, and save the changes. What is the recommended way to do it?
<?xml version="1.0" encoding="utf-16"?>
<employees>
<employee id="be129">
<firstname>Jane</firstname>
<lastname>Doe</lastname>
<building>327</building>
<room>19</room>
</employee>
<employee id="be130">
<firstname>William</firstname>
<lastname>Defoe</lastname>
<building>326</building>
<room>14a</room>
</employee>
<employee id="be132">
<firstname>Sandra</firstname>
<lastname>Defoe</lastname>
<building>327</building>
<room>22</room>
</employee>
<employee id="be133">
<firstname>Steve</firstname>
<lastname>Casey</lastname>
<building>327</building>
<room>24</room>
</employee>
</employees>
Upvotes: 0
Views: 136
Reputation: 160551
I'd use this:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="utf-16"?>
<employees>
<employee id="be130">
<firstname>William</firstname>
<lastname>Defoe</lastname>
<building>326</building>
<room>14a</room>
</employee>
<employee id="be132">
<firstname>Sandra</firstname>
<lastname>Defoe</lastname>
<building>327</building>
<room>22</room>
</employee>
</employees>
EOT
first_name = 'Sandra'
last_name = 'Defoe'
node = doc.at("//employee[firstname/text()='%s' and lastname/text()='%s']" % [first_name, last_name])
node.at('building').content = '320'
node.at('room').content = '99'
Which results in:
doc.to_xml
# => "\uFEFF<?xml version=\"1.0\" encoding=\"utf-16\"?>\n" +
# "<employees>\n" +
# " <employee id=\"be130\">\n" +
# " <firstname>William</firstname>\n" +
# " <lastname>Defoe</lastname>\n" +
# " <building>326</building>\n" +
# " <room>14a</room>\n" +
# " </employee>\n" +
# " <employee id=\"be132\">\n" +
# " <firstname>Sandra</firstname>\n" +
# " <lastname>Defoe</lastname>\n" +
# " <building>320</building>\n" +
# " <room>99</room>\n" +
# " </employee>\n" +
# "</employees>\n"
Normally I recommend using CSS selectors because they tend to result in less visual noise. However, CSS doesn't let us peek into the text of nodes, and working around that, while possible, results in even more noise. XPath, on the other hand, can be very noisy, but for this sort of task it's more usable.
XPath is very well documented and figuring out what this is doing should be pretty easy.
The Ruby side of it is using a "format string":
"//employee[firstname/text()='%s' and lastname/text()='%s']" % [first_name, last_name]
similar to
"%s %s" % [first_name, last_name] # => "Sandra Defoe"
"//employee[firstname/text()='%s' and lastname/text()='%s']" % [first_name, last_name]
# => "//employee[firstname/text()='Sandra' and lastname/text()='Defoe']"
Just for thoroughness, here's what I'd do if I wanted to use CSS exclusively:
node = doc.search('employee').find { |employee|
  employee.at('firstname').text == first_name && employee.at('lastname').text == last_name
}
This gets ugly though, because search tells Nokogiri to retrieve all employee nodes from libXML, then Ruby has to walk through them all, telling Nokogiri to tell libXML to look in the child firstname and lastname nodes and return their text. That's slow, especially if there are many employee nodes and the one you want is at the bottom of the file.
The XPath selector tells Nokogiri to pass the search to libXML, which parses it, finds the employee node with the child nodes containing the first and last names, and returns only that node. It's much faster.
Note that at('employee') is equivalent to search('employee').first.
# File 'lib/nokogiri/xml/searchable.rb', line 70
def at(*args)
  search(*args).first
end
Finally, meditate on the difference between NodeSet#text and Node#text, as the first will lead to insanity.
Upvotes: 2
Reputation: 11216
Assume your content is a string:
require 'nokogiri'
xml = %q(
<?xml version="1.0" encoding="utf-16"?>
<employees>
<employee id="be129">
<firstname>Jane</firstname>
<lastname>Doe</lastname>
<building>327</building>
<room>19</room>
</employee>
<employee id="be130">
<firstname>William</firstname>
<lastname>Defoe</lastname>
<building>326</building>
<room>14a</room>
</employee>
<employee id="be132">
<firstname>Sandra</firstname>
<lastname>Defoe</lastname>
<building>327</building>
<room>22</room>
</employee>
<employee id="be133">
<firstname>Steve</firstname>
<lastname>Casey</lastname>
<building>327</building>
<room>24</room>
</employee>
</employees>)
doc = Nokogiri.parse(xml)
This works, but it assumes the combination of first and last name is unique; otherwise it modifies only the first employee that matches both names.
target = doc.css('employee').find do |node|
node.search('firstname').text == 'Sandra' &&
node.search('lastname').text == 'Defoe'
end
target.at_css('building').content = '320'
target.at_css('room').content = '99'
doc # outputs the updated xml
=> <?xml version="1.0"?>
<?xml version="1.0" encoding="utf-16"?>
<employees>
<employee id="be129">
<firstname>Jane</firstname>
<lastname>Doe</lastname>
<building>327</building>
<room>19</room>
</employee>
<employee id="be130">
<firstname>William</firstname>
<lastname>Defoe</lastname>
<building>326</building>
<room>14a</room>
</employee>
<employee id="be132">
<firstname>Sandra</firstname>
<lastname>Defoe</lastname>
<building>320</building>
<room>99</room>
</employee>
<employee id="be133">
<firstname>Steve</firstname>
<lastname>Casey</lastname>
<building>327</building>
<room>24</room>
</employee>
</employees>
Upvotes: 1