Reputation: 1
How can I tighten up my XML so that it doesn't have empty lines where I removed some nodes from the document?
Upvotes: 0
Views: 657
Reputation: 160551
Nokogiri gives you the ability to fiddle with the text nodes, i.e., the content between nodes:
require 'nokogiri'
doc = Nokogiri::HTML(
'<p>this
<b>text to remove</b>
text
</p>')
doc.at('b').remove
doc.at('p').content = doc.at('p').text.gsub(/\n\s*\n/, "\n")
puts doc.text
The newlines embedded in the HTML, which produce the separate lines in the file, actually live in the intervening text nodes. So, after removing a tag, you'll end up with consecutive "\n" characters in the surrounding text nodes, possibly separated by other whitespace. A quick gsub can clean those out.
Upvotes: 1