BogStandard

Reputation: 2342

Preserving whitespace / line breaks with REXML

I'm using Ruby 1.9.3 and REXML to parse an XML document, make a few changes (additions/subtractions), then re-output the file. Within this file is a block that looks like this:

<someElement>
  some.namespace.something1=somevalue1
  some.namespace.something2=somevalue2
  some.namespace.something3=somevalue3
</someElement>

The problem is that after re-writing the file, this block always ends up looking like this:

<someElement>
  some.namespace.something1=somevalue1
  some.namespace.something2=somevalue2 some.namespace.something3=somevalue3
</someElement>

The newline after the second value (but never the first!) has been lost and turned into a space. Later, some other code, which I have no control or influence over, will read this file and depend on those newlines to parse the content properly. Normally in this situation I'd use a CDATA section to preserve the whitespace, but that isn't an option here because the code that parses this data later does not expect one. It's essential that the inner text of this element is preserved exactly as-is.

My read/write code looks like this:

xmlFile = File.open(myFile)
contents = xmlFile.read
xmlDoc = REXML::Document.new(contents, { :respect_whitespace => :all })
xmlFile.close

# ... perform some tasks ...

out = ""
xmlDoc.write(out, 2)
File.open(filePath, "w"){|file| file.puts(out)}

I'm looking for a way to preserve the whitespace of text inside elements when reading and writing a file in this manner using REXML. I've read a number of other questions here on Stack Overflow on this subject, but none that quite replicate this scenario. Any ideas or suggestions are welcome.

Upvotes: 0

Views: 1218

Answers (1)

Darshan Rivka Whittle

Reputation: 34041

I get correct behavior by removing the indent (second) parameter to Document.write():

#xmlDoc.write(out, 2)
xmlDoc.write(out)

That seems like a bug in Document.write() according to my reading of the docs, but if you don't really need to set the indentation, then leaving that off should solve your problem.
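For anyone who wants to check this on their own setup, here is a minimal round-trip sketch. The sample XML and the `<root>` wrapper are made up for illustration and only mirror the block from the question. As far as I can tell, passing an indent of 0 or more makes REXML use its pretty-printing formatter, which reflows text nodes, while omitting the indent writes them back untouched. Treat that explanation as my reading of the behavior rather than a confirmed diagnosis.

```ruby
require "rexml/document"

# Hypothetical sample input mirroring the block from the question.
xml = "<root><someElement>\n" \
      "  some.namespace.something1=somevalue1\n" \
      "  some.namespace.something2=somevalue2\n" \
      "  some.namespace.something3=somevalue3\n" \
      "</someElement></root>"

doc = REXML::Document.new(xml, { :respect_whitespace => :all })

# Remember the element's inner text before writing.
before = doc.root.elements["someElement"].text

# Write WITHOUT the indent argument so text nodes are emitted as-is.
out = String.new
doc.write(out)

# Re-parse the written output and read the same element's text back.
after = REXML::Document.new(out).root.elements["someElement"].text

puts(after == before ? "inner text preserved" : "inner text changed")
```

If you still want indentation for the surrounding markup, you would have to add it some other way, since this approach leaves the document's whitespace exactly as it was parsed.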

Upvotes: 1
