Reputation: 6109
I am currently working on Windows and I have to parse XML from a file.
The issue is that when I parse the root element and get its children via the child
method, I get empty children alongside the real one.
XML.load("my_path\\sof.xml").child
res0: Seq[scala.xml.Node] = List(
, <b/>,
)
This is my XML file:
sof.xml
<a>
<b></b>
</a>
But when I remove every \n and \r from the file, like this:
sof.xml
<a><b></b></a>
I get the following result, which is what I expect:
res0: Seq[scala.xml.Node] = List(<b/>)
My question is: is there an option to read it correctly in the intended, multi-line form?
Upvotes: 1
Views: 342
Reputation: 480
The issue is that the newlines and other whitespace between elements are treated as Text nodes. The scala.xml.Utility.trim(x: Node)
method will remove the unnecessary whitespace:
scala> val a = XML.loadString("""<a>
| <b></b>
| </a>""")
a: scala.xml.Elem =
<a>
<b/>
</a>
scala> scala.xml.Utility.trim(a)
res0: scala.xml.Node = <a><b/></a>
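To make the whitespace-only Text nodes visible, the children can be labelled by kind before and after trimming. This is only a minimal sketch: the object name `WhitespaceNodes` and the helper `describe` are hypothetical names introduced for illustration, and the sample string stands in for the file from the question.

```scala
import scala.xml.{Elem, Node, Text, Utility, XML}

object WhitespaceNodes {
  // Hypothetical helper: label a node so whitespace-only Text nodes stand out.
  def describe(n: Node): String = n match {
    case e: Elem                        => s"Elem(${e.label})"
    case t: Text if t.text.trim.isEmpty => "whitespace Text"
    case other                          => s"${other.getClass.getSimpleName}(${other.text.trim})"
  }

  def main(args: Array[String]): Unit = {
    // Same shape as sof.xml in the question, with real newlines.
    val a = XML.loadString("<a>\n<b></b>\n</a>")

    // Children as parsed: the newlines around <b/> show up as Text nodes.
    println(a.child.map(describe).mkString(", "))

    // Children after Utility.trim: whitespace-only Text nodes are dropped.
    println(Utility.trim(a).child.map(describe).mkString(", "))
  }
}
```

Printing the labels rather than the nodes themselves avoids the confusing REPL output where a newline looks like an empty list entry.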
Note that this differs from the .collect approach if you have actual Text nodes in between elements, e.g.:
scala> val a = XML.loadString("""<a>
| <b>Test </b> Foo
| </a>""")
a: scala.xml.Elem =
<a>
<b>Test </b> Foo
</a>
scala> scala.xml.Utility.trim(a).child
res0: Seq[scala.xml.Node] = List(<b>Test</b>, Foo)
scala> a.child.collect { case e: scala.xml.Elem => e }
res1: Seq[scala.xml.Elem] = List(<b>Test </b>)
With the .collect method, the "Foo" string is excluded from the list of children.
Upvotes: 2
Reputation: 14803
I checked this on a Mac with:
XML.loadString("""<a>
 <b></b>
</a>""").child
This results in the same behavior, which I do not understand either.
However, this will fix it in your code:
import scala.xml.{Elem, XML}
XML.loadString("""<a>
 <b></b>
</a>""").child
.collect { case e: Elem => e }
This will eliminate the xml.Text nodes.
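Since the question reads from a file rather than a string, the same filtering could be applied after loading, roughly as sketched here. The object `ElementChildren`, the helper `elementChildren`, and the temporary file are assumptions made for the example; the temporary file merely stands in for sof.xml.

```scala
import java.nio.file.Files
import scala.xml.{Elem, Node, Utility, XML}

object ElementChildren {
  // Hypothetical helper: keep only element children.
  // Whitespace AND any real text between elements are dropped.
  def elementChildren(parent: Node): Seq[Elem] =
    parent.child.collect { case e: Elem => e }.toSeq

  def main(args: Array[String]): Unit = {
    // Temporary stand-in for the sof.xml file from the question.
    val path = Files.createTempFile("sof", ".xml")
    path.toFile.deleteOnExit()
    Files.write(path, "<a>\n<b></b>\n</a>".getBytes("UTF-8"))

    val root = XML.loadFile(path.toFile)

    // Option 1: filter the children down to elements only.
    println(elementChildren(root))

    // Option 2: trim whitespace but keep meaningful text nodes.
    println(Utility.trim(root).child)
  }
}
```

Which option fits depends on the data: filtering by Elem is the simpler choice when text between elements never matters, while Utility.trim keeps such text.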
Upvotes: 1