Reputation: 818
I want to create nested XML from a CSV/DataFrame in Scala Spark. I am using the Databricks spark-xml library to convert the DataFrame to XML.
I am trying to produce output like the one below, but have not been able to achieve it:
<rows>
<row>
<name id="10">Mahashree</name>
</row>
</rows>
I tried using a struct column for "name":
{"_VALUE":"Mahashree","_id":10}
but the result was:
<rows>
<row>
<name id="10" VALUE="Mahashree"></name>
</row>
</rows>
The Databricks documentation covers reading nested XML into a DataFrame, but not writing a DataFrame out as nested XML. For example, according to the documentation, this XML:
<one>
<two myTwoAttrib="BBBBB">two</two>
<three>three</three>
</one>
produces the schema below:
root
|-- two: struct (nullable = true)
| |-- _VALUE: string (nullable = true)
| |-- _myTwoAttrib: string (nullable = true)
|-- three: string (nullable = true)
Can anyone help me write a nested element with attributes?
Thanks in advance.
Upvotes: 0
Views: 1911
Reputation: 7207
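This can be achieved with the two options "attributePrefix" and "valueTag", described here: https://github.com/databricks/spark-xml
The default attribute prefix is "_", and "_VALUE" also starts with it, which is presumably why the value was written as an attribute. For example, it should work if you add an extra underscore to "id" in the struct, so that only that field carries the attribute prefix: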
{"_VALUE":"Mahashree","__id":10}
And save with these options:
.option("attributePrefix", "__")
.option("valueTag", "_VALUE")
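Putting it together, a minimal sketch of the whole write might look like the following. It assumes the spark-xml package is on the classpath and that the flat input has columns "name" and "id". The object name NestedXmlSketch, the helper withNameElement, the sample row, and the output path are made up for illustration. The exact XML written may vary with the spark-xml version, so treat this as a starting point rather than a guaranteed result.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}

// Hypothetical object and helper names, used only for this sketch.
object NestedXmlSketch {
  // Wrap the flat "name" and "id" columns into a single struct column "name".
  // "_VALUE" matches the valueTag option; "__id" carries the "__" attribute prefix.
  def withNameElement(df: DataFrame): DataFrame =
    df.select(
      struct(
        col("name").as("_VALUE"),
        col("id").as("__id")
      ).as("name")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nested-xml-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample row standing in for the CSV input (assumed column names).
    val flat = Seq(("Mahashree", 10)).toDF("name", "id")

    withNameElement(flat).write
      .format("com.databricks.spark.xml")
      .option("rootTag", "rows")
      .option("rowTag", "row")
      .option("attributePrefix", "__")
      .option("valueTag", "_VALUE")
      .mode("overwrite")
      .save("/tmp/nested-xml-sketch") // hypothetical output path

    spark.stop()
  }
}
```

The struct is built in a separate helper so the schema can be checked with printSchema before writing: the "name" column should contain the fields "_VALUE" and "__id".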
Upvotes: 4