I need to define a custom schema for the XML below. Each TABLE element has a different set of columns depending on its attrname attribute, so I want to define a separate custom schema for each attrname value.
<CASE>
  <TABLE attrname="Wood">
    <ROWDATA>
      <ROW Weight="55" Length="11" color="Black"/>
    </ROWDATA>
  </TABLE>
  <TABLE attrname="Metal">
    <ROWDATA>
      <ROW Type="AL" Weight="66" Length="23" Unit="0"/>
      <ROW Type="AL" Weight="44" Length="22" Unit="0"/>
      <ROW Type="AL" Weight="33" Length="21" Unit="1"/>
    </ROWDATA>
  </TABLE>
  <TABLE attrname="Plastic">
    <ROWDATA>
      <ROW color="Blue" Grade="A" Price="112"/>
    </ROWDATA>
  </TABLE>
</CASE>
The code below reads the XML with a single fixed schema, but is there a way to choose the schema after checking the attribute name? For example, if the table's attrname is "Plastic", I want to read it with the following schema.
import org.apache.spark.sql.types.{ArrayType, StringType, StructType}

// Read every TABLE element using the Plastic schema.
val xmlDFF = session.read
  .option("rootTag", "CASE")
  .option("rowTag", "TABLE")
  .schema(getPlasticSchema)
  .xml(filePath)

// Schema for <TABLE attrname="Plastic">; XML attributes are prefixed with "_".
def getPlasticSchema: StructType = {
  val rowType = new StructType()
    .add("_color", StringType)
    .add("_Grade", StringType)
    .add("_Price", StringType)
  val rowDataType = new StructType()
    .add("ROW", ArrayType(rowType))
  val tableTypePlastic = new StructType()
    .add("_attrname", StringType)
    .add("ROWDATA", rowDataType)
  tableTypePlastic
}
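To make the goal concrete, here is a minimal sketch of one way to pair each attrname with its own schema: read the file once per table type with that type's schema, then keep only the rows whose `_attrname` matches. The names `SchemaPerTable`, `tableSchema`, `schemas`, and `readTable` are hypothetical helpers, not part of Spark or spark-xml. It assumes an XML data source is registered under the short name "xml" and that the default "_" attribute prefix is in use. I have not verified that this is the most efficient approach, since it re-reads the file for every table type.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{ArrayType, StringType, StructType}

object SchemaPerTable {
  // Hypothetical helper: builds the TABLE-level schema from the ROW attribute names.
  // Each attribute becomes a string column prefixed with "_" (assumed default prefix).
  def tableSchema(rowAttributes: Seq[String]): StructType = {
    val rowType = rowAttributes.foldLeft(new StructType()) { (schema, name) =>
      schema.add(s"_$name", StringType)
    }
    new StructType()
      .add("_attrname", StringType)
      .add("ROWDATA", new StructType().add("ROW", ArrayType(rowType)))
  }

  // One schema per attrname value, taken from the sample XML in the question.
  val schemas: Map[String, StructType] = Map(
    "Wood"    -> tableSchema(Seq("Weight", "Length", "color")),
    "Metal"   -> tableSchema(Seq("Type", "Weight", "Length", "Unit")),
    "Plastic" -> tableSchema(Seq("color", "Grade", "Price"))
  )

  // Reads all TABLE elements with the schema chosen for attrName,
  // then drops the tables that belong to other attrname values.
  def readTable(session: SparkSession, filePath: String, attrName: String): DataFrame =
    session.read
      .format("xml") // assumption: an XML data source is available under this name
      .option("rootTag", "CASE")
      .option("rowTag", "TABLE")
      .schema(schemas(attrName))
      .load(filePath)
      .filter(col("_attrname") === attrName)
}
```

With this in place, `SchemaPerTable.readTable(session, filePath, "Plastic")` would be expected to return only the Plastic table, with its rows typed by the Plastic schema.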