The javadocs for Spark's StructType#add method show that the second argument needs to be a class that extends DataType.
I have a situation where I need to add a fairly complicated MapType as a field on a StructType.
Specifically, this MapType field is a map of several nested structures:
Map<String,Map<Integer,Map<String,String>>>
Hence it is a map with 2 nested/inner maps. The inner-most map is of type Map<String,String> (so in Spark parlance, MapType[StringType,StringType]).
The middle map is of type Map<Integer,Map<String,String>> (so again in Spark parlance, MapType[IntegerType,MapType[StringType,StringType]]).
How do I specify this complex nested structure of maps when calling the StructType#add method?
That is, I want to be able to do something like this:
var myStruct : StructType = new StructType()
myStruct.add("complex-o-map",
MapType[StringType,MapType[IntegerType,MapType[StringType,StringType]]])
However, it looks like I can only add the single outer-most MapType:
var myStruct : StructType = new StructType()
myStruct.add("complex-o-map", MapType)
This makes me sad. How do I specify my nested map structure during the call to add(...)?
Upvotes: 0
Views: 1170
The "types" expected by MapType (e.g. StringType, MapType) aren't really types in the Scala sense; they are objects, so you should pass them as constructor arguments and not as type parameters. In other words, use () instead of []:
val myStruct = new StructType().add("complex-o-map",
MapType(StringType,MapType(IntegerType,MapType(StringType,StringType))))
myStruct.printTreeString()
// prints:
// root
// |-- complex-o-map: map (nullable = true)
// | |-- key: string
// | |-- value: map (valueContainsNull = true)
// | | |-- key: integer
// | | |-- value: map (valueContainsNull = true)
// | | | |-- key: string
// | | | |-- value: string (valueContainsNull = true)
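For completeness, here is a minimal sketch of the same schema built inside-out, assuming Spark SQL is on the classpath. The names NestedMapSchema, inner, middle, outer and empty are only illustrative and not part of any Spark API. It also shows that add returns a new StructType instead of modifying the one it is called on, so calling myStruct.add(...) without using the result, as in the question, would leave myStruct empty.

```scala
import org.apache.spark.sql.types._

// Illustrative wrapper object; the name is hypothetical.
object NestedMapSchema {
  // Build the nested type inside-out so each level stays readable.
  val inner: MapType  = MapType(StringType, StringType)  // Map<String,String>
  val middle: MapType = MapType(IntegerType, inner)      // Map<Integer,Map<String,String>>
  val outer: MapType  = MapType(StringType, middle)      // the full nested map

  val empty: StructType = new StructType()

  // add returns a new StructType; `empty` itself is left unchanged.
  val myStruct: StructType = empty.add("complex-o-map", outer)

  def main(args: Array[String]): Unit = {
    println(empty.fields.length)     // field count of the original struct
    println(myStruct.fields.length)  // field count of the struct returned by add
    myStruct.printTreeString()
  }
}
```

Because the returned value is what carries the new field, a val holding the result (or a chained call, as above) should be enough; a var is only needed if you reassign it with myStruct = myStruct.add(...).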
Upvotes: 2