Reputation: 23
I am currently using Spark with Scala 2.11.8.
I have the following schema:
root
|-- partnumber: string (nullable = true)
|-- brandlabel: string (nullable = true)
|-- availabledate: string (nullable = true)
|-- descriptions: array (nullable = true)
| |-- element: string (containsNull = true)
I am trying to use a UDF to convert it to the following:
root
|-- partnumber: string (nullable = true)
|-- brandlabel: string (nullable = true)
|-- availabledate: string (nullable = true)
|-- description: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- value: string (nullable = true)
| | |-- code: string (nullable = true)
| | |-- cost: integer (nullable = true)
The source data looks like this:
[WrappedArray(a abc 100,b abc 300)]
[WrappedArray(c abc 400)]
I need to use " " (space) as a delimiter, but I don't know how to do this in Scala.
def convert(product: Seq[String]): Seq[Row] = {
  ???
}
I am fairly new to Scala, so can someone guide me on how to write this kind of function?
Thanks.
Upvotes: 0
Views: 3838
Reputation: 4044
I am not sure I understand your problem correctly, but map could be your friend.
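// Note: this Row is a plain case class, not org.apache.spark.sql.Row.
case class Row(a: String, b: String, c: Int)
val value = List(List("a", "abc", 123), List("b", "bcd", 321))
// Pattern match each inner list into a Row.
value map {
  case List(a: String, b: String, c: Int) => Row(a, b, c)
}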
If you have to parse the strings first:
val value2 = List("a b 123", "c d 345")
// Split each string on a single space and build a Row from the three fields.
value2 map { s =>
  val split = s.split(" ")
  Row(split(0), split(1), split(2).toInt)
}
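The parsing above throws if a string has fewer than three fields or a non-numeric cost. Here is a rough sketch of a more forgiving version, shaped like the convert function from the question. The names Description, ConvertSketch and parse are made up for illustration, and dropping malformed entries is my assumption about what you would want, not something stated in the question.

```scala
import scala.util.Try

// Hypothetical case class mirroring the target struct (value, code, cost).
case class Description(value: String, code: String, cost: Int)

object ConvertSketch {
  // Parses one "value code cost" string; returns None for malformed input.
  def parse(s: String): Option[Description] =
    s.trim.split("\\s+") match {
      case Array(value, code, cost) =>
        Try(cost.toInt).toOption.map(Description(value, code, _))
      case _ => None
    }

  // Converts one array value; malformed entries are dropped (an assumption).
  def convert(descriptions: Seq[String]): Seq[Description] =
    if (descriptions == null) Seq.empty else descriptions.flatMap(parse)
}
```

In Spark this could probably be wrapped with udf from org.apache.spark.sql.functions, for example udf(ConvertSketch.convert _), and applied to the descriptions column. As far as I know, Spark 2.x turns a returned case class into a struct, but I have not tested that against your exact setup.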
Upvotes: 2