Philip K. Adetiloye

Reputation: 3270

Scala - merge a list to map

I need to merge a list of values from an RDD into a map, but I got stuck doing it in Scala:

var accounts = Map("name" -> "", "id" -> 0, ....)

//Split the RDD into lines and split each line by `|` to get the values
stream.foreachRDD {_.map(_._2).flatMap(_.split("\\|")).foreach(f => /*merge here ?*/)}

How do I associate the values with the keys of my accounts map?

For example, assume an RDD loaded from a CSV (I made up this data):

 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 Donald|Trump|US|Election|March|Spring|Rubio|Ted Cruz|Ben Carson|Primary|Winner|...
 ...

The RDD has up to 300 columns/fields.

My main objective is to convert it to JSON, but first I need to associate each value with a key by loading it into a map or a class.

var election = Map("firstname" -> "Donald",
"lastname" -> "Trump",
"country" -> "US",
"event" -> "Election",
"period" -> "March",
"var1" -> "Spring",
 ....
"varN" -> "...")

Upvotes: 2

Views: 206

Answers (2)

Philip K. Adetiloye

Reputation: 3270

A bit of cleanup of @slouc's answer:

stream.foreachRDD {_.map(_._2).map(l => (mapKeys zip l.split("\\|")).toMap).saveToEs(conf)}
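One detail worth spelling out: the pipe is escaped as `"\\|"` because `String.split` takes a regular expression, and an unescaped `|` is the alternation operator. A minimal sketch of the difference, assuming nothing beyond the standard library (the `SplitDemo` object and the sample `row` are made up for illustration):

```scala
object SplitDemo {
  // Made-up sample row, shortened from the data in the question.
  val row = "Donald|Trump|US"

  // Unescaped: "|" is regex alternation between two empty patterns,
  // so it matches between characters and the row is split into single characters.
  val unescaped: Array[String] = row.split("|")

  // Escaped: the pipe is treated as a literal delimiter.
  val escaped: List[String] = row.split("\\|").toList

  // Scala's StringOps also has a Char overload, which is never a regex.
  val byChar: List[String] = row.split('|').toList

  // Pattern.quote escapes an arbitrary delimiter, e.g. one read from config.
  val quoted: List[String] =
    row.split(java.util.regex.Pattern.quote("|")).toList
}
```

The `Char` overload is probably the least error-prone choice when the delimiter is a single fixed character.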

Upvotes: 0

slouc

Reputation: 9698

I'm not sure if I understood correctly, but does this help?

val data = List(
  "Donald|Trump|US|Election|March",
  "John|Smith|UK|Election|February"
)

val mapKeys = List("firstname", "lastname", "country", "event", "period")

val election = data.map { row =>
  (mapKeys zip row.split("\\|").toList).map {
    case (key, value) => key -> value
  }.toMap
}

So you get a list of maps: for each row of your data, you get a map of key/value pairs, as you described.
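Two things may matter with up to 300 columns, so here is a hedged sketch rather than a definitive version: `zip` silently truncates to the shorter of the two collections, and `split` drops trailing empty fields unless it is given a negative limit. The `RowToMap` object and its `toMap` helper are hypothetical names, and returning `None` on a column-count mismatch is just one possible policy.

```scala
object RowToMap {
  // Hypothetical key list, shortened to five columns for illustration.
  val mapKeys = List("firstname", "lastname", "country", "event", "period")

  // Returns None when the number of fields does not match the number of keys,
  // instead of letting zip quietly drop the extra keys or values.
  def toMap(row: String): Option[Map[String, String]] = {
    // The -1 limit keeps trailing empty fields such as "a|b|" -> "a", "b", "".
    val values = row.split("\\|", -1).toList
    if (values.length == mapKeys.length) Some(mapKeys.zip(values).toMap)
    else None
  }
}
```

With this, something like `data.flatMap(row => RowToMap.toMap(row))` would keep only the well-formed rows.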

Upvotes: 1
