Frank

Reputation: 1037

How to create a Map from a DataFrame

I have a DataFrame whose df.show() output looks like this:

+-----------+-------------------+
|         id|               name|
+-----------+-------------------+
|       1231|                 aa|
|       1232|                 bb|
|       1233|                 cc|
|       1234|                 dd|
|       1235|                 dd|
|       1236|                 cc|
+-----------+-------------------+

The "id" column is unique. I would like to create a map whose key is "id" and whose value is "name". How can I do this in Scala? Assume the DataFrame is named df.

val mapResult = df.map(...)

Upvotes: 1

Views: 6217

Answers (1)

koiralo

Reputation: 23119

You can simply convert the DataFrame to an RDD and use collectAsMap:

df.rdd.map(x => (x.getInt(0), x.getString(1))).collectAsMap()

This will give you

scala>  df.rdd.map(x => (x.getInt(0), x.getString(1))).collectAsMap()
res0: scala.collection.Map[Int,String] = Map(1231 -> aa, 1234 -> dd, 1236 -> cc, 1233 -> cc, 1232 -> bb, 1235 -> dd)

collectAsMap is only recommended when the data fits in the driver's memory, because it brings every row back to the driver.
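For comparison, here is a minimal sketch of the same idea without the explicit RDD step, reading the columns by name instead of by position. It assumes "id" is an integer column and "name" is a string column, as in the question. The object name DataFrameToMap, the helper toIdNameMap, and the local SparkSession are made up for illustration; in spark-shell you would already have spark and df.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameToMap {

  // Hypothetical helper: collects all rows to the driver and builds an
  // immutable Map. Assumes "id" is IntegerType and "name" is StringType.
  def toIdNameMap(df: DataFrame): Map[Int, String] =
    df.select("id", "name")
      .collect()
      .map(row => row.getAs[Int]("id") -> row.getAs[String]("name"))
      .toMap

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("DataFrameToMap")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the table in the question.
    val df = Seq(
      (1231, "aa"), (1232, "bb"), (1233, "cc"),
      (1234, "dd"), (1235, "dd"), (1236, "cc")
    ).toDF("id", "name")

    val mapResult = toIdNameMap(df)
    println(mapResult(1231))

    spark.stop()
  }
}
```

Like collectAsMap, this pulls the whole result onto the driver, so the same memory caveat applies. If "id" were stored as a long, the getAs[Int] call would need to be getAs[Long] instead.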

Hope this helps!

Upvotes: 6
