ScalaBoy

Reputation: 3392

How to convert rows of DataFrame to a List/Map

I have the following DataFrame df:

id  | type  | count
-------------------
1   | A     | 2 
2   | B     | 4 

I want to pass each row of this df as the input of the function saveObj.

df.foreach( row => {
  val list = List("id" -> row.get(0),"type" -> row.get(1))
  saveObj(list)
})

Inside saveObj I want to access list values as follows: list("id"), list("type").

How can I avoid using column indices such as row.get(0) or row.get(1)?

Upvotes: 0

Views: 2037

Answers (1)

Tzach Zohar

Reputation: 37852

You can use getAs, which takes a column name. Start with a list of the column names you're interested in, then map them to the desired list of tuples:

// can also use df.columns.toList to get ALL columns
val columns = List("id", "type") 

df.foreach(row => {
  saveObj(columns.map(name => name -> row.getAs[Any](name)))
})

Alternatively, you can pattern match on Row (through its extractor), but this still requires knowing the order of the columns in the Row and repeating the column names:

import org.apache.spark.sql.Row

// matches only rows with exactly three columns (id, type, count)
df.foreach(_ match {
  case Row(id: Any, typ: Any, _) => saveObj(List("id" -> id, "type" -> typ))
})
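
One more option, offered as a sketch rather than a tested drop-in: since the question wants lookups like list("id"), a Map is a closer fit than a List of tuples, and Row.getValuesMap builds one from a list of column names. In the code below, saveObj is a hypothetical stand-in that accepts a Map[String, Any]; the rowToMap helper, the object name, and the local SparkSession setup are likewise assumptions added for illustration.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object RowToMapSketch {
  // Hypothetical stand-in for the asker's saveObj; it only prints two fields.
  def saveObj(obj: Map[String, Any]): Unit =
    println(s"id=${obj("id")}, type=${obj("type")}")

  // Build a column-name -> value Map for the selected columns of one Row.
  // getValuesMap needs a Row that carries a schema, which rows of a DataFrame do.
  def rowToMap(row: Row, columns: Seq[String]): Map[String, Any] =
    row.getValuesMap[Any](columns)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-to-map-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same data as the DataFrame in the question.
    val df = Seq((1, "A", 2), (2, "B", 4)).toDF("id", "type", "count")
    val columns = Seq("id", "type") // or df.columns.toSeq for all columns

    // collect() brings the rows to the driver so the printed output is visible
    // locally; with df.foreach, saveObj would run on the executors instead.
    df.collect().foreach(row => saveObj(rowToMap(row, columns)))

    spark.stop()
  }
}
```

With a Map, saveObj can read obj("id") and obj("type") directly, and no column index or column order appears anywhere in the calling code.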

Upvotes: 3
