Reputation: 3392
I have the following DataFrame df:
id | type | count
-------------------
1 | A | 2
2 | B | 4
I want to pass each row of this df as the input of the function saveObj.
df.foreach( row => {
  val list = List("id" -> row.get(0), "type" -> row.get(1))
  saveObj(list)
})
Inside saveObj I want to access list values as follows: list("id"), list("type").
How can I avoid using column indices such as row.get(0) or row.get(1)?
Upvotes: 0
Views: 2037
Reputation: 37852
You can use getAs, which takes a column name. Start with a list of the column names you're interested in, then map each name to a (name, value) tuple:
// can also use df.columns.toList to get ALL columns
val columns = List("id", "type")
df.foreach(row => {
  saveObj(columns.map(name => name -> row.getAs[Any](name)))
})
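One caveat about the lookup syntax in the question: a List of tuples cannot be accessed as list("id"), because List.apply expects an Int index. Converting the pairs to a Map gives lookup by name. Below is a minimal sketch in plain Scala, assuming saveObj can be changed to accept a Map[String, Any]; this saveObj is a hypothetical stand-in, and the hard-coded pairs stand in for the values read from the first row.

```scala
object SaveObjSketch {
  // Hypothetical stand-in for saveObj: takes a Map so values can be read by column name.
  def saveObj(obj: Map[String, Any]): String =
    s"id=${obj("id")}, type=${obj("type")}"

  def main(args: Array[String]): Unit = {
    // Same shape as columns.map(name => name -> row.getAs[Any](name)) for the first row.
    val pairs: List[(String, Any)] = List("id" -> 1, "type" -> "A")

    // pairs("id") would not compile; pairs.toMap("id") looks the value up by key.
    println(saveObj(pairs.toMap)) // prints "id=1, type=A"
  }
}
```

Inside the foreach, this would amount to calling .toMap on the mapped list before passing it to saveObj. Row.getValuesMap[Any](columns) should build the same map directly from the column names.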
Alternatively, you can use pattern matching on Row (through its extractor, Row.unapplySeq), but this still requires knowing the order of the columns in the Row and repeating the column names:
import org.apache.spark.sql.Row

df.foreach(_ match {
  // positions must follow the column order: id, type, count
  case Row(id: Any, typ: Any, _) => saveObj(List("id" -> id, "type" -> typ))
})
Upvotes: 3