Markus

Reputation: 3782

How to create a DataFrame from a List?

I want to create a DataFrame df that looks as simple as this:

+----------+----------+
| timestamp|      col2|
+----------+----------+
|2018-01-11|       123|
+----------+----------+

This is what I do:

val values = List(List("timestamp", "2018-01-11"),List("col2","123")).map(x =>(x(0), x(1)))    
val df = values.toDF    
df.show()

And this is what I get:

+---------+----------+
|       _1|        _2|
+---------+----------+
|timestamp|2018-01-11|
|     col2|       123|
+---------+----------+

What's wrong here?

Upvotes: 0

Views: 7048

Answers (3)

Andrey Tyukin

Reputation: 44908

EDIT (sorry, I missed that you had the headers glued to each column). Maybe something like this could work:

val values = List(
  List("timestamp", "2018-01-11"),
  List("col2","123")
)

val heads = values.map(_.head) // extracts headers of columns
val cols = values.map(_.tail) // extracts columns without headers
val rows = cols(0).zip(cols(1)) // zips two columns into list of rows
rows.toDF(heads: _*)

This also works if values contains two longer lists (that is, more rows), but it does not generalize to more than two columns, because only cols(0) and cols(1) are zipped.
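One possible way to lift the two-column limit, as a minimal sketch: transpose the header-less columns into rows and build the schema from the extracted headers. It assumes every inner list has the same length (transpose throws otherwise) and that all values can stay strings. The local SparkSession is only there to make the snippet self-contained; in spark-shell, spark already exists.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Local session for illustration only; skip this line in spark-shell.
val spark = SparkSession.builder().master("local[*]").appName("list-to-df").getOrCreate()

val values = List(
  List("timestamp", "2018-01-11"),
  List("col2", "123")
)

val heads = values.map(_.head)   // column names
val cols  = values.map(_.tail)   // column data without headers

// Turn the list of columns into a list of rows, for any number of columns.
val rows = cols.transpose.map(r => Row.fromSeq(r))

// Assumption: every column is kept as a nullable string.
val schema = StructType(heads.map(name => StructField(name, StringType, nullable = true)))

val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
df.show()
```

With the input above, df should have the columns timestamp and col2 and a single row. Adding a third inner list would add a third column without further changes.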

Upvotes: 1

Denis Iakunchikov

Reputation: 1349

If you don't know the names of the columns statically, you can use the following syntactic sugar:

.toDF(columnNames: _*)

where columnNames is a List (or any other Seq) of the column names.
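As a small sketch of how this might look end to end: the names live in an ordinary list and are expanded into the varargs parameter of toDF. The list contents and the session setup are assumptions made for illustration; in spark-shell, spark and its implicits are already available.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; skip this line in spark-shell.
val spark = SparkSession.builder().master("local[*]").appName("dynamic-names").getOrCreate()
import spark.implicits._ // brings toDF into scope for local collections

// Hypothetical: names that are only known at runtime.
val columnNames = List("timestamp", "col2")

// One tuple per row; ": _*" passes the list as individual arguments.
val df = List(("2018-01-11", "123")).toDF(columnNames: _*)
df.printSchema()
```

The number of names has to match the number of fields in each tuple; otherwise toDF fails at runtime.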

Upvotes: 1

Tzach Zohar

Reputation: 37822

Use

val df = List(("2018-01-11", "123")).toDF("timestamp", "col2")

  • toDF expects the input list to contain one entry per resulting Row.
  • Each such entry should be a case class or a tuple.
  • It does not expect column "headers" in the data itself; to name the columns, pass the names as arguments to toDF.
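The case class variant mentioned in the list above could look roughly like this. Event is a hypothetical name chosen for the example; its field names become the column names, so toDF needs no arguments. The session setup is again only for self-containment.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type; field names turn into column names.
case class Event(timestamp: String, col2: String)

// Local session for illustration only; skip this line in spark-shell.
val spark = SparkSession.builder().master("local[*]").appName("case-class-df").getOrCreate()
import spark.implicits._ // brings toDF into scope for local collections

// One case class instance per resulting Row.
val df = List(Event("2018-01-11", "123")).toDF()
df.show()
```

In a compiled application (as opposed to spark-shell), the case class generally has to be defined at the top level, outside the method that calls toDF, so that Spark can derive an encoder for it.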

Upvotes: 5
