Morpheu5

Reputation: 2801

How do I convert an Array of Arrays to a DataFrame, in Julia?

I have a map call that returns a row of computed values for each input, so I end up with an Array of rows, each of which is an Array of Any, like this:

12-element Array{Array{Any,1},1}:
 Any[2015-09-01T00:00:00, 2016-09-01T00:00:00, 98, 53.1] 
 Any[2015-10-01T00:00:00, 2016-10-01T00:00:00, 92, 58.7] 
 Any[2015-11-01T00:00:00, 2016-11-01T00:00:00, 130, 64.6]
 Any[2015-12-01T00:00:00, 2016-12-01T00:00:00, 135, 67.4]
 Any[2016-01-01T00:00:00, 2017-01-01T00:00:00, 206, 59.2]
 Any[2016-02-01T00:00:00, 2017-02-01T00:00:00, 246, 54.1]
 Any[2016-03-01T00:00:00, 2017-03-01T00:00:00, 254, 53.9]
 Any[2016-04-01T00:00:00, 2017-04-01T00:00:00, 268, 65.7]
 Any[2016-05-01T00:00:00, 2017-05-01T00:00:00, 265, 61.5]
 Any[2016-06-01T00:00:00, 2017-06-01T00:00:00, 303, 52.8]
 Any[2016-07-01T00:00:00, 2017-07-01T00:00:00, 301, 59.1]
 Any[2016-08-01T00:00:00, 2017-08-01T00:00:00, 273, 54.6]

Is there an easy way of turning this into a DataFrame, with column names and so on? If there isn't an easy way, I'm open to harder ways :) I can think of having to re-run map four times to extract the columns and build the DataFrame from those, but that sounds like a lot of code for such a seemingly mundane operation…

EDIT: I can "transpose" rows to columns like this:

map(x -> map(y -> y[x], r), collect(1:4))

where r is the table above, so I suppose a solution would be to provide column names to the DataFrame constructor. My temporary solution is therefore

DataFrame(map(x -> map(y -> y[x], r), collect(1:4)), [:a, :b, :c, :d])

Upvotes: 3

Views: 633

Answers (1)

daycaster

Reputation: 2707

julia> df
12-element Array{Array{Any,1},1}:
 Any["2015-09-01T00:00:00", "2016-09-01T00:00:00", 98, 53.1] 
 Any["2015-10-01T00:00:00", "2016-10-01T00:00:00", 92, 58.7] 
 Any["2015-11-01T00:00:00", "2016-11-01T00:00:00", 130, 64.6]
 Any["2015-12-01T00:00:00", "2016-12-01T00:00:00", 135, 67.4]
 Any["2016-01-01T00:00:00", "2017-01-01T00:00:00", 206, 59.2]
 Any["2016-02-01T00:00:00", "2017-02-01T00:00:00", 246, 54.1]
 Any["2016-03-01T00:00:00", "2017-03-01T00:00:00", 254, 53.9]
 Any["2016-04-01T00:00:00", "2017-04-01T00:00:00", 268, 65.7]
 Any["2016-05-01T00:00:00", "2017-05-01T00:00:00", 265, 61.5]
 Any["2016-06-01T00:00:00", "2017-06-01T00:00:00", 303, 52.8]
 Any["2016-07-01T00:00:00", "2017-07-01T00:00:00", 301, 59.1]
 Any["2016-08-01T00:00:00", "2017-08-01T00:00:00", 273, 54.6]

julia> DataFrame(permutedims(Array(DataFrame(map(data,df))), [2, 1]))
12×4 DataFrames.DataFrame
│ Row │ x1                    │ x2                    │ x3  │ x4   │
├─────┼───────────────────────┼───────────────────────┼─────┼──────┤
│ 1   │ "2015-09-01T00:00:00" │ "2016-09-01T00:00:00" │ 98  │ 53.1 │
│ 2   │ "2015-10-01T00:00:00" │ "2016-10-01T00:00:00" │ 92  │ 58.7 │
│ 3   │ "2015-11-01T00:00:00" │ "2016-11-01T00:00:00" │ 130 │ 64.6 │
│ 4   │ "2015-12-01T00:00:00" │ "2016-12-01T00:00:00" │ 135 │ 67.4 │
│ 5   │ "2016-01-01T00:00:00" │ "2017-01-01T00:00:00" │ 206 │ 59.2 │
│ 6   │ "2016-02-01T00:00:00" │ "2017-02-01T00:00:00" │ 246 │ 54.1 │
│ 7   │ "2016-03-01T00:00:00" │ "2017-03-01T00:00:00" │ 254 │ 53.9 │
│ 8   │ "2016-04-01T00:00:00" │ "2017-04-01T00:00:00" │ 268 │ 65.7 │
│ 9   │ "2016-05-01T00:00:00" │ "2017-05-01T00:00:00" │ 265 │ 61.5 │
│ 10  │ "2016-06-01T00:00:00" │ "2017-06-01T00:00:00" │ 303 │ 52.8 │
│ 11  │ "2016-07-01T00:00:00" │ "2017-07-01T00:00:00" │ 301 │ 59.1 │
│ 12  │ "2016-08-01T00:00:00" │ "2017-08-01T00:00:00" │ 273 │ 54.6 │

I think your solution is much better!
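
For reference, the row-to-column transposition from the question can also be written with broadcasting. This is only a sketch, assuming a recent DataFrames.jl (1.x) where DataFrame accepts a vector of columns plus a vector of Symbol names; the three sample rows and the column names (from, to, count, score) are made up for illustration and are not from the original data.

```julia
using Dates
using DataFrames  # assumption: DataFrames.jl 1.x is installed

# Three sample rows shaped like the ones in the question
r = [
    Any[DateTime(2015, 9, 1), DateTime(2016, 9, 1), 98, 53.1],
    Any[DateTime(2015, 10, 1), DateTime(2016, 10, 1), 92, 58.7],
    Any[DateTime(2015, 11, 1), DateTime(2016, 11, 1), 130, 64.6],
]

# Hypothetical column names, chosen only for illustration
colnames = [:from, :to, :count, :score]

# Take the i-th entry of every row to build column i;
# broadcasting picks each column's element type from the values it finds
cols = [getindex.(r, i) for i in eachindex(colnames)]

# Build the DataFrame from the columns and their names
df = DataFrame(cols, colnames)
```

Compared with the nested map, this should leave the columns with element types narrower than Any, which tends to matter for later numeric work on the DataFrame.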

Upvotes: 2
