A.Yazdiha

Reputation: 1378

Julia dataframes with specific column types

I want to create a DataFrame of size N×K in which some columns are Float64 and others are Int64. Is there a way to specify the column types when constructing the DataFrame?

This is my current approach:

df = convert(DataFrame, zeros(Float64, (N, K)))
df[:,K-2] = convert(Array{Int64,1}, df[:,K-2])
df[:,K-1] = convert(Array{Int64,1}, df[:,K-1])

Upvotes: 2

Views: 1526

Answers (1)

Fengyang Wang

Reputation: 12051

You could horizontally concatenate two DataFrames, one for each column type:

julia> hcat(DataFrame(Float64, 3, 5), DataFrame(Int64, 3, 3))
3×8 DataFrames.DataFrame
│ Row │ x1 │ x2 │ x3 │ x4 │ x5 │ x1_1 │ x2_1 │ x3_1 │
├─────┼────┼────┼────┼────┼────┼──────┼──────┼──────┤
│ 1   │ NA │ NA │ NA │ NA │ NA │ NA   │ NA   │ NA   │
│ 2   │ NA │ NA │ NA │ NA │ NA │ NA   │ NA   │ NA   │
│ 3   │ NA │ NA │ NA │ NA │ NA │ NA   │ NA   │ NA   │
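
The `DataFrame(Float64, 3, 5)` form above comes from an older DataFrames.jl release. As a rough sketch, assuming DataFrames.jl 1.x, the same horizontal concatenation might be written with zero-filled matrices and the `:auto` column-naming option. The variable names `floats` and `ints` are illustrative only, and the columns hold zeros rather than NA values.

```julia
using DataFrames

# Assumption: DataFrames.jl 1.x, where a matrix plus :auto yields columns x1, x2, ...
floats = DataFrame(zeros(Float64, 3, 5), :auto)  # five Float64 columns
ints   = DataFrame(zeros(Int64, 3, 3), :auto)    # three Int64 columns

# Both frames have columns named x1..x3, so ask hcat to rename the duplicates
# instead of raising an error.
df = hcat(floats, ints; makeunique=true)

# Collect the element type of every column to confirm the layout.
coltypes = [eltype(col) for col in eachcol(df)]
```

Here `coltypes` should list five `Float64` entries followed by three `Int64` entries.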

The DataFrame constructor also takes a vector of types as an argument:

julia> DataFrame([Float64, Float64, Int64, Int64], [Symbol("x$i") for i in 1:4], 3)
3×4 DataFrames.DataFrame
│ Row │ x1 │ x2 │ x3 │ x4 │
├─────┼────┼────┼────┼────┤
│ 1   │ NA │ NA │ NA │ NA │
│ 2   │ NA │ NA │ NA │ NA │
│ 3   │ NA │ NA │ NA │ NA │
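
The types-vector constructor shown above may not be available in newer DataFrames.jl releases. A minimal sketch of the same idea, assuming DataFrames.jl 1.x, is to start from an empty DataFrame and add one zero-filled column per requested type. The names `coltypes` and `nrows` are hypothetical, and zeros stand in for the NA values in the output above.

```julia
using DataFrames

# Illustrative inputs (not from the original answer).
coltypes = [Float64, Float64, Int64, Int64]
nrows = 3

df = DataFrame()
for (i, T) in enumerate(coltypes)
    # df[!, name] = vector adds a new column with that vector's element type.
    df[!, Symbol("x$i")] = zeros(T, nrows)
end

# Element type of each resulting column, in order.
result_types = [eltype(col) for col in eachcol(df)]
```

Because each column is created directly with its target type, no conversion step is needed afterwards.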

You can construct the appropriate vector of types using concatenation:

julia> [repeat([Float64]; outer=4); repeat([Int64]; outer=2)]
6-element Array{DataType,1}:
 Float64
 Float64
 Float64
 Float64
 Int64  
 Int64  
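
As a small alternative sketch using only Base Julia, `fill` and `vcat` should produce the same vector of types. The counts `nfloat` and `nint` are illustrative names, not part of the original answer.

```julia
# Hypothetical column counts for each type.
nfloat = 4
nint = 2

# fill(T, n) makes a vector holding n copies of the type T;
# vcat joins the two vectors end to end.
coltypes = vcat(fill(Float64, nfloat), fill(Int64, nint))
```

This form may read more clearly when the counts are computed from variables such as N and K.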

Upvotes: 4
