Kirill Rostislav

Reputation: 31

How to nest / unnest data frames in Julia?

Does Julia have any analogues of the nest and unnest functions from the tidyr R package? In particular, is there an efficient way to perform nesting / unnesting operations using DataFrames.jl?

Upvotes: 3

Views: 525

Answers (1)

Przemyslaw Szufel

Reputation: 42214

Suppose you have the following DataFrame:

julia> using DataFrames

julia> d = DataFrame(g=[1,1,1,2,2,3,3], val1=1:7, val2='a':'g')
7×3 DataFrame
│ Row │ g     │ val1  │ val2 │
│     │ Int64 │ Int64 │ Char │
├─────┼───────┼───────┼──────┤
│ 1   │ 1     │ 1     │ 'a'  │
│ 2   │ 1     │ 2     │ 'b'  │
│ 3   │ 1     │ 3     │ 'c'  │
│ 4   │ 2     │ 4     │ 'd'  │
│ 5   │ 2     │ 5     │ 'e'  │
│ 6   │ 3     │ 6     │ 'f'  │
│ 7   │ 3     │ 7     │ 'g'  │

and assume that you want to sample one element from each group defined by the g column. This can be achieved by:

julia> DataFrame([rand(eachrow(gr)) for gr in groupby(d,:g)])
3×3 DataFrame
│ Row │ g     │ val1  │ val2 │
│     │ Int64 │ Int64 │ Char │
├─────┼───────┼───────┼──────┤
│ 1   │ 1     │ 2     │ 'b'  │
│ 2   │ 2     │ 4     │ 'd'  │
│ 3   │ 3     │ 6     │ 'f'  │
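
As a variation on the comprehension above, the same one-row-per-group sample can be sketched with combine and a do block, which returns a single DataFrame directly. This assumes a recent DataFrames.jl (1.x); sample_one is a hypothetical helper name, and the RNG is seeded only to make the run repeatable.

```julia
using DataFrames, Random

d = DataFrame(g=[1,1,1,2,2,3,3], val1=1:7, val2='a':'g')

# Hypothetical helper: pick one random row from each group of `df` keyed by `col`.
function sample_one(df::AbstractDataFrame, col::Symbol)
    combine(groupby(df, col)) do sdf
        # Indexing with a single row number yields a DataFrameRow,
        # so each group contributes exactly one row to the result.
        sdf[rand(1:nrow(sdf)), :]
    end
end

Random.seed!(1)   # assumption: seeded only for reproducibility
s = sample_one(d, :g)
```

The result has one row per distinct value of g, in the order the groups first appear in d.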

Hope this is what you need.

EDIT

If you want a different number of elements from each group (sampled with replacement, so rows may repeat), you could do something like this:

julia> g_to_rows=Dict(1=>4,2=>3,3=>7);   # desired element counts

julia> [ gr[rand(1:nrow(gr),g_to_rows[gr.g[1]]), :] for gr in groupby(d,:g)]
3-element Array{DataFrame,1}:
 4×3 DataFrame
│ Row │ g     │ val1  │ val2 │
│     │ Int64 │ Int64 │ Char │
├─────┼───────┼───────┼──────┤
│ 1   │ 1     │ 1     │ 'a'  │
│ 2   │ 1     │ 1     │ 'a'  │
│ 3   │ 1     │ 3     │ 'c'  │
│ 4   │ 1     │ 2     │ 'b'  │
 3×3 DataFrame
│ Row │ g     │ val1  │ val2 │
│     │ Int64 │ Int64 │ Char │
├─────┼───────┼───────┼──────┤
│ 1   │ 2     │ 5     │ 'e'  │
│ 2   │ 2     │ 5     │ 'e'  │
│ 3   │ 2     │ 5     │ 'e'  │
 7×3 DataFrame
│ Row │ g     │ val1  │ val2 │
│     │ Int64 │ Int64 │ Char │
├─────┼───────┼───────┼──────┤
│ 1   │ 3     │ 7     │ 'g'  │
│ 2   │ 3     │ 6     │ 'f'  │
│ 3   │ 3     │ 6     │ 'f'  │
│ 4   │ 3     │ 7     │ 'g'  │
│ 5   │ 3     │ 7     │ 'g'  │
│ 6   │ 3     │ 6     │ 'f'  │
│ 7   │ 3     │ 6     │ 'f'  │
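
The comprehension above returns a vector of data frames, one per group. If a single DataFrame is preferred, one possible sketch is to stack the pieces with reduce(vcat, ...). This assumes a recent DataFrames.jl (1.x); the names parts and sampled are illustrative only, and the RNG is seeded just to make the run repeatable.

```julia
using DataFrames, Random

d = DataFrame(g=[1,1,1,2,2,3,3], val1=1:7, val2='a':'g')
g_to_rows = Dict(1 => 4, 2 => 3, 3 => 7)   # desired element counts per group

Random.seed!(1)   # assumption: seeded only for reproducibility

# For each group, draw row indices with replacement and materialize those rows.
parts = [gr[rand(1:nrow(gr), g_to_rows[gr.g[1]]), :] for gr in groupby(d, :g)]

# Stack the per-group samples into one DataFrame.
sampled = reduce(vcat, parts)
```

Because the indices are drawn with rand, a group can be sampled more times than it has rows, as with g = 3 here (7 draws from 2 rows).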

Upvotes: 2
