Jon Barker

Reputation: 1829

Combine Julia DataFrames by reference instead of making a copy

In Julia you can combine dataframes:

d1 = DataFrame(A=1:10)
d2 = DataFrame(A=11:20)

d3 = [d1; d2]

However, this appears to copy d1 and d2 into d3, which I don't want. If I modify d1, the change is not reflected in d3.

Does anyone know how to combine them by reference instead of by value, so that a change to d1 is reflected in d3?

Thanks!

Upvotes: 2

Views: 250

Answers (1)

Dan Getz

Reputation: 18227

In Array terminology, what you want is for d1 and d2 to be views into the data in d3. This is also possible with DataFrames:

julia> using DataFrames

julia> d3 = DataFrame(A=1:20);

julia> d1 = view(d3, 1:10, :);

julia> d2 = view(d3, 11:20, :);

julia> d1[1:3,:]
3×1 DataFrames.DataFrame
│ Row │ A │
├─────┼───┤
│ 1   │ 1 │
│ 2   │ 2 │
│ 3   │ 3 │

julia> d3[1:3,:]
3×1 DataFrames.DataFrame
│ Row │ A │
├─────┼───┤
│ 1   │ 1 │
│ 2   │ 2 │
│ 3   │ 3 │

julia> d1[1,:A] = 999
999

julia> d3[1:3,:]
3×1 DataFrames.DataFrame
│ Row │ A   │
├─────┼─────┤
│ 1   │ 999 │
│ 2   │ 2   │
│ 3   │ 3   │

Of course, you may want to create d1 and d2 first and then combine them into d3, but that requires a copy, because the columns of d3 must be contiguous in memory. After the copy, you can create views into d3 and assign them to d1 and d2. It is better to use different variable names for the views, because rebinding d1 and d2 from a DataFrame to a view type can cause type instability, which hurts performance in Julia.
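A minimal sketch of that second workflow, assuming a recent DataFrames.jl release in which view takes both a row and a column selector. The names v1 and v2 are only illustrative; they are not from the question.

```julia
using DataFrames

d1 = DataFrame(A = 1:10)
d2 = DataFrame(A = 11:20)

# vcat copies the rows of d1 and d2 into new, contiguous columns of d3
d3 = vcat(d1, d2)

# Views into d3, bound to new names (v1, v2 are hypothetical) so that
# d1 and d2 keep their original DataFrame type
v1 = view(d3, 1:nrow(d1), :)
v2 = view(d3, nrow(d1)+1:nrow(d3), :)

# Writing through a view modifies d3; the original d1 is a separate copy
v1[1, :A] = 999
```

After this, d3[1, :A] is 999 while d1[1, :A] is still 1, so from that point on any code that needs the shared data should go through v1 and v2 rather than d1 and d2.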

Upvotes: 3
