Steve S

Reputation: 3

Apparent issues with DataFrame string values

I am not sure if this is an actual problem or if I am just not doing something the correct way, but at the moment it appears a little bizarre to me.

When using DataFrames, I came across an issue where, if you copy a DataFrame to another variable, any change made through either variable shows up in both. This goes for individual columns too. For example:

julia> x = DataFrame(A = ["pink", "blue", "green"], B = ["yellow", "red", "purple"]);
julia> y = x;
julia> x[x.A .== "blue", :A] = "red";
julia> x
3×2 DataFrame
│ Row │ A     │ B      │
├─────┼───────┼────────┤
│ 1   │ pink  │ yellow │
│ 2   │ red   │ red    │
│ 3   │ green │ purple │

julia> y
3×2 DataFrame
│ Row │ A     │ B      │
├─────┼───────┼────────┤
│ 1   │ pink  │ yellow │
│ 2   │ red   │ red    │
│ 3   │ green │ purple │

A similar thing happens with columns. If I set up a DataFrame like the one above but use B = A before putting both into the data frame, then changing the values in one column automatically changes the other as well.

This seems odd to me. Maybe it is a feature of other programming languages, but I have done the same thing in R many times when making a backup of a data table or swapping data between columns, and I have never seen this issue. So the question is: is this working as designed, and is there a correct way to copy values between data frames?

I am using Julia version 0.7.0 because I originally installed 1.0.0 through the Manjaro repository and had issues with is_windows() when trying to build Tk.

Upvotes: 0

Views: 51

Answers (1)

Andy

Reputation: 450

The command y = x does not create a new object; it just creates a new reference (or name) for the same DataFrame.
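The same rule applies to a plain vector, which is what happens in the B = A column case from the question. A minimal sketch using only Base Julia (the names A and B are just the ones from the question, and no DataFrame is involved here):

```julia
A = ["pink", "blue", "green"]
B = A              # B is a second name for the same vector, not a copy

println(B === A)   # true: both names refer to the identical object

A[2] = "red"       # mutate the vector through one name
println(B[2])      # red: the change is visible through the other name
```

The === operator checks object identity, so it is a quick way to see whether two names share one object.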

You can create a copy by calling y = copy(x). In your case this is still not enough, because it copies only the DataFrame itself and not the columns it holds, so x and y would still share the same column data.

If you want a completely independent new object, you can use y = deepcopy(x). In this case, y will have no references to x.
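The difference between the three options can be sketched with a nested vector in Base Julia. This is only a stand-in: the vector of column vectors below is an assumption made for illustration, not how DataFrames stores its data, and the variable names are hypothetical.

```julia
# Stand-in for a table: an outer vector holding two "column" vectors.
cols = [["pink", "blue", "green"], ["yellow", "red", "purple"]]

alias   = cols            # same object under a new name
shallow = copy(cols)      # new outer vector, but the inner vectors are shared
deep    = deepcopy(cols)  # outer and inner vectors are all new

cols[1][2] = "red"        # mutate the first "column" in place

println(alias[1][2])      # red
println(shallow[1][2])    # red  (inner vector is still shared)
println(deep[1][2])       # blue (fully independent)

println(shallow === cols)        # false: the outer container was copied
println(shallow[1] === cols[1])  # true:  the column itself was not
println(deep[1] === cols[1])     # false
```

Only the deepcopy result is unaffected by the in-place change, which matches the behaviour described above.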

See this thread for a more detailed discussion:

https://discourse.julialang.org/t/what-is-the-difference-between-copy-and-deepcopy/3918/2

Upvotes: 1
