Reputation: 6423
I have a DataFrame in Julia with hundreds of columns, and I would like to insert a column after the first one.
For example in this DataFrame:
df = DataFrame(
colour = ["green","blue"],
shape = ["circle", "triangle"],
border = ["dotted", "line"]
)
I would like to insert a column area after colour, but without referring specifically to shape and border (which in my real case are hundreds of different columns).
df[!, :area] = [1,2] # DataFrames.jl 1.x syntax; older versions accepted df[:area] = [1,2]
In this example I can use the following (but it refers specifically to shape and border):
df = df[:, [:colour, :area, :shape, :border]] # with specific reference to shape and border names; older versions accepted df[[...]] without the row selector
Upvotes: 9
Views: 17299
Reputation: 566
rows = nrow(df) # number of rows; size(df) returns the tuple (rows, columns)
insertcols!(df, # DataFrame to be changed
1, # insert as column 1
:Day => 1:rows, # populate "Day" with 1, 2, 3, ...
makeunique=true) # if a column with that name already exists, name the new one Day_1
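Applied to the DataFrame from the question, the same call can target position 2 instead of 1. This is only a sketch, assuming a recent DataFrames.jl (1.x); the column name and values are taken from the question:

```julia
using DataFrames

df = DataFrame(
    colour = ["green", "blue"],
    shape  = ["circle", "triangle"],
    border = ["dotted", "line"],
)

# Insert :area as column 2, i.e. right after the first column.
# No other column name is mentioned, so it does not matter how many follow.
insertcols!(df, 2, :area => [1, 2])

names(df)
```

The integer is the position the new column will occupy, so 2 puts it immediately after colour.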
Upvotes: 6
Reputation: 2862
Update: This function has changed. See @DiegoJavierZea's comment.
Congratulations on finding a workaround yourself, but there is a built-in function that is semantically clearer and possibly a little faster:
using DataFrames
df = DataFrame(
colour = ["green","blue"],
shape = ["circle", "triangle"],
border = ["dotted", "line"]
)
insert!(df, 3, [1,2], :area)
Where 3 is the expected index of the new column after the insertion, [1,2] is its content, and :area is its name. You can find more detailed documentation by typing ?insert! in the REPL after loading the DataFrames package.
It is worth noting that the ! is part of the function name. It is a Julia convention indicating that the function mutates its argument.
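As the update note says, the function has changed. To my knowledge, in current DataFrames.jl releases insertcols! takes over this role and also follows the ! convention. A minimal sketch, assuming DataFrames.jl 1.x and its after keyword, which places the new column relative to a column addressed by name:

```julia
using DataFrames

df = DataFrame(
    colour = ["green", "blue"],
    shape  = ["circle", "triangle"],
    border = ["dotted", "line"],
)

# Place :area directly after :colour, addressing the anchor column by name.
# The call returns df itself; no reassignment is needed because df is mutated.
result = insertcols!(df, :colour, :area => [1, 2], after=true)

result === df   # same object, modified in place
names(df)
```

Addressing the anchor column by name avoids hard-coding a position, which may be convenient when the column order is not known in advance.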
Upvotes: 18
Reputation: 6423
While writing the question I also found a solution (as often happens). I am still posting the question here as a record for myself and for others.
It is enough to save the column names before "adding" the new column:
df = DataFrame(
colour = ["green","blue"],
shape = ["circle", "triangle"],
border = ["dotted", "line"]
)
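dfnames = names(df) # column names before the new column is added (a Vector{String} on DataFrames.jl 1.x)
df[!, :area] = [1,2] # older versions accepted df[:area] = [1,2]
df = df[:, vcat(dfnames[1:1], "area", dfnames[2:end])] # first column, then area, then the rest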
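The same reordering can probably be written without saving the names first. This is a sketch, assuming DataFrames.jl 1.x, where select accepts : as "all remaining columns" and skips the ones already listed:

```julia
using DataFrames

df = DataFrame(
    colour = ["green", "blue"],
    shape  = ["circle", "triangle"],
    border = ["dotted", "line"],
)

df.area = [1, 2]                     # appended as the last column

# colour first, then area, then every other column in its original order
df = select(df, :colour, :area, :)

names(df)
```

Unlike insertcols!, select builds a new DataFrame, so the result has to be assigned back to df.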
Upvotes: 0