Antonello

Reputation: 6423

How to insert a column in a Julia DataFrame at a specific position (without referring to existing column names)

I have a DataFrame in Julia with hundreds of columns, and I would like to insert a column after the first one.

For example in this DataFrame:

df = DataFrame(
  colour = ["green","blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

I would like to insert a column area after colour, but without referring specifically to shape and border (which in my real case are hundreds of different columns).

df.area = [1, 2]

In this example I can use the following, but it refers specifically to shape and border:

df = df[:, [:colour, :area, :shape, :border]] # with specific reference to shape and border names

Upvotes: 9

Views: 17299

Answers (3)

Uki D. Lucas

Reputation: 566

rows = nrow(df)       # number of rows in the DataFrame

insertcols!(df,       # DataFrame to be changed
    1,                # insert as column 1
    :Day => 1:rows,   # populate "Day" with 1, 2, 3, ...
    makeunique=true)  # if a column named Day already exists, name the new one Day_1
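
Applied to the data frame from the question, the same function can place area right after the first column without naming any of the other columns. This is only a minimal sketch, assuming DataFrames.jl 1.x, where insertcols! accepts a column name as the position together with an after keyword; the variable first_col is introduced here purely for illustration:

```julia
using DataFrames

df = DataFrame(
  colour = ["green", "blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

# Look up the name of the first column instead of hard-coding it.
first_col = propertynames(df)[1]

# Insert :area immediately after that column (assumes the `after` keyword is available).
insertcols!(df, first_col, :area => [1, 2], after=true)

names(df)  # column order after the insertion
```

The remaining columns keep their relative order, so this should scale to data frames with hundreds of columns.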

Upvotes: 6

张实唯

Reputation: 2862

Update: This function has changed. See @DiegoJavierZea's comment.

Well done on finding a workaround yourself, but there is a built-in function that is semantically clearer and possibly a little faster:

using DataFrames

df = DataFrame(
  colour = ["green","blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

insert!(df, 3, [1,2], :area)

Here 3 is the index the new column will have after the insertion, [1,2] is its content, and :area is its name. You can find more detailed documentation by typing ?insert! in the REPL after loading the DataFrames package.

It is worth noting that the ! is a part of the function name. It's a Julia convention to indicate that the function will mutate its argument.
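
In DataFrames.jl releases where insert! is no longer available for data frames, insertcols! takes over this role. The following is a hedged sketch, assuming DataFrames.jl 1.x; it uses position 2 so that the new column lands directly after the first one, as the question asks:

```julia
using DataFrames

df = DataFrame(
  colour = ["green", "blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)

# Insert :area so that it becomes column 2, i.e. right after the first column.
insertcols!(df, 2, :area => [1, 2])

names(df)  # column order after the insertion
```

As with insert!, the integer is the index the new column has after the insertion, so no existing column name needs to be mentioned.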

Upvotes: 18

Antonello

Reputation: 6423

While writing the question I also found a solution (as often happens). I am still posting the question here as a reminder for myself and for the benefit of others.

It is enough to save the column names before "adding" the new column:

df = DataFrame(
  colour = ["green","blue"],
  shape = ["circle", "triangle"],
  border = ["dotted", "line"]
)
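dfnames = propertynames(df)   # column names as Symbols, saved before adding the new column
df.area = [1, 2]              # the new column is appended at the end

# reorder: first column, then :area, then all the remaining original columns
df = df[:, vcat(dfnames[1:1], :area, dfnames[2:end])]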

Upvotes: 0
