PrinceZard
PrinceZard

Reputation: 25

vcat DataFrame columns based on multiple columns in Julia

I have 3 DataFrames, each containing 3 columns: A, B, and C.

using DataFrames    
common_data = Dict("A" => [1, 2, 3], "B" => [10, 20, 30])
df1 = DataFrame(merge(common_data, Dict("C" => [100, 200, 300])))
df2 = DataFrame(merge(common_data, Dict("C" => [400, 500, 600])))
df3 = DataFrame(merge(common_data, Dict("C" => [700, 800, 900])))

I consider columns A and B as indices and want to perform an inner join on 3 DataFrames based on column C. This should be done only when the values in columns A and B of each DataFrame are the same. The column of the final output should be [A,B,C_df1,C_df2,C_df3]. How can I achieve this?

Upvotes: 1

Views: 131

Answers (1)

Bogumił Kamiński
Bogumił Kamiński

Reputation: 69899

Just do innerjoin:

julia> innerjoin(df1, df2, df3, on=[:A, :B], makeunique=true)
3×5 DataFrame
 Row │ A      B      C      C_1    C_2
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     1     10    100    400    700
   2 │     2     20    200    500    800
   3 │     3     30    300    600    900

it will auto-generate different column names than you want though.

So you can either preprocess your data frames:

julia> innerjoin(rename.([df1, df2, df3], :C .=> [:C_df1, :C_df2, :C_df3])...,
                 on=[:A, :B])
3×5 DataFrame
 Row │ A      B      C_df1  C_df2  C_df3
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     1     10    100    400    700
   2 │     2     20    200    500    800
   3 │     3     30    300    600    900

or post-process the output:

julia> rename(innerjoin(df1, df2, df3, on=[:A, :B], makeunique=true),
              :C => :C_df1, :C_1 => :C_df2, :C_2 => :C_df3)
3×5 DataFrame
 Row │ A      B      C_df1  C_df2  C_df3
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     1     10    100    400    700
   2 │     2     20    200    500    800
   3 │     3     30    300    600    900

Upvotes: 1

Related Questions