Reputation: 31
In Julia I have two dataframes, and I want to return a dataframe containing the rows of the first dataframe whose Fund column holds a fund that also appears in the second dataframe. A simple example would be:
df1 = DataFrame(Fund = ["AAA", "AAA", "BBB", "CCC", "DDD"], Purchase = [1000, 500, 600, 800,900])
df2 = DataFrame(Fund = ["AAA", "CCC"], Totals =[1000,200])
and what I would like to return is:
df3 = DataFrame(Fund = ["AAA", "AAA","CCC"], Purchase = [1000, 500, 800])
I have about 10 columns in df1 and a few thousand rows. The "Fund" column in df2 will always contain unique funds and will always be a subset of df1.Fund, and it may contain more than 1,000 rows.
I am new to Julia and have created the function below and was wondering if there was a better way of solving this.
function newtransactions(df1, df2)
    res = DataFrame([Any[], Any[]], ["Fund", "Purchase"])
    for t ∈ df2.Fund
        res = append!(res, subset(df1, :Fund => X -> (X .== t)))
    end
    return res
end
Upvotes: 2
Views: 374
Reputation: 42244
You need to perform an innerjoin:
julia> innerjoin(df1, df2, on=:Fund)
3×3 DataFrame
 Row │ Fund    Purchase  Totals
     │ String  Int64     Int64
─────┼──────────────────────────
   1 │ AAA         1000    1000
   2 │ AAA          500    1000
   3 │ CCC          800     200
Note that there are also leftjoin and rightjoin if you instead need to keep all rows from the first or the second table, respectively.
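Since the desired df3 in the question has only df1's columns (no Totals), a semijoin may be closer to what was asked. The following is a minimal sketch using the example data from the question; the names df3, df3_alt and funds are only illustrative. It also shows a filter-based variant that does the same selection without any join, on the assumption that only membership in df2.Fund matters.

```julia
using DataFrames

df1 = DataFrame(Fund = ["AAA", "AAA", "BBB", "CCC", "DDD"],
                Purchase = [1000, 500, 600, 800, 900])
df2 = DataFrame(Fund = ["AAA", "CCC"], Totals = [1000, 200])

# semijoin keeps the rows of df1 whose Fund occurs in df2
# and returns only df1's columns (Totals is not added).
df3 = semijoin(df1, df2, on = :Fund)

# Join-free variant: collect the funds into a Set for fast membership
# tests, then keep the rows of df1 whose Fund is in that Set.
funds = Set(df2.Fund)
df3_alt = filter(:Fund => in(funds), df1)
```

Both versions avoid growing a result inside a loop, which should matter more as df2 approaches the thousand-plus rows mentioned in the question.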
Upvotes: 3