Reputation: 117
I have two df. One maps values to IDs. The other one has multiple entries of these IDs. I want to have a df with the first dataframe with the values assigned to the respective IDs.
df1 =
Val1 Val2 Val3
x 1000 2 0
y 2000 3 9
z 3000 1 8
df2=
foo ID bar
0 something y a
1 nothing y b
2 everything x c
3 who z d
result=
foo ID bar Val1 Val2 Val3
0 something y a 2000 3 9
1 nothing y b 2000 3 9
2 everything x c 1000 2 0
3 who z d 3000 1 8
I've tried merge and join (obviously incorrectly) but I am getting a bunch of NaNs when I do that. It appears that I am getting NaNs on every alternate ID. I have also tried indexing both DFs by ID but that didn't seem to help either. I am obviously missing something that I am guessing is a core functionality but I can't get my head around it.
Upvotes: 1
Views: 91
Reputation: 17570
merge
and join
could both get you the result DataFrame you want. Since one of your DataFrames is indexed (by ID) and the other has just a integer index, merge
is the logical choice.
Merge:
# use ID as the column to join on in df2 and the index of df1
result = df2.merge(df1, left_on="ID", right_index=True, how="inner")
Join:
df2.set_index("ID", inplace=True) # index df2 in place so you can use join, which merges by index by default
result = df2.join(df1, how="inner") # join df1 by index
Upvotes: 3