Combining data with overlap

Question

I have two DataFrames:

import pandas as pd

data = {'First': ['Tom', 'Peter', 'Phil'], 'Last': ['Dwan', 'Laak', 'Ivey'],
        'Score': [101.5, 99, 105]}
df = pd.DataFrame(data, index=list('abc'))
print(df)

   First  Last  Score
a    Tom  Dwan  101.5
b  Peter  Laak   99.0
c   Phil  Ivey  105.0


data2 = {'First': ['Tom', 'Phil'], 'Last': ['Dwan', 'Ivey'], 'Score': [103.5, 101]}
df2 = pd.DataFrame(data2, index=list('fg'))
print(df2)

  First  Last  Score
f   Tom  Dwan  103.5
g  Phil  Ivey  101.0

I want to combine them where they overlap, for the net result:

   First  Last  Score  Score_new
a    Tom  Dwan  101.5      103.5
b  Peter  Laak   99.0        NaN
c   Phil  Ivey  105.0      101.0

Since the indexes don't match, the join has to be on the First and Last columns. Any suggestions?

DSM · Accepted Answer

If you don't care about preserving the indices, you could do something like

>>> df.merge(df2, on=["First", "Last"], how='outer', suffixes=('', '_new'))
   First  Last  Score  Score_new
0    Tom  Dwan  101.5      103.5
1  Peter  Laak   99.0        NaN
2   Phil  Ivey  105.0      101.0

[3 rows x 4 columns]
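
A note on `how`: with this data an outer join and a left join return the same rows, because every name in df2 also appears in df. If df2 could contain names that are missing from df, and the goal is to keep only df's rows as in the question's expected output, `how='left'` is probably the closer fit. A minimal sketch, re-creating the question's frames so it runs on its own:

```python
import pandas as pd

df = pd.DataFrame(
    {'First': ['Tom', 'Peter', 'Phil'], 'Last': ['Dwan', 'Laak', 'Ivey'],
     'Score': [101.5, 99, 105]},
    index=list('abc'))
df2 = pd.DataFrame(
    {'First': ['Tom', 'Phil'], 'Last': ['Dwan', 'Ivey'], 'Score': [103.5, 101]},
    index=list('fg'))

# Left join: keep every row of df and attach df2's Score where First/Last match.
# The suffixes leave df's column as 'Score' and rename df2's to 'Score_new'.
left = df.merge(df2, on=['First', 'Last'], how='left', suffixes=('', '_new'))
print(left)
```

Rows of df with no match in df2 (Peter Laak here) get NaN in Score_new.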

If you do, maybe you could play around with left/right_index, something like

>>> df.merge(df2, on=["First", "Last"], how='outer', suffixes=('', '_new'), right_index=True)
   First  Last  Score  Score_new
a    Tom  Dwan  101.5      103.5
b  Peter  Laak   99.0        NaN
c   Phil  Ivey  105.0      101.0

[3 rows x 4 columns]

but I don't know why those letters would be important.
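
As far as I know, newer pandas releases raise a MergeError when `on` is combined with `right_index=True`, so the call above may not run as written on a current install. One alternative that should keep df's labels is to move the index into a regular column before merging and restore it afterwards. This is a sketch under that assumption, not the method from the answer above; it relies on `reset_index()` naming the new column 'index', which is the default for an unnamed index:

```python
import pandas as pd

df = pd.DataFrame(
    {'First': ['Tom', 'Peter', 'Phil'], 'Last': ['Dwan', 'Laak', 'Ivey'],
     'Score': [101.5, 99, 105]},
    index=list('abc'))
df2 = pd.DataFrame(
    {'First': ['Tom', 'Phil'], 'Last': ['Dwan', 'Ivey'], 'Score': [103.5, 101]},
    index=list('fg'))

# Turn df's index (a, b, c) into an ordinary column named 'index' so it
# survives the merge, then make it the index again afterwards.
merged = (
    df.reset_index()
      .merge(df2, on=['First', 'Last'], how='left', suffixes=('', '_new'))
      .set_index('index')
)
merged.index.name = None  # drop the leftover 'index' label
print(merged)
```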
