Stefano Potter
Stefano Potter

Reputation: 3577

Merging dataframes of different length

I have two data frames which look like this,

df1:

SP1          Count1
ANDGER        2
CARCRA        4
CAR           3
PANVIR        1

and df2:

SP2          Count2
CARCRA        7
CAR           6
PANVIR        4

I want to merge them together based on matching id's in the first column (SP1 in df1 and SP2 in df2) but I want to retain the id's that are only in one of the dataframes still.

My desired output is:

Species     Count1  Count2
ANDGER        2       NaN
CARCRA        4        7
CAR           3        6
PANVIR        1        4

I have tried:

df1.set_index('SP1')
df2.set_index('SP2')
pd.merge(df1, df2,left_index=True, right_index=True)

but it doesn't retain items that don't match.

Edit:

For what its worth the code that worked for me is:

pd.merge(df1,df2,left_index=True, right_index=True, how='outer')

Upvotes: 0

Views: 69

Answers (2)

Kathirmani Sukumar
Kathirmani Sukumar

Reputation: 10970

You can use pd.merge as follows

print pd.merge(df1,df2, left_on='SP1', right_on='SP2', how='outer')
      SP1  Count1     SP2  Count2
0  ANDGER       2     NaN     NaN
1  CARCRA       4  CARCRA       7
2     CAR       3     CAR       6
3  PANVIR       1  PANVIR       4

Better groupby if you have duplicate entries in the columns using which you are merging.

Upvotes: 1

Jianxun Li
Jianxun Li

Reputation: 24742

You can use pd.concat().

import pandas as pd

# your data
# ===========================
df1

      SP1  Count1
0  ANDGER       2
1  CARCRA       4
2     CAR       3
3  PANVIR       1

df2

      SP2  Count2
0  CARCRA       7
1     CAR       6
2  PANVIR       4

# processing
# ==========================
pd.concat([df1.set_index('SP1'), df2.set_index('SP2')], axis=1, join='outer')

        Count1  Count2
ANDGER       2     NaN
CAR          3       6
CARCRA       4       7
PANVIR       1       4

Upvotes: 1

Related Questions