Dilantha

Reputation: 1634

How to merge multiple DataFrames in Python

I have a list of DataFrames and I'm trying to merge them into one using the _id column.

The list of DataFrames (df) looks like this:

           _id    ... calculate_from calculate_to
0  thyx07y1bg8   ...     2000-03-31   2004-12-31
1   ofr7s6wf1j   ...     2000-03-31   2004-12-31
    
[2 rows x 4 columns],            _id    ... calculate_from calculate_to
0  3sw1btgso6t   ...     2000-03-31   2004-12-31
1   ofr7s6wf1j   ...     2000-03-31   2004-12-31
[2 rows x 4 columns],            _id    ... calculate_from calculate_to

The result I'm expecting is:

           _id    ... calculate_from calculate_to
0  thyx07y1bg8    ...     2000-03-31   2004-12-31
1   ofr7s6wf1j    ...     2000-03-31   2004-12-31
2  3sw1btgso6t    ...     2000-03-31   2004-12-31
[3 rows x 4 columns]

I have tried

pd.concat(df)

and

reduce(lambda left, right: pd.merge(left, right, on=["_id"], how="inner"),df)

but couldn't get the result I want. Any ideas? Thanks.

Upvotes: 1

Views: 78

Answers (1)

BENY

Reputation: 323226

This looks more like a job for combine_first, or for concat followed by drop_duplicates:

out = pd.concat([df1, df2]).drop_duplicates('_id')

Or

out = df1.set_index('_id').combine_first(df2.set_index('_id')).reset_index()
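
Since the question has a list rather than two named frames, both ideas can be applied to the whole list. For context, pd.concat on its own keeps the repeated _id row, and an inner merge keeps only the _id values present in every frame, which is probably why neither attempt gave the expected output. Below is a minimal sketch, assuming the list is called frames (a hypothetical name) and using made-up sample rows modelled on the question; the two columns elided in the question are left out.

```python
from functools import reduce

import pandas as pd

# Hypothetical sample data shaped like the frames in the question.
# The real frames have four columns; the elided ones are omitted here.
frames = [
    pd.DataFrame({
        "_id": ["thyx07y1bg8", "ofr7s6wf1j"],
        "calculate_from": ["2000-03-31", "2000-03-31"],
        "calculate_to": ["2004-12-31", "2004-12-31"],
    }),
    pd.DataFrame({
        "_id": ["3sw1btgso6t", "ofr7s6wf1j"],
        "calculate_from": ["2000-03-31", "2000-03-31"],
        "calculate_to": ["2004-12-31", "2004-12-31"],
    }),
]

# Option 1: stack every frame, then keep the first row seen for each _id.
stacked = (
    pd.concat(frames, ignore_index=True)
    .drop_duplicates(subset="_id")
    .reset_index(drop=True)
)

# Option 2: fold combine_first over the list. Values from earlier frames
# take priority; later frames only fill in missing rows or missing values.
indexed = [frame.set_index("_id") for frame in frames]
combined = reduce(lambda left, right: left.combine_first(right), indexed).reset_index()

print(stacked)
print(combined)
```

With this sample data both options give three rows, one per _id. The concat version keeps the order of first appearance, while combine_first works on the union of the indexes, so its rows may come back in a different order. If rows sharing an _id can differ in the other columns, the two options are not equivalent: drop_duplicates keeps the whole first row, whereas combine_first fills gaps value by value.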

Upvotes: 2
