enola_tri

Reputation: 25

Merging Pandas DataFrames by column

I have two data frames:

import pandas as pd

df1 = pd.DataFrame({'dateRep': ['2020-09-10', '2020-08-10', '2020-07-10',
                              '2020-24-03', '2020-23-03', '2020-22-03'],
                    'cases': [271, 321, 137, 
                              8, 0, 1],
                    'countriesAndTerritories': ['Kenya', 'Kenya','Kenya',
                                               'Uganda', 'Uganda', 'Uganda']})
df1

and

df2 = pd.DataFrame({'date': ['2020-15-02', '2020-16-02', '2020-17-02', 
                              '2020-08-10', '2020-07-10', '2020-06-10'],
                    'cases': [2.0, 0.0, 5.0, 
                              -3.7, -5.0, 0.0],
                    'country_refion': ['Kenya', 'Kenya','Kenya', 
                                       'Uganda', 'Uganda', 'Uganda']})
df2

I want to combine them into a single data frame, with the rows ordered by date:

df3 = pd.DataFrame({'date': ['2020-15-02', '2020-16-02', '2020-17-02',
                             '2020-22-03', '2020-23-03', '2020-24-03',
                             '2020-06-10', '2020-07-10', '2020-07-10', '2020-08-10', '2020-08-10', '2020-09-10'],
                    'cases': [2.0, 0.0, 5.0, 
                              1.0, 0.0, 8.0, 
                              0.0, -5.0, 137, -3.7, 321, 271],
                    'country_refion': ['Kenya', 'Kenya','Kenya',
                                       'Uganda', 'Uganda', 'Uganda',
                                       'Uganda', 'Uganda', 'Kenya', 'Uganda', 'Kenya', 'Kenya']})
df3

I tried the .join() and .concatenate() methods, but neither worked as expected. Thank you in advance for your help!

Upvotes: 0

Views: 32

Answers (1)

Mayank Porwal

Reputation: 34046

You can use pd.concat with Series.combine_first() (DataFrame.append, which did the same job, was removed in pandas 2.0):

In [297]: x = pd.concat([df1, df2], ignore_index=True)
In [299]: x.date = x.dateRep.combine_first(x.date)
In [301]: x.country_refion = x.country_refion.combine_first(x.countriesAndTerritories)

In [308]: x = x.sort_values('date').drop(['dateRep', 'countriesAndTerritories'], axis=1).reset_index(drop=True)

In [310]: x
Out[310]: 
    cases        date country_refion
0     0.0  2020-06-10         Uganda
1   137.0  2020-07-10          Kenya
2    -5.0  2020-07-10         Uganda
3   321.0  2020-08-10          Kenya
4    -3.7  2020-08-10         Uganda
5   271.0  2020-09-10          Kenya
6     2.0  2020-15-02          Kenya
7     0.0  2020-16-02          Kenya
8     5.0  2020-17-02          Kenya
9     1.0  2020-22-03         Uganda
10    0.0  2020-23-03         Uganda
11    8.0  2020-24-03         Uganda
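
An alternative sketch, not part of the session above: rename df1's columns to match df2 and stack the two frames with pd.concat. The string sort above puts 2020-06-10 before 2020-15-02, whereas df3 in the question is in calendar order. Assuming the dates are written year-day-month (an inference from values such as 2020-24-03), parsing them with pd.to_datetime gives a chronological sort key. The names df1_renamed, combined and _key are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'dateRep': ['2020-09-10', '2020-08-10', '2020-07-10',
                '2020-24-03', '2020-23-03', '2020-22-03'],
    'cases': [271, 321, 137, 8, 0, 1],
    'countriesAndTerritories': ['Kenya', 'Kenya', 'Kenya',
                                'Uganda', 'Uganda', 'Uganda'],
})
df2 = pd.DataFrame({
    'date': ['2020-15-02', '2020-16-02', '2020-17-02',
             '2020-08-10', '2020-07-10', '2020-06-10'],
    'cases': [2.0, 0.0, 5.0, -3.7, -5.0, 0.0],
    'country_refion': ['Kenya', 'Kenya', 'Kenya',
                       'Uganda', 'Uganda', 'Uganda'],
})

# Give df1 the same column names as df2 so the rows line up when stacked.
df1_renamed = df1.rename(columns={'dateRep': 'date',
                                  'countriesAndTerritories': 'country_refion'})

# Stack the rows; df2 goes first so its rows win ties in the stable sort below.
combined = pd.concat([df2, df1_renamed], ignore_index=True)

# Assumption: the strings are year-day-month. Parse them only to build a
# sort key, so the 'date' column itself keeps its original string form.
sort_key = pd.to_datetime(combined['date'], format='%Y-%d-%m')
combined = (combined
            .assign(_key=sort_key)
            .sort_values('_key', kind='mergesort')  # mergesort is stable
            .drop(columns='_key')
            .reset_index(drop=True))

print(combined)
```

Putting df2 first and using a stable sort keeps df2's row ahead of df1's when two rows share a date, which is the order df3 in the question shows.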

Upvotes: 1
