Eli Turasky
Eli Turasky

Reputation: 1061

Merge two different dataframes with similar values but different dates pandas

I have two dataframes that look like this:

df1
Date    ScoreMean 06 Jul
0   2020-07-06  6.818316
1   2020-07-07  7.644018
2   2020-07-08  9.076809
3   2020-07-09  8.409158
4   2020-07-10  6.788774
5   2020-07-11  7.302806
6   2020-07-12  7.051699
7   2020-07-13  7.485968
8   2020-07-14  7.759232
9   2020-07-15  7.908419
10  2020-07-16  8.316100
11  2020-07-17  8.830133
12  2020-07-18  8.310192
13  2020-07-19  8.135894

df2
Date    ScoreMean 08 Jul
0   2020-07-08  7.844904
1   2020-07-09  6.565731
2   2020-07-10  6.791728
3   2020-07-11  6.706056
4   2020-07-12  6.545052
5   2020-07-13  7.163959
6   2020-07-14  8.199409
7   2020-07-15  8.115214
8   2020-07-16  8.234860
9   2020-07-17  8.370753
10  2020-07-18  8.036928
11  2020-07-19  7.927622
12  2020-07-20  7.890694
13  2020-07-21  7.989660

I tried pd.merge(df1, df2, on="Date"), but this produces:

    Date    ScoreMean 06 Jul    ScoreMean 08 Jul
0   2020-07-08  9.076809    7.844904
1   2020-07-09  8.409158    6.565731
2   2020-07-10  6.788774    6.791728
3   2020-07-11  7.302806    6.706056
4   2020-07-12  7.051699    6.545052
5   2020-07-13  7.485968    7.163959
6   2020-07-14  7.759232    8.199409
7   2020-07-15  7.908419    8.115214
8   2020-07-16  8.316100    8.234860
9   2020-07-17  8.830133    8.370753
10  2020-07-18  8.310192    8.036928
11  2020-07-19  8.135894    7.927622

However, what I want is for the tail ends of the dates to show up in the new dataframe along with the associated data. For example, in this case the new dataframe should contain 2020-07-06 and 2020-07-07 at the top, and 2020-07-20 and 2020-07-21 at the bottom. In days where the dates do not overlap, I would like the value to be NaN or zero. How would I accomplish this using pandas?

Upvotes: 0

Views: 37

Answers (2)

Balaji Ambresh
Balaji Ambresh

Reputation: 5037

Here you go:

df1.merge(df2, on='Date', how='outer')                                  

Output:

          Date  ScoreMean 06 Jul  ScoreMean 08 Jul
0   2020-07-06               6.8               NaN
1   2020-07-07               7.6               NaN
2   2020-07-08               9.1               7.8
3   2020-07-09               8.4               6.6
4   2020-07-10               6.8               6.8
5   2020-07-11               7.3               6.7
6   2020-07-12               7.1               6.5
7   2020-07-13               7.5               7.2
8   2020-07-14               7.8               8.2
9   2020-07-15               7.9               8.1
10  2020-07-16               8.3               8.2
11  2020-07-17               8.8               8.4
12  2020-07-18               8.3               8.0
13  2020-07-19               8.1               7.9
14  2020-07-20               NaN               7.9
15  2020-07-21               NaN               8.0

Upvotes: 1

Dan
Dan

Reputation: 45741

use the how argument of merge set to "outer":

pd.merge(df1, df2, on="Date", how="outer")

To understand the how arugment, read up about types of SQL joins.

Upvotes: 1

Related Questions