Reputation: 123
I have two data frames: df_1 with shape (2000*4) and df_2 with shape (69*4). df_1 has one row per minute for 2000 minutes, whereas df_2 has data only for certain minutes (69 data points spread over the same 2000 minutes). I want to merge them on the DateTime index so that the final data frame has shape (2000*8), with NaN in the df_2 columns wherever df_2 has no row for that minute.
df_1
Datetime X1 X2 X3 X4
15/1/2020 08:01:00 1 2 3 4
15/1/2020 08:02:00 5 6 7 8
15/1/2020 08:03:00 9 10 11 12
15/1/2020 08:04:00 13 14 15 16
.
.
15/1/2020 23:59:00 17 18 19 20
df_2
Datetime Y1 Y2 Y3 Y4
15/1/2020 08:01:00 A B C D
15/1/2020 09:30:00 E F G H
15/1/2020 15:03:00 I J K L
15/1/2020 18:04:00
.
.
15/1/2020 23:59:00 M N O p
output
Datetime X1 X2 X3 X4 Y1 Y2 Y3 Y4
15/1/2020 08:01:00 1 2 3 4 A B C D
15/1/2020 08:02:00 5 6 7 8 NaN NaN NaN NaN
15/1/2020 08:03:00 9 10 11 12 NaN NaN NaN NaN
15/1/2020 08:04:00
15/1/2020 09:30:00
15/1/2020 15:03:00
15/1/2020 18:04:00
.
.
15/1/2020 23:59:00 17 18 19 20 M N O p
Upvotes: 0
Views: 1252
Reputation: 18367
You can perform a join or a concat. Since join is already covered in the comments, I'll use pd.concat():
final_df = pd.concat([df_1,df_2],axis=1,join='outer')
Here's an example:
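import pandas as pd
# df1 has a row for every label; df2 only has rows for some of them
df1 = pd.DataFrame({'index':['A','B','C','D','E','F'],"A":[1,2,3,4,5,6]}).set_index('index')
df2 = pd.DataFrame({'index':['B','D','F'],"B":[20,30,40]}).set_index('index')
# Outer join on the index: labels missing from df2 are filled with NaN
df_output = pd.concat([df1,df2],axis=1,join='outer')
print(df_output)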
Output:
   A     B
A  1   NaN
B  2  20.0
C  3   NaN
D  4  30.0
E  5   NaN
F  6  40.0
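The same idea applied to a datetime index might look like the sketch below. The frames here are small hypothetical stand-ins for df_1 and df_2 (5 minutes instead of 2000, two columns each), and it assumes both frames already use a DatetimeIndex. It also shows the join alternative mentioned above.

```python
import pandas as pd

# Hypothetical stand-in for df_1: one row per minute.
idx_1 = pd.date_range("2020-01-15 08:01:00", periods=5, freq="min")
df_1 = pd.DataFrame({"X1": range(5), "X2": range(5, 10)}, index=idx_1)

# Hypothetical stand-in for df_2: rows only at some of those minutes.
idx_2 = idx_1[[0, 3]]
df_2 = pd.DataFrame({"Y1": ["A", "E"], "Y2": ["B", "F"]}, index=idx_2)

# Left join keeps every row of df_1; minutes absent from df_2 become NaN.
joined = df_1.join(df_2, how="left")

# Outer concat along the columns aligns the two frames on the index as well.
concatenated = pd.concat([df_1, df_2], axis=1, join="outer")

print(joined)
```

If the Datetime column is still stored as strings such as 15/1/2020 08:01:00, it would need converting first, for example with pd.to_datetime(..., dayfirst=True), and setting as the index before joining. Note that the outer concat only keeps the row count of df_1 when every timestamp in df_2 also appears in df_1; the left join keeps it regardless.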
Upvotes: 2