Reputation: 450
I have two DataFrames, as shown in the code below.
import datetime
import pandas as pd
Key_DF = pd.DataFrame({'TC': {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'F', 5: 'G'}, 'D_time': {0: '2/5/2021 10:00', 1: '2/5/2021 22:00', 2: '2/7/2021 11:35', 3: '2/8/2021 11:35', 4: '2/9/2021 11:35', 5: '2/10/2021 11:35'}, 'FName': {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'A', 5: 'B'}})
Main_DF = pd.DataFrame({'Test Case': {0: 'A', 1: 'A', 2: 'B', 3: 'D', 4: 'D', 5: 'G', 6: 'G'}, 'Timestamp': {0: datetime.datetime(2021, 2, 5, 9, 34, 25), 1: datetime.datetime(2021, 2, 5, 14, 34, 25), 2: 'Wed Nov 25 17:30:12 2020', 3: '11/30/2020 11:48:38 AM', 4: 'Mon Feb 8 13:39:00 2021', 5: 'Mon Feb 9 15:42:50 2021', 6: 'Wed Dec 2 14:56:26 2020'}})
Key_DF.D_time = pd.to_datetime(Key_DF.D_time)
Main_DF.Timestamp = pd.to_datetime(Main_DF.Timestamp)
print(Key_DF)
print(Main_DF)
I need to do the following operations with "Main_DF":
1. Take each test case and its time from Key_DF (Ex: "1-1.1" & "2/5/2021 10:00").
2. Find the rows for that test case in Main_DF.
3. Where Main_DF.Timestamp > Key_DF.D_time, remove the row from Main_DF.
The final output should be Main_DF with the Main_DF.Timestamp > Key_DF.D_time condition applied.
I am OK with any format for the Timestamp column.
Upvotes: 0
Views: 119
Reputation: 62403
Make sure both time columns have the datetime64[ns] dtype; dtypes can be checked with .info().
'TC' and 'Test Case' hold the keys to merge on. So that the 'TC' column isn't added as a separate column when merging the dataframes, it is renamed to 'Test Case'.
Keep rows where df.Timestamp <= df.D_time or df.D_time.isna().
df.D_time.isna() keeps rows where the 'Timestamp' column has no matching time in the 'D_time' column.
Removing rows where Main_DF.Timestamp > Key_DF.D_time is the same as keeping rows where df.Timestamp <= df.D_time.
Both rows for 'G' meet this condition, so both are kept.
Key_DF is matched on the 'TC' column, as shown in the OP. No use is stated for the 'FName' column, so it is disregarded.
# merge the two dataframes
df = Main_DF.merge(Key_DF[['TC', 'D_time']].rename(columns={'TC': 'Test Case'}), on='Test Case', how='left')
# display(df)
Test Case Timestamp D_time
0 A 2021-02-05 09:34:25 2021-02-05 10:00:00
1 A 2021-02-05 14:34:25 2021-02-05 10:00:00
2 B 2020-11-25 17:30:12 2021-02-05 22:00:00
3 D 2020-11-30 11:48:38 2021-02-08 11:35:00
4 D 2021-02-08 13:39:00 2021-02-08 11:35:00
5 G 2021-02-09 15:42:50 2021-02-10 11:35:00
6 G 2020-12-02 14:56:26 2021-02-10 11:35:00
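As a side note on how='left': every row of the left frame is kept, and a test case with no entry in the key frame gets NaT in 'D_time'. The sample data has no such row, so the sketch below uses small stand-in frames with a hypothetical 'Z' test case that is not in the OP. It also checks the dtypes of the merged frame.

```python
import pandas as pd

# Stand-ins for Key_DF (already renamed) and Main_DF; 'Z' is a hypothetical
# test case with no entry in the key frame
key = pd.DataFrame({
    "Test Case": ["A", "D"],
    "D_time": pd.to_datetime(["2021-02-05 10:00", "2021-02-08 11:35"]),
})
main = pd.DataFrame({
    "Test Case": ["A", "Z"],
    "Timestamp": pd.to_datetime(["2021-02-05 09:34:25", "2021-01-01 00:00:00"]),
})

# A left merge keeps both rows of `main`, matched or not
merged = main.merge(key, on="Test Case", how="left")

# Both time columns should be datetime64 before they are compared
print(merged.dtypes)

# True marks rows with no matching D_time (NaT)
print(merged["D_time"].isna().tolist())
```

The unmatched 'Z' row is why the filter step also needs an isna() check: a comparison against NaT evaluates to False.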
# filter the dataframe to keep rows where Timestamp is <= D_time
df = df[(df.Timestamp <= df.D_time) | df.D_time.isna()].drop(columns=['D_time']).reset_index(drop=True)
# display(df)
Test Case Timestamp
0 A 2021-02-05 09:34:25
1 B 2020-11-25 17:30:12
2 D 2020-11-30 11:48:38
3 G 2021-02-09 15:42:50
4 G 2020-12-02 14:56:26
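The filter step can also be wrapped in a small helper. This is only a sketch: the function name keep_on_or_before and its parameters are made up for illustration, and the 'Z' row with a missing 'D_time' is hypothetical, added to show what the isna() part of the mask does.

```python
import pandas as pd


def keep_on_or_before(df, ts_col="Timestamp", limit_col="D_time"):
    """Keep rows where ts_col <= limit_col, plus rows that have no limit."""
    # Comparing against NaT gives False, so unmatched rows need the isna() term
    mask = (df[ts_col] <= df[limit_col]) | df[limit_col].isna()
    return df[mask].drop(columns=[limit_col]).reset_index(drop=True)


# A merged frame like the one above; the 'Z' row (no D_time) is hypothetical
merged = pd.DataFrame({
    "Test Case": ["A", "A", "Z"],
    "Timestamp": pd.to_datetime(
        ["2021-02-05 09:34:25", "2021-02-05 14:34:25", "2021-01-01 00:00:00"]
    ),
    "D_time": pd.to_datetime(["2021-02-05 10:00:00", "2021-02-05 10:00:00", None]),
})

result = keep_on_or_before(merged)
print(result)
```

Here the second 'A' row is dropped because its Timestamp is later than D_time, while the 'Z' row is kept because it has no D_time to compare against.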
Upvotes: 1