Reputation: 329
I have 2 dataframes I would like to analyze.
df1:
City Time Day
5866128 Los Angeles 3.5 01/09/2019
5172728 New York 14 09/09/2019
4787428 Boston 9 10/09/2019
df2:
City Time Day
5866128 Los Angeles 3.5 01/09/2019
2478987 Denver 10 07/09/2019
5172728 New York 24 09/09/2019
4787428 Boston 4 10/09/2019
1478712 Austin 7 10/09/2019
I would like to create a third dataframe that would contain only the rows where df2['Time']-df1['Time'] != 0
based on the index and the rows that are only available in df2.
Desired output :
City Time Day
2478987 Denver 10 07/09/2019
5172728 New York 10 09/09/2019
4787428 Boston -5 10/09/2019
1478712 Austin 7 10/09/2019
I tried to play with numpy.where(), but I can't make it work to compare only the same index.
Thanks
Upvotes: 1
Views: 51
Reputation: 2757
df2['Time'] = df2['Time'].sub(df1['Time'], fill_value=0)
df2[df2.Time.ne(0)]
or
df2.assign(Time = df2['Time'].sub(df1['Time'], fill_value= 0)).loc[lambda x:x.Time.ne(0)]
Upvotes: 2
Reputation: 13387
Try this one:
df3 = (df1.join(df2, rsuffix="_2")
df3 = df3.loc[df["Time"]!=df["Time_2"]]
df3["Time"]=df3["Time_2"]-df3["Time"]
df3=df3.drop(["Time_2", "Day_2", "City_2"], axis=1)
Upvotes: 0