Reputation: 515
I've got one large data frame with columns like so:
TimeHrs A B SeqNum
I want to figure out if the max of A for each group grouped by the SeqNum happens within +/-2 seconds of the max of B. For now, I've got a groupby going for each to obtain the rows of the max values of A and B like so:
idxmaxA = df.loc[df.groupby(['SeqNum'])['A'].idxmax()]
idxmaxB = df.loc[df.groupby(['SeqNum'])['B'].idxmax()]
This gives me two data frames with the indexes/rows of the max of A and B for each group. How would I go about comparing the results to create a column of boolean values testing if the max of A is within 2 seconds of the max of B that I can add to my original dataframe? In my case TimeHrs is 1 second data, so each row is 1 second from the last. Because of this, I thought maybe comparing the index values for each group in idxmaxA falls within +/- 2 rows of the idxmaxB, but I'm not sure how to easily do that.
Upvotes: 0
Views: 221
Reputation: 164673
Is there any reason why you can't perform this in 2 steps?
In the below example, you find the max of each column groupwise, then perform your comparison using pd.Series.between
.
import pandas as pd
df = pd.DataFrame([[8, 10, 1], [1, 4, 1], [5, 8, 1],
[5, 15, 2], [16, 10, 2], [-5, 3, 2],
[5, 10, 3], [15, 4, 3], [-5, 12, 3]],
columns=['A', 'B', 'SeqNum'])
res = df.groupby('SeqNum', as_index=False).max()
res['test'] = res['B'].between(res['A']-2, res['A']+2)
# SeqNum A B test
# 0 1 8 10 True
# 1 2 16 15 True
# 2 3 15 12 False
Upvotes: 1