How to get the first item in a group by that meets a certain condition in pandas?

Question

I have the following code:

grouped_stats = stats.groupby(stats.last_mv.ne(stats.last_mv.shift()).cumsum())

last_mv is a decimal value. In the code above I am grouping by runs of consecutive equal last_mv values.
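To illustrate what that grouping key does, here is a minimal sketch. The sample values are made up for illustration; only the column name last_mv comes from the code above.

```python
import pandas as pd

# Hypothetical sample data; only the column name last_mv comes from the question.
stats = pd.DataFrame({"last_mv": [1.0, 1.0, 2.0, 2.0, 2.0, 1.0]})

# True wherever the value differs from the previous row. The first row is
# always True because it is compared with NaN.
changed = stats.last_mv.ne(stats.last_mv.shift())

# The running sum of those flags gives each consecutive run its own label.
run_id = changed.cumsum()

print(run_id.tolist())  # [1, 1, 2, 2, 2, 3]
```

Rows that share a label form one group, so the second run of 1.0 ends up in a different group from the first.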

I have tried two ways to obtain the first value that is 2.5% above the last_mv value of the first item in each group. In other words, having grouped by consecutive last_mv values, I want to take the first row of each group, multiply its last_mv by 1.025, and then find the first row within the group whose last_mv reaches this value (if one exists).

I tried:

grouped_stats.filter(lambda x: x.last_mv >= (x.first().last_mv * 1.025))

but, as I suspected, I can't access the first row of the group with .first().

I also tried:

grouped_stats.loc[ grouped_stats.last_mv >= (grouped_stats.first().last_mv * 1.025) ]

but I get the error: "Cannot access callable attribute 'loc' of 'DataFrameGroupBy' objects, try using the 'apply' method"

jezrael · Accepted Answer

I believe you need transform, which returns a Series the same size as the original DataFrame, filled with the first value of each group:

stats[ stats.last_mv >= (grouped_stats.last_mv.transform('first') * 1.025) ]
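The expression above keeps every row at or above the threshold, not just the first one per group. One possible way to narrow it down is sketched below. Note that when the groups are runs of identical last_mv values, every row equals its group's first value, so the comparison can only match zero or negative values. The sketch therefore assumes a hypothetical grouping column named run and made-up last_mv values that vary within each group.

```python
import pandas as pd

# Hypothetical data: `run` is an assumed group label, and last_mv is made up
# so that it varies within each group.
stats = pd.DataFrame({
    "run":     [1, 1, 1, 1, 2, 2, 2],
    "last_mv": [100.0, 101.0, 103.0, 104.0, 200.0, 201.0, 202.0],
})

grouped_stats = stats.groupby("run")

# Broadcast each group's first last_mv back to every row of that group.
first_mv = grouped_stats.last_mv.transform("first")

# Rows at or above 1.025 times their group's first value (the filter above).
matches = stats[stats.last_mv >= first_mv * 1.025]

# Keep only the earliest matching row per group; groups with no match drop out.
first_match = matches.groupby("run").head(1)

print(first_match)
```

With this data, group 1 yields the row with last_mv 103.0, and group 2 yields nothing because none of its rows reach 205.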
