How to identify a specific occurrence across two rows and calculate the count

Question

Let's say I have this pandas dataframe:

id | userid | type 
1  | 20     | a  
2  | 20     | a
3  | 20     | b
4  | 21     | a  
5  | 21     | b
6  | 21     | a
7  | 21     | b
8  | 21     | b

I want to obtain the number of times 'b follows a' for each user, and obtain a new dataframe like this:

userid | b_follows_a
20     | 1
21     | 2

I know I can do this using a for loop. However, I wonder if there is a more elegant solution.

akuiper · Accepted Answer

Within each user's group, you can use shift(-1) to look at the next row, combine the two conditions (current row is a, next row is b) with a vectorized &, and then count the True values with sum():

df.groupby('userid')['type'].apply(lambda x: ((x == "a") & (x.shift(-1) == "b")).sum()).reset_index(name='b_follows_a')

#   userid  b_follows_a
#0      20            1
#1      21            2
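
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and does the same count without a Python-level lambda, by shifting within each group first. The variable names next_type, is_a_then_b and result are just illustrative, and it assumes the rows are already in the desired order within each user (as the id column suggests).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, 7, 8],
    "userid": [20, 20, 20, 21, 21, 21, 21, 21],
    "type": ["a", "a", "b", "a", "b", "a", "b", "b"],
})

# Shift within each user so the last row of one user is never
# paired with the first row of the next user.
next_type = df.groupby("userid")["type"].shift(-1)

# True where a row is "a" and the same user's next row is "b".
is_a_then_b = (df["type"] == "a") & (next_type == "b")

# Summing booleans counts the True values per user.
result = (
    is_a_then_b.groupby(df["userid"])
    .sum()
    .reset_index(name="b_follows_a")
)
print(result)
```

On the sample data this gives 1 for userid 20 and 2 for userid 21, matching the expected output in the question. If the rows might not be ordered, sorting by userid and id first would be a reasonable precaution.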
