Pandas way to check if an event has already happened in Python

Question

Suppose that we have a Pandas DataFrame (df) as:

Is there a quick and elegant way of getting only the rows where the nickname has never had an event equal to 1? For example:

nickname	event	Date
B	0	2020-01-02
B	0	2020-01-03

I came up with the following code to solve this problem

df_group = df.groupby('nickname')['event'].max()
df_never = df_group[df_group == 0]
df_only_never = df[df['nickname'].isin(df_never.index)]

I have the feeling that there is a one-line way of doing this with Pandas, but I'm not sure how to do it.

SeaBean · Accepted Answer

Your code is close. Instead of .max(), which aggregates each group down to a single row, you can use .transform('max'), which returns a result with the same length and index as the original DataFrame, where every row holds its group's maximum value (either 0 or 1 for the whole group).

Then, like you did, check these values for equality with 0 to get a boolean index.

Finally, pass this boolean index to .loc to select the rows of the group(s) whose events are all 0, as follows:

df_out = df.loc[df.groupby('nickname')['event'].transform('max') == 0]

Result:

print(df_out)

  nickname  event        Date
2        B      0  2020-01-02
3        B      0  2020-01-03
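
To try this end to end, here is a minimal sketch. The sample DataFrame is an assumption, since the original input table is not shown in the question; it is only chosen so that nickname B is the one group with no 1 event. The names group_max, row_max and df_filter are also made up for illustration. The sketch contrasts .max() with .transform('max') and then shows groupby(...).filter(...) as an alternative one-liner, which is a different technique from the one above.

```python
import pandas as pd

# Hypothetical sample data (not the asker's original table).
df = pd.DataFrame({
    "nickname": ["A", "A", "B", "B", "C"],
    "event": [0, 1, 0, 0, 1],
    "Date": ["2020-01-01", "2020-01-04", "2020-01-02", "2020-01-03", "2020-01-05"],
})

# .max() aggregates: one value per nickname, indexed by nickname.
group_max = df.groupby("nickname")["event"].max()

# .transform('max') broadcasts the group maximum back to every row,
# so the result lines up with df's own index.
row_max = df.groupby("nickname")["event"].transform("max")

# Keep only rows whose group maximum is 0 (the event never happened).
df_out = df.loc[row_max == 0]

# Alternative: filter() keeps whole groups for which the function returns True.
df_filter = df.groupby("nickname").filter(lambda g: g["event"].max() == 0)

print(df_out)
```

Both approaches select the same rows here. The filter() version calls a Python function once per group, so it may be slower than transform('max') when there are many groups.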
