Lawrence

Reputation: 165

How to group based on occurrences across months?

I have the following dataframe:

userid month
user1 jan
user2 jan
user3 jan
user1 feb
user3 feb
user1 march

If a user appears in more than 2 months, I want to mark them as active, otherwise as not active. The desired output is:

userid month active
user1 jan,feb,march true
user2 jan false
user3 jan,feb false

How can I do this with pandas? Pardon me for not having any starting code, as I am totally unsure where to begin. I would appreciate help for a newbie here.

Upvotes: 0

Views: 13

Answers (1)

jezrael

Reputation: 863031

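Use GroupBy.agg with named aggregation: join the months into one string, and use a lambda function to test whether the group has more than 2 rows: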

df = df.groupby('userid').agg(month=('month', ','.join),
                              active=('month', lambda x: len(x) > 2))
print(df)
                month  active
userid                       
user1   jan,feb,march    True
user2             jan   False
user3         jan,feb   False

Or count the rows in each group with size, then convert the count to a boolean:

df = (df.groupby('userid')
        .agg(month=('month', ','.join), active=('month', 'size'))
        .assign(active=lambda x: x['active'].gt(2)))
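For completeness, here is a self-contained sketch that rebuilds the sample data from the question so it can be run as-is. It makes one change that is my assumption, not part of the approach above: it counts distinct months with nunique instead of counting rows, which only matters if the same user/month pair can appear more than once. The n_months helper column and the out variable are illustrative names. Note that the joined month string would still repeat any duplicated rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "userid": ["user1", "user2", "user3", "user1", "user3", "user1"],
    "month": ["jan", "jan", "jan", "feb", "feb", "march"],
})

out = (
    # sort=False keeps users in order of first appearance.
    df.groupby("userid", sort=False)
      .agg(
          month=("month", ",".join),       # months joined in row order
          n_months=("month", "nunique"),   # assumption: count distinct months, not rows
      )
      # A user is active only when seen in more than 2 distinct months.
      .assign(active=lambda x: x["n_months"].gt(2))
      .drop(columns="n_months")
      .reset_index()                        # turn userid back into a regular column
)
print(out)
```

With the sample data every user/month pair is unique, so this should give the same active flags as the size and len versions; reset_index is only there to match the column layout of the desired output in the question.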

Upvotes: 1
