Loly

Reputation: 135

Calculating groups in Dataframe

I have a data frame containing data about visits to a particular site. Here's a sample:

visitsite userid timeonsite
facebook.com kahy68 91973
facebook.com jjsga12 2895

I need to create cohorts (groups) based on the timeonsite column (given in seconds). I also need to calculate how many users are in each cohort and what their share of all users is.

An output example:

visitdurationcohort 1000-2000 2000-3000 3000-5000 5000+
usersquantity 1383 9973 3899 684
shareofusers 7% 60% 30% 3%

I found examples of how to create cohorts from a specific value (a month of registration, for example), but not how to create cohorts from a range.

I would appreciate any help :)

Upvotes: 0

Views: 49

Answers (1)

braml1

Reputation: 584

As per @raymond-kwok:

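import numpy as np
import pandas as pd

# Bin edges in seconds; np.inf keeps the last bin open-ended (5000+)
bins = [0, 1000, 2000, 3000, 5000, np.inf]
# Count users per bin; observed=False keeps empty bins in the result
df1 = df.groupby(pd.cut(df["timeonsite"], bins), observed=False)[["userid"]].count()
df1["shareofusers"] = df1["userid"] / df1["userid"].sum()
df1 = df1.T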
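
For anyone who wants to try this end to end, here is a minimal sketch of the same pd.cut idea with named cohorts. The sample rows and the user IDs u1..u6 are made up for illustration; only the column names and cohort labels come from the question. It assumes each cohort includes its lower edge (right=False) and that each user should be counted once per cohort (nunique), which may or may not match what you need.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "visitsite": ["facebook.com"] * 6,
    "userid": ["u1", "u2", "u3", "u4", "u5", "u6"],
    "timeonsite": [1500, 2895, 2100, 4000, 91973, 1200],
})

# Bin edges in seconds; np.inf leaves the last cohort open-ended.
bins = [1000, 2000, 3000, 5000, np.inf]
labels = ["1000-2000", "2000-3000", "3000-5000", "5000+"]

# right=False makes each bin include its lower edge: [1000, 2000), [2000, 3000), ...
df["visitdurationcohort"] = pd.cut(
    df["timeonsite"], bins=bins, labels=labels, right=False
)

# Distinct users per cohort; observed=False keeps cohorts with zero users.
users = df.groupby("visitdurationcohort", observed=False)["userid"].nunique()

# One row per metric, one column per cohort, as in the requested output.
summary = pd.DataFrame({
    "usersquantity": users,
    "shareofusers": (users / users.sum() * 100).round(1),
}).T
print(summary)
```

Note that with these bin edges any visit shorter than 1000 seconds falls outside every cohort and is left out of the counts; start the bins at 0 if those visits should be included.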

Upvotes: 1
