Reputation: 181

Python: Percent of Applications/Any record to their User Analysis

Good afternoon all, i have a dataframe below.

UserId  Application
    1       apple
    1       orange
    1       apple
    1       pear
    2       apple
    2       orange
    2       pear
    2       grapefruit
    3       apple
    3       grapefruit
    3       apple
    1       apple

I am trying to create a list that counts every unique application to the percent of UserIDs that have them. As an example of the output the table is below

Application    Percentage
apple              100
orange             66
pear               66 
grapefruit         66

This output is telling me that for every user, apple appears 100 percent of the time orange appears 66 percent of the time. etc etc but somehow i cannot get this to work.

My code below works but produces 3.0 as a value.

dfsearch['Percentage'] = (len(dfsearch.Application.value_counts())/len(dfsearch.UserID.value_counts()))
dfsearch

this is probably incorrect because its not a list, but that is why i need help :)

Upvotes: 0

Answers (2)

jxc

Reputation: 13998

Use groupby() + nunique():

dfsearch.groupby("Application").UserID.nunique()/dfsearch.UserID.nunique()
#Application
#apple         1.000000
#grapefruit    0.666667
#orange        0.666667
#pear          0.666667
#Name: UserId, dtype: float64

Upvotes: 0

perl

Reputation: 9941

You can start with dropping duplicate records with drop_duplicates, then call value_counts, divide by the number of unique users and multiply by 100:

x = df.drop_duplicates()['Application'].value_counts() / len(df['UserId'].unique()) * 100
x

Output:

apple         100.000000
pear           66.666667
grapefruit     66.666667
orange         66.666667
Name: Application, dtype: float64

And then convert it to DataFrame:

x.astype(int).to_frame('Percentage').rename_axis('Application').reset_index()

Output:

  Application  Percentage
0       apple         100
1        pear          66
2  grapefruit          66
3      orange          66

Upvotes: 3

Python: Percent of Applications/Any record to their User Analysis

Answers (2)

Related Questions