Reputation: 817
I have a Pandas dataframe that I'm grouping by two columns: ID and Item. I am then keeping the last 5 occurrences in each group (maximum), and would finally like to calculate a ratio of the number of occurrences of one specific type.Specifically, I would like to calculate the count of 'U' for each ID-item divided by the total number of occurrences.
I am having trouble with the last step - I am not sure how to get from the general value counts below to a ratio for each ID-Item grouping. Any tips would be much appreciated - thank you!
d={'ID':[1,1,1,1,1,2,2,2,2,2],'Item': ['box','box','ball','ball','box','box','box','ball','box','ball'],'Direction':['D','U','U','D','U','U','D','D','D','D']}
df=pd.DataFrame(data=d)
z=company_item['revision_arrow'].value_counts()
groups=df.groupby(['ID','Item'])
max_count=5
lastfive=groups.head(max_count).reset_index(drop=True)
#count_series=lastfive['Direction'].value_counts()
count_series=groups['Direction'].value_counts()
Upvotes: 2
Views: 2571
Reputation: 51175
When you use value_counts
, you have the option to normalize the results. You can use this parameter, and then index the resulting DataFrame to only include the U
rows:
out = (df.groupby(['ID', 'Item'])
.Direction.value_counts(normalize=True)
.rename('ratio').reset_index())
out.loc[out.Direction.eq('U')]
ID Item Direction ratio
1 1 ball U 0.500000
2 1 box U 0.666667
6 2 box U 0.333333
Upvotes: 5
Reputation: 1433
If you want calculate only count('U') for each ID group divided into all count('U'), you can use apply with your defined function:
d={'ID':[1,1,1,1,1,2,2,2,2,2],'Item': ['box','box','ball','ball','box','box','box','ball','box','ball'],'Direction':['D','U','U','D','U','U','D','D','D','D']}
df=pd.DataFrame(data=d)
def func(df):
return len(df[df.Direction == 'U'])
df.groupby('ID').apply(func) / len(df[df.Direction == 'U'])
Output:
ID
1 0.75
2 0.25
You can group by ID and Item, if you need.
Upvotes: 0