TSFEN
TSFEN

Reputation: 49

Count of Items within a Groupby

This is a sample of a larger dataframe. I have a dataframe in a groupby of 4's

gr = df.groupby(np.arange(len(df.index))//4)

I'm trying to get the count of Letters for that unique letter in each group of 4's into a new column called Count.

I've tried a couple things but still cant get it into the end form. Any help would be appreciate. I've tried the df.duplicate, value_count, but the form of it won't come out as desired. Any ideas? Maybe a value_count and then mapping the Letters to the right places?

Also I want to avoid for loops. I'm looking more for a pandas dataframe trick or maybe a lambda x function that might work.

import pandas as pd
import numpy as np

data= [['X',1000],
['X',60],
['X',60],
['Y',40],
['X',60],
['Z',30],
['X',60],
['Z',30],
['X',60],
['Z',30],
['Y',40],
['S',25]]

df = pd.DataFrame(data,columns=['Letter','Cost'])

#groupby4s
gr = df.groupby(np.arange(len(df.index))//4)

The form I'm looking for:

    Letter  Cost  Count 
0        X    60      3
1        X    60      3
2        X    60      3
3        Y    40      1
4        X    60      2
5        Z    30      2
6        X    60      2
7        Z    30      2
8        X    60      1
9        Z    30      1
10       Y    40      1
11       S    25      1

Upvotes: 1

Views: 57

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

Try using groupby with transform and count:

df['Count'] = df.groupby([np.arange(len(df.index))//4, df['Letter']])['Letter']\
                .transform('count')

Output:

   Letter  Cost  Count
0       X  1000      3
1       X    60      3
2       X    60      3
3       Y    40      1
4       X    60      2
5       Z    30      2
6       X    60      2
7       Z    30      2
8       X    60      1
9       Z    30      1
10      Y    40      1
11      S    25      1

Upvotes: 1

Related Questions