Reputation: 49
This is a sample of a larger dataframe. I have a dataframe in a groupby of 4's
gr = df.groupby(np.arange(len(df.index))//4)
I'm trying to get the count of Letters for that unique letter in each group of 4's into a new column called Count.
I've tried a couple things but still cant get it into the end form. Any help would be appreciate. I've tried the df.duplicate, value_count, but the form of it won't come out as desired. Any ideas? Maybe a value_count and then mapping the Letters to the right places?
Also I want to avoid for loops. I'm looking more for a pandas dataframe trick or maybe a lambda x function that might work.
import pandas as pd
import numpy as np
data= [['X',1000],
['X',60],
['X',60],
['Y',40],
['X',60],
['Z',30],
['X',60],
['Z',30],
['X',60],
['Z',30],
['Y',40],
['S',25]]
df = pd.DataFrame(data,columns=['Letter','Cost'])
#groupby4s
gr = df.groupby(np.arange(len(df.index))//4)
The form I'm looking for:
Letter Cost Count
0 X 60 3
1 X 60 3
2 X 60 3
3 Y 40 1
4 X 60 2
5 Z 30 2
6 X 60 2
7 Z 30 2
8 X 60 1
9 Z 30 1
10 Y 40 1
11 S 25 1
Upvotes: 1
Views: 57
Reputation: 153460
Try using groupby
with transform
and count
:
df['Count'] = df.groupby([np.arange(len(df.index))//4, df['Letter']])['Letter']\
.transform('count')
Output:
Letter Cost Count
0 X 1000 3
1 X 60 3
2 X 60 3
3 Y 40 1
4 X 60 2
5 Z 30 2
6 X 60 2
7 Z 30 2
8 X 60 1
9 Z 30 1
10 Y 40 1
11 S 25 1
Upvotes: 1