Reputation: 107
I want to count the frequency of each element in a specific column of a pandas DataFrame and then tag each row with its occurrence number.
Most of the common solutions only show how to count the frequency of each element in a column, like here: count the frequency that a value occurs in a dataframe column
I have some basic code like this:
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'g2g', 'g2g', 'g2g', 'bar', 'bar', 'foo', 'bar'],
    'B': ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b'],
})
print(df)
which outputs:
A B
0 foo a
1 bar b
2 g2g a
3 g2g b
4 g2g b
5 bar b
6 bar a
7 foo a
8 bar b
Then, adding a per-group count:
df['freq'] = df.groupby('B')['B'].transform('count')
gives:
A B freq
0 foo a 4
1 bar b 5
2 g2g a 4
3 g2g b 5
4 g2g b 5
5 bar b 5
6 bar a 4
7 foo a 4
8 bar b 5
But what I want, after grouping by column 'B', is something like this:
A B freq_occurance
0 foo a 1
1 bar b 1
2 g2g a 2
3 g2g b 2
4 g2g b 3
5 bar b 4
6 bar a 3
7 foo a 4
8 bar b 5
That is, if the value 'a' in column 'B' has a frequency of 4, the first row where 'a' appears should be tagged 1, the second row containing 'a' should be tagged 2, and so on. The same logic applies to every unique value in column 'B'.
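To make the intended tagging concrete, here is a plain-Python sketch of the running count described above, without pandas. The helper name running_occurrence is made up for illustration only; it is not a pandas function.

```python
from collections import defaultdict


def running_occurrence(values):
    """Return, for each item, how many times it has appeared so far (1-based)."""
    seen = defaultdict(int)  # value -> number of times seen so far
    tags = []
    for value in values:
        seen[value] += 1
        tags.append(seen[value])
    return tags


# Column 'B' from the sample DataFrame
column_b = ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b']
occurrence_tags = running_occurrence(column_b)
print(occurrence_tags)  # [1, 1, 2, 2, 3, 4, 3, 4, 5]
```

The printed list matches the freq_occurance column in the desired output.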
Upvotes: 1
Views: 1080
Reputation: 19957
You can use transform and take the index (after reset_index) as the value, then add one, since the new index starts from 0.
df['freq2'] = df.groupby('B')['B'].transform(lambda x: x.reset_index().index).add(1)
A B freq freq2
0 foo a 4 1
1 bar b 5 1
2 g2g a 4 2
3 g2g b 5 2
4 g2g b 5 3
5 bar b 5 4
6 bar a 4 3
7 foo a 4 4
8 bar b 5 5
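To see why the reset_index trick works, it may help to look at a single group in isolation. The sketch below pulls out the 'a' rows by hand (the variable names group_a, positions and tags are only illustrative) and shows that reset_index swaps the original row labels for 0..n-1, which become the occurrence numbers after adding 1.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'g2g', 'g2g', 'g2g', 'bar', 'bar', 'foo', 'bar'],
    'B': ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b'],
})

# Rows where B == 'a'; they keep their original labels 0, 2, 6, 7
group_a = df.loc[df['B'] == 'a', 'B']

# After reset_index the frame is relabelled 0, 1, 2, 3 within the group
positions = group_a.reset_index().index

# Put the 1-based positions back on the original row labels
tags = pd.Series(positions + 1, index=group_a.index)
print(tags.tolist())       # [1, 2, 3, 4]
print(list(tags.index))    # [0, 2, 6, 7]
```

transform repeats this step for every group and aligns the results back to the original rows.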
Upvotes: 1
Reputation: 93191
cumcount is what you need:
df['freq_occurance'] = df.groupby('B').cumcount() + 1
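As a quick check, here is a self-contained sketch that rebuilds the sample frame from the question and applies cumcount. The column name freq_occurance is kept exactly as spelled in the question, and the expected list is the desired output from the question typed in by hand.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'g2g', 'g2g', 'g2g', 'bar', 'bar', 'foo', 'bar'],
    'B': ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b'],
})

# cumcount numbers the rows of each group 0, 1, 2, ... in order of appearance,
# so adding 1 gives a 1-based occurrence number per value of 'B'.
df['freq_occurance'] = df.groupby('B').cumcount() + 1

expected = [1, 1, 2, 2, 3, 4, 3, 4, 5]
print(df['freq_occurance'].tolist() == expected)  # True
```

Since cumcount works on the groupby object directly, no column selection or lambda is needed.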
Upvotes: 0