user3701435

Reputation: 107

Count the frequency of values in one column of a pandas DataFrame and tag each row with its occurrence number

I want to count how often each value occurs in a specific column of a pandas DataFrame and then tag each row with that value's running occurrence number.

Most of the common solutions only show how to count the frequency of each value in a column, as in: count the frequency that a value occurs in a dataframe column

I have some basic code:

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'g2g', 'g2g', 'g2g',
                         'bar', 'bar', 'foo', 'bar'],
                   'B': ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b']})

print(df)

which outputs:

     A  B
0  foo  a
1  bar  b
2  g2g  a
3  g2g  b
4  g2g  b
5  bar  b
6  bar  a
7  foo  a
8  bar  b

Adding df['freq'] = df.groupby('B')['B'].transform('count') then gives:

     A  B  freq
0  foo  a     4
1  bar  b     5
2  g2g  a     4
3  g2g  b     5
4  g2g  b     5
5  bar  b     5
6  bar  a     4
7  foo  a     4
8  bar  b     5

What I want instead, after grouping by column 'B', is something like this:

     A  B  freq_occurance
0  foo  a               1
1  bar  b               1
2  g2g  a               2
3  g2g  b               2
4  g2g  b               3
5  bar  b               4
6  bar  a               3
7  foo  a               4
8  bar  b               5

That is, if the value 'a' in column 'B' has frequency 4, the first row where 'a' appears is tagged 1, the second row containing 'a' is tagged 2, and so on. The same logic applies to every unique value in column 'B'.
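To spell out the tagging rule, here is a rough plain-Python sketch of the logic, not a pandas solution. The helper name occurrence_numbers is made up for illustration; it keeps a running count per value and records that count for each row in order.

```python
from collections import defaultdict


def occurrence_numbers(values):
    """Return, for each item, how many times that item has been seen so far.

    Hypothetical helper used only to illustrate the desired tagging rule.
    """
    seen = defaultdict(int)  # running count per distinct value
    tags = []
    for value in values:
        seen[value] += 1          # first sighting becomes 1, second 2, ...
        tags.append(seen[value])
    return tags


# Column 'B' from the example DataFrame
b = ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b']
tags = occurrence_numbers(b)
print(tags)
```

The printed list should match the freq_occurance column shown above, row by row.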

Upvotes: 1

Views: 1080

Answers (2)

Allen Qin

Reputation: 19957

You can use transform, take each group's positional index (after reset_index) as the value, and then add one, since the new index starts from 0.

df['freq2'] = df.groupby('B')['B'].transform(lambda x: x.reset_index().index).add(1)

     A  B  freq  freq2
0  foo  a     4      1
1  bar  b     5      1
2  g2g  a     4      2
3  g2g  b     5      2
4  g2g  b     5      3
5  bar  b     5      4
6  bar  a     4      3
7  foo  a     4      4
8  bar  b     5      5

Upvotes: 1

Code Different

Reputation: 93191

cumcount is what you need:

df['freq_occurance'] = df.groupby('B').cumcount() + 1
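For completeness, here is a minimal end-to-end sketch using the DataFrame from the question. The only additions are the import and the result variable, which is just there to make the new column easy to inspect. GroupBy.cumcount numbers the rows within each group from 0 in their original order, hence the + 1.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'A': ['foo', 'bar', 'g2g', 'g2g', 'g2g', 'bar', 'bar', 'foo', 'bar'],
    'B': ['a', 'b', 'a', 'b', 'b', 'b', 'a', 'a', 'b'],
})

# cumcount() gives a 0-based running count per group of 'B';
# adding 1 makes the first occurrence of each value 1 instead of 0
df['freq_occurance'] = df.groupby('B').cumcount() + 1

result = df['freq_occurance'].tolist()
print(df)
```

This avoids the lambda and the per-group reset_index, so it is likely the simpler option when all you need is the running occurrence number.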

Upvotes: 0
