lucky1928

Reputation: 8841

count word frequency with groupby

I have a CSV file with a single column, tag:

tag
A
B
B
C
C
C
C

When I run groupby to count how often each tag occurs, the output has no frequency column:

#!/usr/bin/env python3
import pandas as pd

def count(fname):
    df = pd.read_csv(fname)
    print(df)
    dfg = df.groupby('tag').count().reset_index()
    print(dfg)
    return
count("save.txt")

The output has no frequency column:

  tag
0   A
1   B
2   B
3   C
4   C
5   C
6   C
  tag
0   A
1   B
2   C

Expected output:

  tag  freq
0   A  1
1   B  2
2   C  4

Upvotes: 0

Views: 134

Answers (3)

Nathan Furnal

Reputation: 2410

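You should use value_counts() rather than count(). Passing name= to reset_index() labels the counts column directly; renaming column 0 only works on older pandas, because pandas 2.0 and later label that column "count" instead:

df["tag"].value_counts().sort_index().rename_axis("tag").reset_index(name="freq")

outputs:

  tag  freq
0   A     1
1   B     2
2   C     4

To sort by frequency in descending order instead,

df["tag"].value_counts().rename_axis("tag").reset_index(name="freq").sort_values(
    by="freq", ascending=False
)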
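
For a version that starts from the CSV itself, here is a minimal sketch. It assumes the file has the same layout as in the question; io.StringIO stands in for save.txt, and tag_frequencies is a hypothetical helper name, not a pandas function.

```python
import io

import pandas as pd

# Stand-in for the question's save.txt: a header line, then one tag per row.
csv_text = "tag\nA\nB\nB\nC\nC\nC\nC\n"


def tag_frequencies(source):
    """Hypothetical helper: one row per tag, with its row count in 'freq'."""
    df = pd.read_csv(source)
    counts = df["tag"].value_counts()  # Series mapping tag -> number of rows
    # Name the index and the values explicitly so the column labels
    # do not depend on the pandas version.
    return counts.rename_axis("tag").reset_index(name="freq")


freq = tag_frequencies(io.StringIO(csv_text))
by_tag = freq.sort_values("tag", ignore_index=True)
by_freq = freq.sort_values("freq", ascending=False, ignore_index=True)
print(by_tag)
print(by_freq)
```

Passing a path such as "save.txt" instead of the StringIO object should behave the same way, since read_csv accepts both.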

Upvotes: 1

tylerjames

Reputation: 123

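You could create an additional column and then count it per tag. With a second column present, count() has something to count:

Input:

df['freq'] = 1
df = df.groupby('tag').count().reset_index()
df = df.sort_values('freq', ascending=False, ignore_index=True)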

Output:

    tag freq
0     C    4
1     B    2
2     A    1

Upvotes: 2

Simon

Reputation: 1211

Your attempt looks close to me. Per my comment, named aggregation gives the count its own column:

df = pd.DataFrame({'tag': ['A', 'B', 'B', 'C', 'C', 'C', 'C']})

df.groupby(['tag'], as_index=False).agg(freq=('tag', 'count'))
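
As for why the original attempt printed only the tag column: as far as I understand it, count() reports non-null values for the columns other than the grouping key, and this frame has none. The sketch below shows that on the same sample data, with size() as one alternative that counts rows per group; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"tag": ["A", "B", "B", "C", "C", "C", "C"]})

# count() works on the non-key columns; there are none here, so the
# result has three groups (rows) but zero columns.
empty = df.groupby("tag").count()
print(empty.shape)

# size() counts rows per group and does not need another column.
sizes = df.groupby("tag").size().reset_index(name="freq")
print(sizes)
```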

Upvotes: 2
