Calculate Frequency of item in list

Question

I want to calculate the frequency of accidents in every region for every year. How can I do that using Python?

file.csv

Region,Year
1,2003
1,2003
2,2008
2,2007
2,2007
3,2004
1,2004
1,2004
1,2004

I tried using Counter, but it works with only one column. For example, region 1 has 2 accidents in year 2003, so the results should be:

Region,Year,freq
1,2003,2
1,2003,2
2,2008,1
2,2007,2
2,2007,2
3,2004,1
1,2004,3
1,2004,3
1,2004,3

I tried doing it this way, but it doesn't seem to be the right way:

from collections import Counter
import pandas

data = pandas.read_csv("file.csv")
# Counts occurrences per year only; the region is ignored.
freq_year = Counter(data["Year"].values)
data["freq"] = data["Year"].apply(lambda x: freq_year[x])

I am thinking of using groupby. Do you have any idea how to do this?

Blaszard · Accepted Answer

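There might be a better way, but I first add a dummy column and then calculate the frequency by summing it within each group:

# Every row counts as one accident.
df["freq"] = 1
# Sum the dummy column within each (Year, Region) group and broadcast the total back to every row.
df["freq"] = df.groupby(["Year", "Region"])["freq"].transform("sum")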

This returns the following df:

   Region  Year  freq
0       1  2003     2
1       1  2003     2
2       2  2008     1
3       2  2007     2
4       2  2007     2
5       3  2004     1
6       1  2004     3
7       1  2004     3
8       1  2004     3
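
As for the "better way" mentioned above, here is a minimal sketch that skips the dummy column, assuming a reasonably recent pandas. The names csv_text, summary and pair_counts are made up for illustration, and the CSV is inlined from the question so the snippet runs on its own. It also shows that Counter can handle two columns if each row is first turned into a (Region, Year) tuple.

```python
import io
from collections import Counter

import pandas as pd

# Inline copy of file.csv from the question (assumption: the real file has the same layout).
csv_text = """Region,Year
1,2003
1,2003
2,2008
2,2007
2,2007
3,2004
1,2004
1,2004
1,2004
"""
df = pd.read_csv(io.StringIO(csv_text))

# Per-row frequency: the size of each (Region, Year) group, broadcast back to its rows.
df["freq"] = df.groupby(["Region", "Year"])["Year"].transform("size")

# Alternative shape: one row per (Region, Year) pair instead of one per accident.
summary = df.groupby(["Region", "Year"]).size().reset_index(name="freq")

# Plain-Python variant: Counter works on two columns once rows are zipped into tuples.
pair_counts = Counter(zip(df["Region"], df["Year"]))

print(df)
print(summary)
```

The transform variant keeps the original row order and row count, which matches the output requested in the question; the summary variant is likely more convenient if only one line per region and year is needed.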
