Carlo Silanu
Carlo Silanu

Reputation: 79

How to find the average movie rating per year?

Hi so I am currently working on IMDB movie metadata dataframe that looks like this.

enter image description here

I need help finding average imdb rating per year. imdb_score per title_year.

Now I have counted the number of movies per year and dropped years with less than 10 movies for it to be relevant.

I did as follows

years = df['title_year'].value_counts()
years

and then

years2 = years[years >= 10]
years2

which resulted in

2009.0    260
2014.0    252
2006.0    239
2013.0    237
2010.0    230
2015.0    226
2011.0    225
2008.0    225
2012.0    221
2005.0    221
2004.0    214
2002.0    209
2007.0    204
2001.0    188
2000.0    171
2003.0    169
1999.0    168
1998.0    134
1997.0    118
2016.0    106
1996.0     99
1995.0     70
1994.0     54
1993.0     48
1992.0     34
1981.0     33
1989.0     33
1987.0     32
1991.0     31
1988.0     31
1984.0     31
1982.0     30
1990.0     30
1985.0     29
1986.0     26
1980.0     24
1983.0     22
1978.0     16
1977.0     16
1979.0     16
1970.0     12
1971.0     11
1968.0     11
1969.0     10
1964.0     10
1976.0     10
Name: title_year, dtype: int64

Now I am confused how do you find the average imdb rating per year because I would like to plot a graph afterwards. Can anybody help me?

Upvotes: 0

Views: 1210

Answers (1)

FBruzzesi
FBruzzesi

Reputation: 6495

You can use pandas.DataFrame.groupby:

year_avg_score = df.loc[df['title_year'].isin(year2.index)].groupby('title_year')['imdb_score'].mean()

Step by step:

  • df.loc[df['title_year'].isin(year2.index)] filters only the years with more than 10 movies, which you already computed.
  • Then we groupby title_year and select the imdb_score column.
  • Finally take the imdb_score average.

The resulting dataframe year_avg_score will have the year as index and the average score as column.

Upvotes: 1

Related Questions