amber_the_debutant
amber_the_debutant

Reputation: 27

SQL: How can I group by consecutive years?

I have a table 'movies' with the field 'Year', 'movieId', 'movie_running_time', 'Director'...etc.

I want to select the data that show the average running time of movies per group of 4 consecutive years (ie for the years 2001-2004, 2005-2008,...). The result displays the MIN and MAX year for each group, and the avg running time converted into hours and minutes.

The group of years is expected to exclude Year 0 and start at the minimum year 2001 (2001-2004, 2005-2008,...)

Table: movies

movieId Year movie_running_time (minute)
1 2002 88
2 2004 118
3 2001 86
4 2008 110
5 2008 113
6 2008 89
7 2007 119
8 2006 103
9 2001 86
10 2002 117
11 2006 101
12 0 0
13 2007 111
14 2001 105
15 2005 134
16 2003 86
17 0 0
18 2003 93
19 2005 119
20 2003 110
21 2004 99
... ... ...

Expected result:

Year(MIN) Year(MAX) AVG(movie_running_time)
2001 2004 1:38
2005 2008 1:51
... ... ...

Upvotes: 0

Views: 139

Answers (1)

EdmCoff
EdmCoff

Reputation: 3576

I suspect you are probably looking for something like

SELECT 
 min(year) minyear, 
 max(year) maxyear, 
 avg(movie_running_time)/60 avg_hours, 
 avg(movie_running_time)%60 avg_remaining_minutes
FROM movie
GROUP BY (year-2001)/4

This assumes your years and running times are integers (and that running time is in minutes), but you can adjust depending on your actual data and table structure.

I'm also assuming that it's okay to hardcode 2001 in your query (but you could get it from elsewhere if you want).

The idea of the group by is that, for integer division, you will end up with a floor. So 0/4, 1/4, 2/4, 3/4 are all 0 (your first group), 4/4, 5/4, 6/4, 7/4 are all 1 (your second group) and so on.

The idea of the integer division and modulus in the select statement is that once you have your average (say 70 minutes), 70/60 is 1 hour (because integer division uses a floor), and 70%60 is the remaining 10 minutes.

Here's a fiddle of the basic idea.

Upvotes: 0

Related Questions