MySQL - group by and count - best query

Question

We have a statistics database whose results we would like to group. Every entry has a timestamp column, 'tstarted'.

We would like to group by quarter hour of the day (15-minute buckets). For each quarter hour, we would like to know the number of days on which that quarter hour has > 0 results.

We were able to solve this with a subquery:

select quarter, sum(q), count(quarter), sum(q) / count(quarter) as average
from (
    select SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900) as quarter, sum(qdelivered) as q
    from statistics 
    where stat_field = 1
    group by SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900), date(tstarted)
    order by SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900) asc
) as sub
group by quarter

My question: is there a more efficient way to retrieve this result (e.g. join or other way)?

spencer7593 · Accepted Answer

Efficiency could be improved by eliminating the inline view (derived table aliased as sub), and doing all the work in a single query. (This is because of the way that MySQL processes the inline view, creating and populating a temporary MyISAM table.)

~~I don't understand why the expression date(tstarted) needs to be included in the GROUP BY clause; I don't see that removing that would change the result set returned by the query.~~

I do now see the effect of including the date(tstarted) in the GROUP BY of the inline view query.
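
To make that effect concrete, here is a small sketch with made-up data. The table name statistics_demo, the column types, and the rows are assumptions for illustration only; only the column names come from the question. Grouping the inline view by both the quarter hour and the date yields one row per (quarter, day), which is why COUNT(quarter) in the outer query ends up counting days.

```sql
-- Hypothetical demo table; name, types, and rows are assumptions.
CREATE TEMPORARY TABLE statistics_demo (
    tstarted   DATETIME NOT NULL,
    qdelivered INT      NOT NULL,
    stat_field TINYINT  NOT NULL
);

INSERT INTO statistics_demo (tstarted, qdelivered, stat_field) VALUES
    ('2024-01-01 08:03:00', 5, 1),  -- quarter 08:00:00, first day
    ('2024-01-01 08:10:00', 7, 1),  -- quarter 08:00:00, first day again
    ('2024-01-02 08:14:00', 4, 1),  -- quarter 08:00:00, second day
    ('2024-01-02 09:20:00', 6, 1);  -- quarter 09:15:00, second day

-- Same shape as the inline view: one row per (quarter, day).
-- With this data the 08:00:00 quarter produces two rows (q = 12 and q = 4)
-- and the 09:15:00 quarter produces one row (q = 6).
SELECT SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900) AS `quarter`
     , DATE(tstarted)                                     AS `day`
     , SUM(qdelivered)                                    AS `q`
  FROM statistics_demo
 WHERE stat_field = 1
 GROUP BY SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900), DATE(tstarted);
```

If DATE(tstarted) were dropped from that GROUP BY, each quarter would collapse to a single row, and COUNT(quarter) in the outer query would always be 1.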

I think this query returns the same result as the original:

SELECT SEC_TO_TIME((TIME_TO_SEC(s.tstarted) DIV 900) * 900) AS `quarter`
     , SUM(s.qdelivered)                                    AS `q`
     , COUNT(DISTINCT DATE(s.tstarted))                     AS `day_count`
     , SUM(s.qdelivered) / COUNT(DISTINCT DATE(s.tstarted)) AS `average`
  FROM statistics s
 WHERE s.stat_field = 1 
 GROUP BY SEC_TO_TIME((TIME_TO_SEC(s.tstarted) DIV 900) * 900)

This should be more efficient since it avoids materializing an intermediate derived table.
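
One way to check this on your own server is to compare the plans with EXPLAIN. This is only a sketch: the output depends on the MySQL version, the table size, and the available indexes, none of which are known from the question. A materialized inline view usually shows up as a row with select_type DERIVED.

```sql
-- Plan for the original form (inline view aliased as sub).
EXPLAIN
SELECT `quarter`, SUM(q), COUNT(`quarter`), SUM(q) / COUNT(`quarter`) AS average
  FROM ( SELECT SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900) AS `quarter`
              , SUM(qdelivered) AS q
           FROM statistics
          WHERE stat_field = 1
          GROUP BY SEC_TO_TIME((TIME_TO_SEC(tstarted) DIV 900) * 900), DATE(tstarted)
       ) AS sub
 GROUP BY `quarter`;

-- Plan for the single-query form.
EXPLAIN
SELECT SEC_TO_TIME((TIME_TO_SEC(s.tstarted) DIV 900) * 900) AS `quarter`
     , SUM(s.qdelivered)                                    AS `q`
     , COUNT(DISTINCT DATE(s.tstarted))                     AS `day_count`
  FROM statistics s
 WHERE s.stat_field = 1
 GROUP BY SEC_TO_TIME((TIME_TO_SEC(s.tstarted) DIV 900) * 900);
```

Timing both statements against real data is still the most reliable comparison, since the plan alone does not show how large the intermediate result gets.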

Your question said you wanted a "day count"; that sounds like you want a count of each day that had a row within a particular quarter hour.

That is what this aggregate expression in the SELECT list returns:

     , COUNT(DISTINCT DATE(s.tstarted))                     AS `day_count`
