Thomas Zoechling

Reputation: 34253

Getting rows with the highest SELECT COUNT from groups within a resultset

I have a SQLite Database that contains parsed Apache log lines.

A simplified version of the DB's only table (accesses) looks like this:

+--------+---------+
|referrer|datestamp|
+--------+---------+
|xy.de   | 20170414|
|ab.at   | 20170414|
|xy.de   | 20170414|
|xy.de   | 20170414|
|12.com  | 20170413|
|12.com  | 20170413|
|xy.de   | 20170413|
|12.com  | 20170413|
|12.com  | 20170412|
|xy.de   | 20170412|
|12.com  | 20170412|
|12.com  | 20170412|
|ab.at   | 20170412|
|ab.at   | 20170412|
|12.com  | 20170412|
+--------+---------+

I am trying to retrieve the top referrer for each day by performing a subquery that does a SELECT COUNT on the referrer. Afterwards, I select the entries from that subquery that have the highest count:

SELECT datestamp, referrer, COUNT(*)
FROM accesses WHERE datestamp BETWEEN '20170414' AND '20170414'
GROUP BY referrer
HAVING COUNT(*) = (select MAX(anz) 
                   FROM (SELECT COUNT(*) anz 
                         FROM accesses
                         WHERE datestamp BETWEEN '20170414' AND '20170414'
                         GROUP BY referrer
                        )
                  );

The above approach works as long as I perform the query for a single date, but it falls apart as soon as I query for date ranges. How can I achieve grouping by date? I am also only interested in the referrer with the highest count.

Upvotes: 0

Views: 44

Answers (2)

Gordon Linoff

Reputation: 1269763

If you want all the days combined with a single best referrer, then:

SELECT referrer, COUNT(*) as anz 
FROM accesses
WHERE datestamp BETWEEN '20170414' AND '20170414'
GROUP BY referrer
ORDER BY COUNT(*) DESC
LIMIT 1;

I think you might want this information broken out by day. If so, a correlated subquery helps -- and a CTE as well:

WITH dr as (
      SELECT a.datestamp, a.referrer, COUNT(*) as cnt
      FROM accesses a
      WHERE datestamp BETWEEN '20170414' AND '20170414'
      GROUP BY a.referrer, a.datestamp
     )
SELECT dr.*
FROM dr
WHERE dr.cnt = (SELECT MAX(dr2.cnt)
                FROM dr dr2
                WHERE dr2.datestamp = dr.datestamp
               );
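
To check the per-day behaviour over a range, here is a minimal sketch that runs the CTE query above against the sample rows from the question, using Python's built-in sqlite3 module. It assumes datestamp is stored as TEXT (the question does not state the column type); the helper name top_referrers and the added ORDER BY are illustrative choices, not part of the original answer.

```python
import sqlite3

# Sample rows copied from the question's table: (referrer, datestamp).
ROWS = [
    ("xy.de", "20170414"), ("ab.at", "20170414"), ("xy.de", "20170414"),
    ("xy.de", "20170414"), ("12.com", "20170413"), ("12.com", "20170413"),
    ("xy.de", "20170413"), ("12.com", "20170413"), ("12.com", "20170412"),
    ("xy.de", "20170412"), ("12.com", "20170412"), ("12.com", "20170412"),
    ("ab.at", "20170412"), ("ab.at", "20170412"), ("12.com", "20170412"),
]

# The CTE counts hits per (day, referrer); the correlated subquery keeps
# only the rows whose count equals that day's maximum.
QUERY = """
WITH dr AS (
    SELECT a.datestamp, a.referrer, COUNT(*) AS cnt
    FROM accesses a
    WHERE a.datestamp BETWEEN ? AND ?
    GROUP BY a.datestamp, a.referrer
)
SELECT dr.datestamp, dr.referrer, dr.cnt
FROM dr
WHERE dr.cnt = (SELECT MAX(dr2.cnt)
                FROM dr dr2
                WHERE dr2.datestamp = dr.datestamp)
ORDER BY dr.datestamp
"""


def top_referrers(conn, first_day, last_day):
    """Return (datestamp, referrer, cnt) rows for the top referrer(s) per day."""
    return conn.execute(QUERY, (first_day, last_day)).fetchall()


conn = sqlite3.connect(":memory:")
# Assumption: datestamp is TEXT, matching the quoted literals in the question.
conn.execute("CREATE TABLE accesses (referrer TEXT, datestamp TEXT)")
conn.executemany("INSERT INTO accesses VALUES (?, ?)", ROWS)

result = top_referrers(conn, "20170412", "20170414")
for row in result:
    print(row)
```

Because the filter compares against MAX(cnt) with =, a day on which two referrers tie for first place yields one row per tied referrer rather than a single row.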

Upvotes: 2

Charles Bretana

Reputation: 146499

Just group by a date range. As an example,

SELECT referrer, 
   case when datestamp Between '20170101' AND '20170131' then 1
        when datestamp Between '20170201' AND '20170228' then 2
        when datestamp Between '20170301' AND '20170331' then 3
        else 4 end DateRange,
   COUNT(*) as anz 
FROM accesses
GROUP BY referrer,
   case when datestamp Between '20170101' AND '20170131' then 1
        when datestamp Between '20170201' AND '20170228' then 2
        when datestamp Between '20170301' AND '20170331' then 3
        else 4 end
ORDER BY COUNT(*) DESC
LIMIT 1;

You can put any legal SQL expression in a GROUP BY clause. The query processor then creates one bucket per distinct value of that expression and aggregates the raw rows into those buckets.
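
As a small illustration of grouping by an expression, the sketch below buckets the question's sample rows by month using substr(datestamp, 1, 6) instead of a hand-written CASE. This is a swapped-in variant, not the query from this answer, and it assumes datestamp is stored as TEXT in YYYYMMDD form; it is run through Python's built-in sqlite3 module so it is self-contained.

```python
import sqlite3

# Sample rows copied from the question's table: (referrer, datestamp).
ROWS = [
    ("xy.de", "20170414"), ("ab.at", "20170414"), ("xy.de", "20170414"),
    ("xy.de", "20170414"), ("12.com", "20170413"), ("12.com", "20170413"),
    ("xy.de", "20170413"), ("12.com", "20170413"), ("12.com", "20170412"),
    ("xy.de", "20170412"), ("12.com", "20170412"), ("12.com", "20170412"),
    ("ab.at", "20170412"), ("ab.at", "20170412"), ("12.com", "20170412"),
]

# GROUP BY on an expression: the first six characters of the datestamp
# (YYYYMM) act as the bucket key, so each month gets its own bucket.
MONTHLY = """
SELECT substr(datestamp, 1, 6) AS month, referrer, COUNT(*) AS anz
FROM accesses
GROUP BY substr(datestamp, 1, 6), referrer
ORDER BY month, anz DESC
"""

conn = sqlite3.connect(":memory:")
# Assumption: datestamp is TEXT in YYYYMMDD form.
conn.execute("CREATE TABLE accesses (referrer TEXT, datestamp TEXT)")
conn.executemany("INSERT INTO accesses VALUES (?, ?)", ROWS)

monthly = conn.execute(MONTHLY).fetchall()
for row in monthly:
    print(row)
```

All sample rows fall in April 2017, so there is only one month bucket here; with data spanning several months, each month would get its own set of per-referrer counts.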

Upvotes: 1
