AMarch
AMarch

Reputation: 153

mySQL, grouping data by date

Hello: there seem to be a lot of variations of this question already on SO, but I haven't been able to adapt any of them for my needs (not in a way that works, anyway).

I have a table like this:

CREATE TABLE `AdServing`
(
  `imgID`           int UNSIGNED NOT NULL,
  `ServedTimeStamp` timestamp NOT NULL,
  `UserID`          int(10) UNSIGNED NOT NULL,
  `ServedIP`        VARBINARY(16) NULL,
  `Clicked`         Boolean NULL,
)

Some sample data:

1   2015-08-04 14:59:48     1   192.168.0.4     1
1   2015-08-14 23:06:25     1   192.168.0.3     1
1   2015-08-16 15:00:09     1   192.168.0.3     1
1   2015-08-16 15:01:15     1   192.168.0.3     1
1   2015-08-16 15:01:43     1   192.168.0.3     1
1   2015-08-17 15:00:51     1   192.168.0.4     1
2   2015-07-26 17:55:21     1   192.168.0.2     0
3   2015-07-26 17:57:15     1   192.168.0.3     1
4   2015-07-26 17:57:15     1   192.168.0.4     1

I want to create a chart showing the number of ads served over time - from the date of the very first display until the date of the most recent display. Something like this:

Chart

For some users this might only be a week. For others it could be 4 or 5 months. For another user this might span a time period of a year or two.

I figure the chart will have perhaps 6 or 7 data points along the time axis regardless of the time period - so the intervals might be 1 day for a user with only a week's worth of data, or 1 week for a user with a couple of months of data... I'm not sure about the best strategy for determining this - what about a user with only a month of data, for example, how would you partition the days into six/seven groups in a way that would make sense?

But the main problem at the moment is querying the database to extract the table values into these groupings. What I'd like to get out is something like this:

DateGrouping  |  Servings
-------------------------
1             |  2
2             |  5
3             |  3
4             |  7
5             |  2
6             |  4
7             |  3

...where 'DateGrouping' would be whatever interval is appropriate for the data for the current user, i.e., for a week of values, 1-7 would represent single days. And 'Servings' would be the number of ads served for each interval. Again, assuming each interval is a day here, that would be the number of ads served during each recorded 24 hour period (which may or may not be contiguous).

I tried to adapt the query from this SO answer: https://stackoverflow.com/a/30335592/2577646

And came up with this:

SELECT x.DateRange, COALESCE(TotalWithinRange, 0) AS TotalWithinRange
FROM (
  SELECT '1' AS DateRange 
  UNION SELECT '2'
  UNION SELECT '3'
  UNION SELECT '4'
  UNION SELECT '5'
  UNION SELECT '6'
  UNION SELECT '7') x
LEFT JOIN (  
   SELECT
      CASE when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 1 DAY AND Max(`ServedTimeStamp`)) then '1'
       when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 2 DAY AND Max(`ServedTimeStamp`) - INTERVAL 1 DAY) then '2'
       when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 3 DAY AND Max(`ServedTimeStamp`) - INTERVAL 2 DAY) then '3'
       when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 4 DAY AND Max(`ServedTimeStamp`) - INTERVAL 3 DAY) then '4'
       when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 5 DAY AND Max(`ServedTimeStamp`) - INTERVAL 4 DAY) then '5'
       when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 6 DAY AND Max(`ServedTimeStamp`) - INTERVAL 5 DAY) then '6'
       when (`ServedTimeStamp` BETWEEN Max(`ServedTimeStamp`) - INTERVAL 7 DAY AND Max(`ServedTimeStamp`) - INTERVAL 6 DAY) then '7'
  END AS DateRange,
  COUNT(*) as TotalWithinRange
FROM `AdServing`
WHERE `UserID` = 123
GROUP BY 1 ) y ON x.DateRange = y.DateRange

But mySQL complains: "Can't group on 'DateRange'" - which I don't entirely understand. I'm very much a novice with SQL. And most other things.

Anyway, I'd be very grateful for any help. And apologies for the long-winded question.

Upvotes: 1

Views: 142

Answers (1)

Uueerdo
Uueerdo

Reputation: 15961

A generic version, since I am kinda swamped today:

SELECT MIN(dtField), MAX(dtField) INTO @first, @last 
FROM theTable 
WHERE [conditions]
;

SET @interval := DATEDIFF(@last, @first) / [number of datapoints you want];

SELECT DATEDIFF(dtField, @first) DIV @interval AS dateRange
, COUNT(1) AS TotalWithinRange
FROM theTable
WHERE [same conditions as the first query]
GROUP BY dateRange
;

Edit 1: Oops, missed the divisor on interval.

Edit 2: Note, this assumes at least 7 days. If you want better granualarity, you could use additional functions to narrow it down to seconds precision.

Upvotes: 1

Related Questions