Reputation: 44742
Every time a logged in user visits the website their data is put into a table containing the userId and date (either one or zero row per user per day):
444631 2011-11-07
444631 2011-11-06
444631 2011-11-05
444631 2011-11-04
444631 2011-11-02
444631 2011-11-01
I need to have ready access to the number of consecutive visits when I pull the user data from the main user table.. In the case for this user, it would be 4.
Currently I'm doing this through a denormalized consecutivevisits
counter in the main user table, however for unknown reasons it sometimes resets.. I want to try an approach that uses exclusively the data in the table above.
What's the best SQL query to get that number (4 in the example above)? There are users who have hundreds of visits, we have millions of registered users and hits per day.
EDIT: As per the comments below I'm posting the code I currently use to do this; it however has the problem that it sometimes resets for no reason and it also reset it for everyone during the weekend, most likely because of the DST change.
// Called every page load for logged in users
public static function OnVisit($user)
{
$lastVisit = $user->GetLastVisit(); /* Timestamp; db server is on the same timezone as www server */
if(!$lastVisit)
$delta = 2;
else
{
$today = date('Y/m/d');
if(date('Y/m/d', $lastVisit) == $today)
$delta = 0;
else if(date('Y/m/d', $lastVisit + (24 * 60 * 60)) == $today)
$delta = 1;
else
$delta = 2;
}
if(!$delta)
return;
$visits = $user->GetConsecutiveVisits();
$userId = $user->GetId();
/* NOTE: t_dailyvisit is the table I pasted above. The table is unused;
* I added it only to ensure that the counter sometimes really resets
* even if the user visits the website, and I could confirm that. */
q_Query("INSERT IGNORE INTO `t_dailyvisit` (`user`, `date`) VALUES ($userId, CURDATE())", DB_DATABASE_COMMON);
/* User skipped 1 or more days.. */
if($delta > 1)
$visits = 1;
else if($delta == 1)
$visits += 1;
q_Query("UPDATE `t_user` SET `consecutivevisits` = $visits, `lastvisit` = CURDATE(), `nvotesday` = 0 WHERE `id` = $userId", DB_DATABASE_COMMON);
$user->ForceCacheExpire();
}
Upvotes: 1
Views: 1632
Reputation: 656461
I missed the mysql tag and wrote up this solution. Sadly, this does not work in MySQL as it does not support window functions.
I post it anyway, as I put some effort into it. Tested with PostgreSQL. Would work similarly with Oracle or SQL Server (or any other decent RDBMS that supports window functions).
CREATE TEMP TABLE v(id int, visit date);
INSERT INTO v VALUES
(444631, '2011-11-07')
,(444631, '2011-11-06')
,(444631, '2011-11-05')
,(444631, '2011-11-04')
,(444631, '2011-11-02')
,(444631, '2011-11-01')
,(444632, '2011-12-02')
,(444632, '2011-12-03')
,(444632, '2011-12-05');
-- add 1 to "difference" to get number of days of the longest period
SELECT id, max(dur) + 1 as max_consecutive_days
FROM (
-- calculate date difference of min and max in the group
SELECT id, grp, max(visit) - min(visit) as dur
FROM (
-- consecutive days end up in a group
SELECT *, sum(step) OVER (ORDER BY id, rn) AS grp
FROM (
-- step up at the start of a new group of days
SELECT id
,row_number() OVER w AS rn
,visit
,CASE WHEN COALESCE(visit - lag(visit) OVER w, 1) = 1
THEN 0 ELSE 1 END AS step
FROM v
WINDOW w AS (PARTITION BY id ORDER BY visit)
ORDER BY 1,2
) x
) y
GROUP BY 1,2
) z
GROUP BY 1
ORDER BY 1
LIMIT 1;
Output:
id | max_consecutive_days
--------+----------------------
444631 | 4
I later found an even better way. grp
numbers are not continuous (but continuously rising). Doesn't matter, since those are just a mean to an end:
SELECT id, max(dur) + 1 AS max_consecutive_days
FROM (
SELECT id, grp, max(visit) - min(visit) AS dur
FROM (
-- subtract an integer representing the number of day from the row_number()
-- creates a "group number" (grp) for consecutive days
SELECT id
,EXTRACT(epoch from visit)::int / 86400
- row_number() OVER (PARTITION BY id ORDER BY visit) AS grp
,visit
FROM v
ORDER BY 1,2
) x
GROUP BY 1,2
) y
GROUP BY 1
ORDER BY 1
LIMIT 1;
Upvotes: 3
Reputation: 12458
If it is not necessary to have a log of every day the user was logged on to the webiste and you only want to know the consecutive days he was logged on, I would prefer this way:
Chose 3 columns: LastVisit (Date), ConsecutiveDays (int) and User.
On log-in you check the entry for the user, determine if last visit was "Today - 1", then add 1 to the columns ConsecutiveDays and store "Today" in column LastVisit. If last vist is greater than "Today - 1" then store 1 in ConsecutiveDays.
HTH
Upvotes: 1