Reputation: 649
Assume a simple table as follows containing status events for two users. A status_id of 1 makes them 'active', anything else makes them de-active. I need to find out all those users that became inactive within one year of, for example, 2015-05-01 (not including that date).
CREATE TABLE user_status(
user_id INT,
status_id INT,
date_assigned VARCHAR(10) );
INSERT INTO user_status( user_id, status_id, date_assigned)
VALUES
(1234, 1, '2015-01-01'), -- 1234 becomes active (status id = 1)
(1234, 2, '2015-07-01'), -- 1234 de-activated for reason 2
(5678, 1, '2015-02-01'), -- 5678 becomes active (status id = 1)
(5678, 3, '2015-04-01'), -- 5678 de-activated for reason 3
(5678, 5, '2015-06-01'); -- 5678 de-activated for reason 5
Using the query
SELECT t1.*
FROM user_status t1
WHERE t1.date_assigned = (SELECT MIN(t2.date_assigned) -- the first occurrence
FROM user_status t2
WHERE t2.user_id = t1.user_id -- for this user
AND t2.status_id <> 1 -- where status not active
AND t2.date_assigned BETWEEN -- within 1 yr of given date
'2015-05-01' + INTERVAL 1 DAY -- (not including that date)
AND
'2015-05-01' + INTERVAL 1 YEAR
)
I can get the result
user_id status_id date_assigned
1234 2 2015-07-01
5678 5 2015-06-01
This is sort of right but user 5678 should not be there because although they had an inactive event within the date range, they were already inactive before the desired date range began and so did not become inactive within that range.
I need to add a bit to my query along the lines of 'only show me those users who had an inactive event and where the previous status_id for that user was 1, ie they were active at the time the inactive event happened.
Can anyone help me to get the syntax correct?
See SQL fiddle
Upvotes: 1
Views: 1637
Reputation: 461
Self join solution : finding minimum ( the first time the status changed) within your date criteria:
select a.user_id,b.status_id,max(b.date_assigned)
from user_status a
inner join user_status b
on a.user_id=b.user_id
and a.date_assigned <b.date_assigned
where b.status_id >1 and a.status_id=1
group by a.user_id,b.status_id
having max(b.date_assigned)> '2015-05-01'
and max(b.date_assigned) <='2016-05-01'
Upvotes: 1
Reputation: 72175
You can add NOT EXISTS
to your query:
SELECT t1.*
FROM user_status t1
WHERE t1.date_assigned = (SELECT MIN(t2.date_assigned) -- the first occurance
FROM user_status t2
WHERE t2.user_id = t1.user_id -- for this user
AND t2.status_id <> 1 -- where status not active
AND t2.date_assigned BETWEEN -- within 1 yr of given date
'2015-05-01' + INTERVAL 1 DAY -- (not including that date)
AND
'2015-05-01' + INTERVAL 1 YEAR
)
AND NOT EXISTS (SELECT 1 -- such a record should not exist
FROM user_status t3
WHERE t3.user_id = t1.user_id -- for this user
AND t3.status_id <> 1 -- where status is not active
AND t3.date_assigned < -- before the examined period
'2015-05-01' + INTERVAL 1 DAY )
Edit:
You can use the following query that also considers the case of having multiple activation dates:
SELECT *
FROM user_status
WHERE (user_id, date_assigned) IN (
-- get last de-activation date
SELECT t1.user_id, MAX(t1.date_assigned)
FROM user_status AS t1
JOIN (
-- get last activation date
SELECT user_id, MAX(date_assigned) AS activation_date
FROM user_status
WHERE status_id = 1
GROUP BY user_id
) AS t2 ON t1.user_id = t2.user_id AND t1.date_assigned > t2.activation_date
GROUP BY user_id
HAVING MAX(date_assigned) BETWEEN '2015-05-01' + INTERVAL 1 DAY AND '2015-05-01' + INTERVAL 1 YEAR AND
MIN(date_assigned) > '2015-05-01' + INTERVAL 1 DAY)
Upvotes: 1