Reputation: 1455
Sorry for the confusing title, but it's the best way to explain it. This is not a usual "most recent from group" problem and I haven't been able to find anything similar on the web.
I have a status table that tracks what people are doing at various work sites. It contains records that link people, status and location.
ID, start_date, person_ID, location_ID, status
1, 2014-10-12, 1, 1, job a
2, 2014-10-13, 2, 2, job b
3, 2014-10-15, 1, 3, job c
4, 2014-10-21, 1, 3, job d
5, 2014-10-22, 2, 4, job a
6, 2014-10-26, 2, 2, job d
I need to be able to determine how long each person as been at the current site - I'm hoping to get results like this:
person_ID, location_ID, since
1, 3, 2014-10-15
2, 2, 2014-10-26
Getting when they started the current job is relatively easy by joining the max(start_date), but I need the min(start_date) from the jobs done at the most recent location.
I have been trying to join the min(start_date) within the records that match the current location (from the most recent record), and that works great until I have a person (like person 2) who has multiple visits to the current location... you can see in my desired results that I want the 10-26 date, not the 10-13 which is the first time they were at the site.
I need some method for matching the the job records for a given person, and then iterating back until the location doesn't match. I'm figuring there has to be some way to do this with some sub-queries and some clever joins, but I haven't been able to find it yet, so I would appreciate some help.
Upvotes: 1
Views: 596
Reputation: 181077
If I understand what you're asking correctly, you could use EXISTS
to eliminate all but the most recent locations per person, and get the min date from the resulting rows.
SELECT person_id, location_id, MIN(start_date) since
FROM status s
WHERE NOT EXISTS (
SELECT 1 FROM status
WHERE s.person_id = person_id
AND s.location_id <> location_id
AND s.start_date < start_date)
GROUP BY person_id
Basically, it eliminates all locations and times where the same person has visited another location more recently. For example;
1, 2014-10-12, 1, 1, job a
...is eliminated since person 1 has visited location 3 more recently, while;
3, 2014-10-15, 1, 3, job c
...is kept since the same person has only visited the same location more recently.
It then just picks the least recent time per person. Since only the rows from the last location are kept, it will be the least recent time from the most recent location.
Upvotes: 1
Reputation: 109613
Quite simple actually.
SELECT g.person_ID,
(SELECT l.location_ID
FROM status l
WHERE l.person_ID = g.person_ID
AND l.start_date = MAX(g.start_date)) AS location,
MAX(g.start_date) AS since
FROM status g
GROUP BY g.person_ID
This uses a grouping on person_ID, and uses a SELECT for the location column expression.
The sole question is whether you meant MIN i.o. MAX as in your example you yield the youngest date, not the oldest.
Upvotes: -1
Reputation: 1270873
I think the easiest way is with variables to keep track of the information you need:
select person_id, location_id, min(start_date) as since
from (select s.*,
(@rn := if(@p <> person_id, if(@p:=person_id, 1, 1),
if(@l = location_id, @rn,
if(@l:=location_d, @rn + 1, @rn + 1)
)
)
) as location_counter
from status s cross join
(select @p := 0, @l := 0, @rn := 0) vars
order by person_id, start_date desc
) s
where location_counter = 1
group by person_id, location_id;
The weird logic with the variables is (trying to) enumerate the locations for each person. It should be incrementing @rn
only when the location changes and resetting the value to 1
for a new person.
Upvotes: 1