whiteatom
whiteatom

Reputation: 1455

MySQL: get oldest record from most recent group

Sorry for the confusing title, but it's the best way to explain it. This is not a usual "most recent from group" problem and I haven't been able to find anything similar on the web.

I have a status table that tracks what people are doing at various work sites. It contains records that link people, status and location.

ID, start_date, person_ID, location_ID, status
1,  2014-10-12, 1,         1,           job a
2,  2014-10-13, 2,         2,           job b
3,  2014-10-15, 1,         3,           job c
4,  2014-10-21, 1,         3,           job d
5,  2014-10-22, 2,         4,           job a
6,  2014-10-26, 2,         2,           job d

I need to be able to determine how long each person as been at the current site - I'm hoping to get results like this:

person_ID, location_ID, since
1,         3,           2014-10-15
2,         2,           2014-10-26

Getting when they started the current job is relatively easy by joining the max(start_date), but I need the min(start_date) from the jobs done at the most recent location.

I have been trying to join the min(start_date) within the records that match the current location (from the most recent record), and that works great until I have a person (like person 2) who has multiple visits to the current location... you can see in my desired results that I want the 10-26 date, not the 10-13 which is the first time they were at the site.

I need some method for matching the the job records for a given person, and then iterating back until the location doesn't match. I'm figuring there has to be some way to do this with some sub-queries and some clever joins, but I haven't been able to find it yet, so I would appreciate some help.

Upvotes: 1

Views: 596

Answers (3)

Joachim Isaksson
Joachim Isaksson

Reputation: 181077

If I understand what you're asking correctly, you could use EXISTS to eliminate all but the most recent locations per person, and get the min date from the resulting rows.

SELECT person_id, location_id, MIN(start_date) since
FROM status s
WHERE NOT EXISTS (
  SELECT 1 FROM status 
  WHERE s.person_id = person_id 
    AND s.location_id <> location_id
    AND s.start_date < start_date)
GROUP BY person_id

An SQLfiddle to test with.

Basically, it eliminates all locations and times where the same person has visited another location more recently. For example;

1,  2014-10-12, 1,         1,           job a

...is eliminated since person 1 has visited location 3 more recently, while;

3,  2014-10-15, 1,         3,           job c

...is kept since the same person has only visited the same location more recently.

It then just picks the least recent time per person. Since only the rows from the last location are kept, it will be the least recent time from the most recent location.

Upvotes: 1

Joop Eggen
Joop Eggen

Reputation: 109613

Quite simple actually.

SELECT g.person_ID,
  (SELECT l.location_ID
   FROM status l
   WHERE l.person_ID = g.person_ID
   AND l.start_date = MAX(g.start_date)) AS location,
  MAX(g.start_date) AS since
FROM status g
GROUP BY g.person_ID

This uses a grouping on person_ID, and uses a SELECT for the location column expression.

The sole question is whether you meant MIN i.o. MAX as in your example you yield the youngest date, not the oldest.

Upvotes: -1

Gordon Linoff
Gordon Linoff

Reputation: 1270873

I think the easiest way is with variables to keep track of the information you need:

select person_id, location_id, min(start_date) as since
from (select s.*,
             (@rn := if(@p <> person_id, if(@p:=person_id, 1, 1),
                        if(@l = location_id, @rn,
                           if(@l:=location_d, @rn + 1, @rn + 1)
                          )
                        )
             ) as location_counter
      from status s cross join
           (select @p := 0, @l := 0, @rn := 0) vars
      order by person_id, start_date desc
     ) s
where location_counter = 1
group by person_id, location_id;

The weird logic with the variables is (trying to) enumerate the locations for each person. It should be incrementing @rn only when the location changes and resetting the value to 1 for a new person.

Upvotes: 1

Related Questions