7weight
7weight

Reputation: 25

select most recent records with unique user id

I have a database table that contains multiple entries for each user during a period of time. The query I'm trying to build will only return the most recent record for each user during that time period.

what I have thus far:

SELECT *
FROM (SELECT DISTINCT * FROM picks_recorded)
WHERE date_submitted > '" . $week_start . "'
AND date_submitted < '" . $week_end . "'
ORDER BY date_submitted DESC
LIMIT 1;

I've tried several versions of this without success.

Upvotes: 2

Views: 2458

Answers (3)

FuzzyTree
FuzzyTree

Reputation: 32392

You can get the most recent row by user during a certain period by using not exists. i.e. select only those rows during a period where a more recent row for that user does not exist

SELECT *
FROM picks_recorded p1
WHERE date_submitted > '" . $week_start . "'
AND date_submitted < '" . $week_end . "'
AND NOT EXISTS (
    SELECT 1 FROM picks_recorded p2
    WHERE p2.user_id = p1.user_id
    AND p2.date_submitted > p1.date_submitted
    AND p2.date_submitted < '" . $week_end . "'
)

Upvotes: 0

Edrich
Edrich

Reputation: 242

Try to change DISTINCT * into this DISTINCT(userid).
For example:

SELECT * FROM (SELECT DISTINCT(userid) FROM picks_recorded)
WHERE
    date_submitted > '" . $week_start . "'
    AND
    date_submitted < '" . $week_end . "'
ORDER BY date_submitted DESC
LIMIT 1;

Upvotes: 0

shree.pat18
shree.pat18

Reputation: 21757

Try this:

select distinct p.* from 
picks_recorded p
inner join
(select userid, max(date_submitted) as maxdate from picks_recorded
WHERE date_submitted > '" . $week_start . "'
AND date_submitted < '" . $week_end . "'
group by userid) s on p.userid = s.userid and p.date_submitted = s.maxdate

Basically, we use group by to get the maximum value of date_submitted per userid i.e. date of latest record for each user. Then we inner join with the source table based on the userid and submitted_date to get all the required data.

Note: Using distinct in the outer query will ensure that only one row is returned for a given user if that user has multiple, identical records for the latest date. However, if you have 2 records with same userid and submitted_date, but even one different field, then both records will be returned. In that case, you need to provide additional logic to say which of the 2 records should be kept in the final result.

On a side note, you should avoid building queries by concatenating with variables. Instead, you should use prepared statements/parameterised queries.

Upvotes: 2

Related Questions