Reputation: 7639
I'm struggling with what should be a simple query.
An event table stores user activity in an application. Each click generates a new event and datetime stamp. I need to show a list of recently accessed records having the most recent datetime stamp. I need to only show the past 7 days of activity.
The table has an auto-increment field (eventID), which corresponds with the date_event field, so it's better to use that for determining the most recent record in the group.
I found that some records are not appearing in my results with the expected most recent datetime. So I stripped my query down the basics:
NOTE that the real-life query does not look at custID. I am including it here to narrow down on the problem.
SELECT
el.eventID,
el.custID,
el.date_event
FROM
event_log el
WHERE
el.custID = 12345 AND
el.userID=987
GROUP BY
el.custID
HAVING
MAX( el.eventID )
This is returned:
eventID custID date_event
346290 12345 2013-06-21 09:58:44
Here's the EXPLAIN
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE el ref userID,custID,Composite custID 5 const 203 Using where
If I change the query to use HAVING MIN, the results don't change.. I should see a different eventID and date_event, as there are dozens of records matching the custID and userID.
SELECT
el.eventID,
el.custID,
el.date_event
FROM
event_log el
WHERE
el.custID = 12345 AND
el.userID=987
GROUP BY
el.custID
HAVING
MIN( el.eventID )
Same results as before:
eventID custID date_event
346290 12345 2013-06-21 09:58:44
No change.
This tells me I have another problem, but I am not seeing what that might be.
Some pointers would be appreciated.
Upvotes: 1
Views: 1123
Reputation: 686
You can use a group function in a statement containing no GROUP BY clause, but it would be equivalent to grouping on all rows. But I guess you're looking for the common syntax,
SELECT
MIN(el.eventID) AS `min_eventID`, --> Yes it is wrong :(
el.custID,
el.date_event
FROM
event_log el
WHERE
el.userID = 987
GROUP BY el.custID;
But disagreements are welcome .
I think I didn't show a solution fast enough... but maybe you're rather looking for the fastest solution.
Assuming field date_event defaults to CURRENT_TIMESTAMP (am I wrong?), ordering by date_event would be a waste of time (and money, thus).
I've made some tests with 20K rows and execution time was about 5ms.
SELECT STRAIGHT_JOIN y.*
FROM ((
SELECT MAX(eventId) as eventId
FROM event_log
WHERE userId = 987 AND custId = 12345
)) AS x
INNER JOIN event_log AS y
USING (eventId);
Maybe (possibly, who knows) you didn't get the straight_join thing; as documented on the scriptures, STRAIGHT_JOINs are similar to JOINs, except that the left table is always read before the right table. Sometimes it's useful.
For your specific situation, we're likely to filter to a certain eventID before (on table "x"), not to retrieve 99,99% useless rows from table "y".
Upvotes: 0
Reputation: 780655
SELECT
el.eventID,
el.custID,
el.date_event
FROM
event_log el
WHERE
el.custID = 12345 AND
el.userID=987 AND
el.eventID IN (SELECT MAX(eventID)
FROM event_log
WHERE custID = 12345
AND userID = 987)
Your query doesn't work because you misunderstand what HAVING
does. It evaluates the expression on each line of the result set, and keeps the rows where the expression evaluates to true. The expression MAX(el.eventID)
simply returns the maximum event ID selected by the query, it doesn't compare the current row to that event ID.
Another way is:
SELECT
el.eventID,
el.custID,
el.date_event
FROM
event_log el
WHERE
el.custID = 12345 AND
el.userID=987
ORDER BY eventID DESC
LIMIT 1
The more general form that works for multiple custID is:
SELECT el.*
FROM event_log el
JOIN (SELECT custID, max(date_event) maxdate
FROM event_log
WHERE userID = 987
GROUP BY custID) emax
ON el.custID = emax.custID AND el.date_event = emax.maxdate
WHERE el.userID = 987
Upvotes: 3