Reputation: 21
I have a table 'content' with the following fields:
id (an unique, increasing identifier)
user_id
content1
content2
content3
...
The table can contain data from the same or different user_id
. I am trying to select data sorted by one of the content
fields. However I only want to select data from distinct 'user_id' and always take the user's latest entry (so the highest id
value). I cannot simply group by user_id
because grouping happens before sorting.
This is what I am doing at the moment:
SELECT *
FROM `content`
WHERE `content`.`id` = (
SELECT `id`
FROM `content` as `alt`
WHERE `alt`.`user_id` = `content`.`id`
ORDER BY `id` DESC
LIMIT 1 )
ORDER BY content1 DESC
It works but once the table gets bigger, the performance becomes too slow. Can someone give me an advice how to improve this query?
Upvotes: 2
Views: 108
Reputation: 325
you needs 2 step approach not 1 complete SQL for performance.
create temporary table
create temporary table id_Temp as
SELECT id
FROM content
as alt
WHERE alt
.user_id
= content
.id
select data from temporary table
SELECT *
FROM content
WHERE content
.id
= ( select id from id_temp ORDER BY id
DESC limit 1)
ORDER BY content1 DESC
in my experience, this 2 step approach make constant response time. with this approach, 2.5 mil pages/day web site works well with cheap 2 Liunx BOX.
Upvotes: 0
Reputation: 23125
The reason why it's slow with more data is because your subquery is executing for each row in the content
table.
This solution should be much quicker as the subselect will only execute once, where the result of that subselect is then joined to the same table - utlizing indexes:
SELECT b.*
FROM
(
SELECT MAX(id) AS maxid
FROM content
GROUP BY user_id
) a
INNER JOIN content b ON a.maxid = b.id
ORDER BY b.content1 DESC
Upvotes: 3