Reputation: 53
I have seen quite a few solutions on this kind of problem (esp. this one SQL Select only rows with Max Value on a Column), but none of these seem to be appropriate:
I have the following table layout, a versioning of attachments, which are bound to entities:
TABLE attachments
+------+--------------+----------+----------------+---------------+
| id | entitiy_id | group_id | version_number | filename |
+------+--------------+----------+----------------+---------------+
| 1 | 1 | 1 | 1 | file1-1.pdf |
| 2 | 1 | 1 | 2 | file1-2.pdf |
| 3 | 1 | 2 | 1 | file2-1.pdf |
| 4 | 2 | 1 | 1 | file1-1.pdf |
| 5 | 2 | 1 | 2 | file1-2.pdf |
| 6 | 2 | 3 | 1 | file3-1.pdf |
+------+--------------+----------+----------------+---------------+
Output should be Max version number, grouped by group_id and entity_id, I'd only need a list for single entity_ids if that helps:
+------+--------------+----------+----------------+---------------+
| id | entitiy_id | group_id | version_number | filename |
+------+--------------+----------+----------------+---------------+
| 2 | 1 | 1 | 2 | file1-2.pdf |
| 3 | 1 | 2 | 1 | file2-1.pdf |
| 5 | 2 | 1 | 2 | file1-2.pdf |
| 6 | 2 | 3 | 1 | file3-1.pdf |
+------+--------------+----------+----------------+---------------+
What I have come up with is this self join one:
SELECT *
FROM `attachments` `attachments`
LEFT OUTER JOIN attachments t2
ON ( attachments.group_id = t2.group_id
AND attachments.version_number < t2.version_number )
WHERE ( t2.group_id IS NULL )
AND ( `t2`.`id` = 1 )
GROUP BY t2.group_id
But this one only works if different entities do not share same group numbers. This, unfortunately is necessary.
I came across a working solution while creating a view, but this is not supported in my current setup.
Any ideas are highly appreciated. Thanks!
Upvotes: 5
Views: 3974
Reputation: 1096
You can also solve this using a high performance Common Table Expression (CTE).
WITH CTE AS
(
SELECT entitiy_id, group_id, version_number, filename,
ROW_NUMBER() OVER (PARTITION BY entitiy_id, group_id ORDER BY version_number DESC) as RowNum
FROM attachments
)
SELECT entitiy_id, group_id, version_number, filename
FROM CTE
WHERE RowNum = 1
Or
SELECT T.entitiy_id, T.group_id, T.version_number, T.filename
FROM (SELECT entitiy_id, group_id, version_number, filename,
ROW_NUMBER() OVER (PARTITION BY entitiy_id, group_id ORDER BY version_number DESC) as RowNum
FROM attachments
) as T
WHERE RowNum = 1
Upvotes: 0
Reputation: 43494
Try this:
select t1.* from attachments t1
left join attachments t2
on t1.entity_id = t2.entity_id and t1.group_id = t2.group_id and
t1.version_number < t2.version_number
where t2.version_number is null
Upvotes: 3
Reputation: 66727
This would do the trick:
select a1.* from attachments a1
inner join ( select entitiy_id, group_id, max(version_number) as version_number
from attachments
group by entitiy_id, group_id) a2 on a1.entitiy_id = a2.entitiy_id and
a1.group_id = a2.group_id and
a1.version_number = a2.version_number
Upvotes: 2
Reputation: 7025
This would work for selecting all
SELECT attachments.*
FROM (
SELECT entitiy_id, group_id, MAX(version_number) AS max_version
FROM attachments
GROUP BY entitiy_id, group_id
) AS maxVersions
INNER JOIN attachments
ON attachments.entitiy_id = maxVersions.entitiy_id
AND attachments.group_id = maxVersions.group_id
AND attachments.version_number = maxVersions.max_version
Expanding this to just look for a single entitiy_id would simply involve adding a WHERE into the subquery, so this would give
SELECT attachments.*
FROM (
SELECT entitiy_id, group_id, MAX(version_number) AS max_version
FROM attachments
WHERE entitity_id = [[YOUR ENTITIY ID HERE]]
GROUP BY entitiy_id, group_id
) AS maxVersions
INNER JOIN attachments
ON attachments.entitiy_id = maxVersions.entitiy_id
AND attachments.group_id = maxVersions.group_id
AND attachments.version_number = maxVersions.max_version
If you wish to make sure that this continues to operate fast as the number of rows increases, I would advise you make sure you add a key onto attachments with the rows (entitiy_id, group_id, max_version)
as then the subquery would be able to rely on that thus ensuring it doesn't lock up the table.
Upvotes: 2