Reputation: 787
I have the following table in my MySQL db:
Table temperature
id - sensor_id - value - created_at
1 1 4.5 04-11-2014
2 1 2.2 05-11-2014
3 1 3.3 06-11-2014
4 2 4.5 04-11-2014
5 2 2.2 05-11-2014
6 2 3.3 06-11-2014
What I am looking to do is get the LATEST N rows for each sensor_id.
I managed to find many various solutions, but most of them include very inefficient joins, that in my case are not adequate, as I have 1million+ rows and the query is extremely slow.
The closest I could get to an efficient query is this:
set @num := 0, @sensor_id:= '';
select id, sensor_id, value, created_at,
@num := if(@sensor_id = sensor_id, @num + 1, 1) as row_number,
@sensor_id := sensor_id as dummy
from temperature
group by id, sensor_id, value, created_at
having row_number <= 2;
This query comes from this article http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/ , but the problem is that it takes the FIRST N, not the LATEST N rows. How can I introduce an ORDER BY, in order to get the latest, instead of the first N rows?
The desired result should look like this, if you query for the latest 2 rows:
id - sensor_id - value - created_at
2 1 2.2 05-11-2014
3 1 3.3 06-11-2014
5 2 2.2 05-11-2014
6 2 3.3 06-11-2014
Upvotes: 3
Views: 1320
Reputation: 21513
If the joins are inefficient due to the number of rows then using a user variable is also likely to be inefficient as the query needs to check every single row.
If you process the results a little after returning them to get them in the format you want then there is another option.
SELECT sensor_id, SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(':', id, sensor_id, value, created_at) ORDER BY created_at DESC SEPARATOR '#'). '#', 4)
FROM temperature
GROUP BY sensor_id
This is using CONCAT_WS to roll all the values in the row together, separated with a ":". It then uses GROUP_CONCAT to concatenate all these values together for a single sensor id, separated with a # in descending date order (assuming the date is in a real date format, not a text dd-mm-yyyy format). Finally SUBSTRING_INDEX is used to get only the last N rows of data (in this case I have just used 4). You can easily use other delimiters if the data you are accessing contains any ":" or "#" characters.
Once this is returned you would need to split each returned row back into its separate fields.
Note that there is a max length for a GROUP_CONCAT result which (I think) defaults to 1024 characters. This can be changed, but depending on the amount of data and the number of rows required may not be an issue.
Upvotes: 0
Reputation: 13425
you can order the results and then apply the row_number logic
set @num := 0, @sensor_id:= '';
select *,
@num := if(@sensor_id = sensor_id, @num + 1, 1) as row_number,
@sensor_id := sensor_id as dummy
from
(select id, sensor_id, value, created_at
from temperature
order by sensor_id, created_at desc) T
group by id, sensor_id, value, created_at
having row_number <= 2;
Upvotes: 3