Reputation: 3
I'm no sql-expert, but came across this problem:
I have to retrieve data from Microsoft SQL 2008 server. It holds different measurement data from different probes, that don't have any recording intervals. Meaning that some probe can transfer data in the database once every week, another once every second. Probes are identified by id's (not unique), and the point is to retrieve only the last record from each id (probe). Table looks like this (last 5, order by SampleDateTime desc):
TagID SampleDateTime SampleValue QualityID
13 634720670797944946 112 192
23 634720670797944946 38.1 192
17 634720670797944946 107.5 192
14 634720670748012090 110.6 192
19 634720670748012090 99.7 192
I CAN'T modify the server or even the settings, am only authorized to do queries. And I'd need to retrieve the requested data on even intervals (say once every minute or so). There are over 100 probes (with different id's) of which about 40 need to be read. So I am guessing that if this could be done in a single query it could be way more efficient than to get each row in a separate query.
Using MySQL and a similar table got the desired result this way (suggestions for a better way highly appreciated!):
SELECT TagID,SampleDateTime,SampleValue FROM
(
SELECT TagID,SampleDateTime,SampleValue FROM measurements
WHERE TagID IN(101,102,103) ORDER BY SampleDateTime DESC
)
AS table1 GROUP BY TagID;
Thought that would do the trick (didn't manage with MAX() or DISTINCT or no matter what I tried), as it did, with the correct data even. But naturally it doesn't work in Ms SQL because of 'GROUP BY'.
Column 'table1.SampleValue' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I'm extremely stuck with this and so any insight would be more than welcome.
Upvotes: 0
Views: 718
Reputation: 69819
I am slightly confused as you have tagged MySQL and SQL-Server. For SQL-Server, I would use the ROW_NUMBER
function to assist:
SELECT m.TagID, m.SampleDateTime, m.SampleValue, m.QualityID
FROM ( SELECT *, ROW_NUMBER() OVER(PARTITION BY TagID ORDER BY SampleDateTime DESC) [RowNumber]
FROM Measurements
) m
WHERE Rownumber = 1
The ROW_NUMBER
function does exactly what it says on the tin, gives each row a number based on criteria you provide. So in the example above PARTITION BY TagID
tells ROW_NUMBER
to start again at 1 each time a new TagID is encountered. ORDER BY SampleDateTime DESC
tells ROW_NUMBER
to start numbering the each TagID at the latest entry and work upwards to the earliest entry.
The reason your query failed is because MySQL allows implicit group by, meaning that because you have only specified GROUP BY TagID
any fields that are in the select list and not contained within an aggregate function will get the values of a "random" row assigned to them (the latest row in your case because you specified ORDER BY SampleDateTime DESC
in the subquery.
Just in case it is required the following should work in most DBMS and is a better way of producing a similar query to the one you have been running in MySQL:
SELECT m.TagID, m.SampleDateTime, m.SampleValue, m.QualityID
FROM Measurements m
INNER JOIN
( SELECT TagID, MAX(SampleDateTime) AS SampleDateTime
FROM Measurements
GROUP BY TagID
) MaxTag
ON MaxTag.TagID = m.TagID
AND MaxTag.SampleDateTime = m.SampleDateTime
Upvotes: 1