Reputation: 21
I've got a fairly straightforward (I thought) data request from a very long table in SQL Server.
The table TMLogs
contains thousands of data samples from machinery sensors connected to generators. These generators are connected to sensors that provide a snapshot of data on key generator performance components: RPM, Temperature, Sample Time, etc. The data snapshots are taken every 5 seconds, then stored in a SQL Server database. The generators are in different time zones, and the clocks aren't very accurate (+/- 10) minutes.
The columns look something like this:
Generator Name | Generator State | Fuel Tank Reading | Output (watts) | Sample Time (YYYY-DD-MM HH:MM:SS) | Engine Temp (F°) | Output Descriptor|....
My goal is to retrieve only the most recent sample that each generator has given.
Selecting the TOP
X values only gives the generators in the eastern time zone, or the ones that have system clocks that are ahead of official time.
So, given the criteria Generator Name
, how would I retrieve the newest data sample that each generator has stored?
Upvotes: 2
Views: 72
Reputation: 2313
I haven't fully tested this, but I'd try approaching it with something like this.
SELECT b.*
FROM
(
select max([sample time]),
[generator name] from tmlogs
group by [generator name]
) a
inner join tmlogs b
on a.time = b.[sample time] and a.[generator name] = b.[generator name]
Basically if you can get the results you want (first query, aliased as "a"), you can then just join the main table (aliased as "b"), on whatever columns from "a".
Upvotes: 1
Reputation: 5040
Here are two approaches you can take. Both use the sample data defined here:
-- Let's create a sample table to experiment with.
DECLARE @Sample TABLE
(
GeneratorId INT,
SampleTime DATETIME,
Measure1 INT
)
;
-- Populate it.
INSERT INTO @Sample
(
GeneratorId,
SampleTime,
Measure1
)
VALUES
(1, '2015-01-01 09:00:00.000', 100),
(1, '2015-01-01 09:05:00.000', 90),
(2, '2015-09-01 10:00:00.000', 100),
(2, '2015-09-01 10:05:00.000', 90)
;
This first example uses ROW_NUMBER. This windowed function is great for sequential numbering records. In this case the most recent sample for each generator is assigned a 1.
/* The inner query ranks our records.
* The outer query filters for the most recent.
* Windowed functions cannot be used in the where clause, hence the need for a subquery.
*/
SELECT
*
FROM
(
/* ROW_NUMBER assigns a rank to each record.
* Each GeneratorId is ranked separately, thanks
* to the PARTITION BY.
*/
SELECT
ROW_NUMBER() OVER(PARTITION BY GeneratorId ORDER BY SampleTime) AS RowNum,
GeneratorId,
SampleTime,
Measure1
FROM
@Sample
) AS r
WHERE
RowNum = 1
;
Or you could use a CTE. This allows us to first find the most recent sample, by generator. Then filter for just those.
/* We can also use a CTE to achive the same result.
* You could just as easily make the CTE a subquery
* in the main body.
*/
WITH MostRecent AS
(
/* This CTE finds the most recent sample for each
* generator. This is used to filer the main query.
*/
SELECT
GeneratorId,
MAX(SampleTime) AS MaxSampleTime
FROM
@Sample
GROUP BY
GeneratorId
)
SELECT
*
FROM
@Sample AS s
INNER JOIN MostRecent AS mr ON mr.GeneratorId = s.GeneratorId
AND mr.MaxSampleTime = s.SampleTime
;
This 2nd example could be reformatted as a subquery.
Upvotes: 3