MRBags1964
MRBags1964

Reputation: 21

SQL Server - How to retrieve a single row given criteria in a column?

I've got a fairly straightforward (I thought) data request from a very long table in SQL Server.

The table TMLogs contains thousands of data samples from machinery sensors connected to generators. These generators are connected to sensors that provide a snapshot of data on key generator performance components: RPM, Temperature, Sample Time, etc. The data snapshots are taken every 5 seconds, then stored in a SQL Server database. The generators are in different time zones, and the clocks aren't very accurate (+/- 10) minutes.

The columns look something like this:

Generator Name | Generator State | Fuel Tank Reading | Output (watts) | Sample Time (YYYY-DD-MM HH:MM:SS) | Engine Temp (F°) | Output Descriptor|....

My goal is to retrieve only the most recent sample that each generator has given.

Selecting the TOP X values only gives the generators in the eastern time zone, or the ones that have system clocks that are ahead of official time.

So, given the criteria Generator Name, how would I retrieve the newest data sample that each generator has stored?

Upvotes: 2

Views: 72

Answers (2)

Eric
Eric

Reputation: 2313

I haven't fully tested this, but I'd try approaching it with something like this.

SELECT b.*
FROM
(
select max([sample time]), 
[generator name] from tmlogs 
group by [generator name]
) a
inner join tmlogs b
on a.time = b.[sample time] and a.[generator name] = b.[generator name]

Basically if you can get the results you want (first query, aliased as "a"), you can then just join the main table (aliased as "b"), on whatever columns from "a".

Upvotes: 1

David Rushton
David Rushton

Reputation: 5040

Here are two approaches you can take. Both use the sample data defined here:

-- Let's create a sample table to experiment with.
DECLARE @Sample TABLE
    (
        GeneratorId     INT,
        SampleTime      DATETIME,
        Measure1        INT         
    )
;

-- Populate it.
INSERT INTO @Sample
    (
        GeneratorId,
        SampleTime,
        Measure1
    )
VALUES
    (1, '2015-01-01 09:00:00.000', 100),
    (1, '2015-01-01 09:05:00.000', 90),
    (2, '2015-09-01 10:00:00.000', 100),
    (2, '2015-09-01 10:05:00.000', 90)
;

This first example uses ROW_NUMBER. This windowed function is great for sequential numbering records. In this case the most recent sample for each generator is assigned a 1.

/* The inner query ranks our records.
 * The outer query filters for the most recent.
 * Windowed functions cannot be used in the where clause, hence the need for a subquery.
 */
SELECT
    *
FROM
    (
        /* ROW_NUMBER assigns a rank to each record.
         * Each GeneratorId is ranked separately, thanks 
         * to the PARTITION BY.
         */
        SELECT
            ROW_NUMBER() OVER(PARTITION BY GeneratorId ORDER BY SampleTime) AS RowNum,
            GeneratorId,
            SampleTime,
            Measure1
        FROM
            @Sample
    ) AS r
WHERE
    RowNum = 1
;

Or you could use a CTE. This allows us to first find the most recent sample, by generator. Then filter for just those.

/* We can also use a CTE to achive the same result.
 * You could just as easily make the CTE a subquery
 * in the main body.
 */
WITH MostRecent AS
    (
        /* This CTE finds the most recent sample for each
         * generator.  This is used to filer the main query.
         */
        SELECT
            GeneratorId,
            MAX(SampleTime) AS MaxSampleTime
        FROM
            @Sample
        GROUP BY
            GeneratorId
    )
SELECT
    *
FROM
    @Sample AS s
        INNER JOIN MostRecent AS mr     ON  mr.GeneratorId      = s.GeneratorId
                                        AND mr.MaxSampleTime    = s.SampleTime
;

This 2nd example could be reformatted as a subquery.

Upvotes: 3

Related Questions