konrad_pe
konrad_pe

Reputation: 1190

SQL Server : how to select a fixed amount of rows (select every x-th value)

A short description: I have a table with data that is updated over a certain time period. Now the problem is, that - depending on the nature of the sensor which sends the data - in this time period there could be either 50 data sets or 50.000. As I want to visualize this data (using ASP.NET / c#), for a first preview I would like to SELECT just 1000 values from the table.

I already have an approach doing this: I count the rows in the time period of interest, with a simple "where" clause to specify the sensor-id, save it as a variable in SQL, and then divide the count() by 1000. I've tried it in MS Access, where it works just fine:

set @divider = select count(*) from table where [...]

SELECT (Int([RowNumber]/@divider)), First(Value)
FROM myTable
GROUP BY (Int([RowNumber]/@divider));

The trick in Access was, that I simply have a data field ("RowNumber"), which is my PK/ID, and goes from 0 up. I tried to accomplish that in SQL Server using the ROW_NUMBER() method, which works more or less. I've got the right syntax for the method, but I can not use the GROUP BY statement

Windowed functions can only appear in the SELECT or ORDER BY clauses.

meaning ROW_NUMBER() can't be in the GROUP BY statement.

Now I'm kinda stuck. I've tried to save the ROW_NUMBER value into a char or a separate column, and GROUP BY it later on, but I couldn't get it done. And somehow I start to think, that my strategy might have its weaknesses ...? :/

To clarify once more: I don't need to SELECT TOP 1000 from my table, because this would just mean that I select the first 1000 values (depending on the sorting). I need to SELECT every x-th value, while I can compute the x (and I could even round it to an INT, if that would help to get it done). I hope I was able to describe the problem understandable ...

This is my first post here on StackOverflow, I hope I didn't forget anything essential or important, if you need any further information (table structure, my queries so far, ...) please don't hesitate to ask. Any help or hint is highly appreciated - thanks in advance! :)


Update: SOLUTION! Big thanks to https://stackoverflow.com/users/52598/lieven!!!

Here is how I did it in the end:

I declare 2 variables - I count my rows and SET it into the first var. Then I use ROUND() on the just assigned variable, and divide it by 1000 (because in the end I want ABOUT 1000 values!). I split this operation into 2 variables, because if I used the value from the COUNT function as basis for my ROUND operation, there were some mistakes.

declare @myvar decimal(10,2) 
declare @myvar2 decimal(10,2)

set @myvar = (select COUNT(*)
from value_table
where channelid=135 and myDate >= '2011-01-14 22:00:00.000' and myDate <= '2011-02-14 22:00:00.000'
)

set @myvar2 = ROUND(@myvar/1000, 0)

Now I have the rounded value, which I want to be my step-size (take every x-th value -> this is our "x" ;)) stored in @myvar2. Next I will subselect the data of the desired timespan and channel, and add the ROW_NUMBER() as column "rn", and finally add a WHERE-clause to the outer SELECT, where I divide the ROW_NUMBER through @myvar2 - when the modulus is 0, the row will be SELECTed.

select * from
(
select (ROW_NUMBER() over (order by id desc)) as rn, myValue, myDate
from value_table
where channel_id=135 and myDate >= '2011-01-14 22:00:00.000' and myDate<= '2011-02-14 22:00:00.000'
) d
WHERE rn % @myvar2 = 0

Works like a charm - once again all my thanks to https://stackoverflow.com/users/52598/lieven, see the comment below for the original posting!

Upvotes: 4

Views: 4385

Answers (2)

Lieven Keersmaekers
Lieven Keersmaekers

Reputation: 58431

In essence, all you need to do to select the x-th value is retain all rows where the modulus of the rownumber divided by x is 0.

WHERE rn % @x_thValues = 0

Now to be able to use your ROW_NUMBER's result, you'll need to wrap the entire statement into in a subselect

SELECT  *
FROM    (
            SELECT  *
                    , rn = ROW_NUMBER() OVER (ORDER BY Value)
            FROM    DummyData
        ) d
WHERE   rn % @x_thValues = 0                    

Combined with a variable to what x-th values you need, you might use something like this testscript

DECLARE @x_thValues INTEGER = 2

;WITH DummyData AS (SELECT * FROM (VALUES (1), (2), (3), (4)) v (Value))
SELECT  *
FROM    (
            SELECT  *
                    , rn = ROW_NUMBER() OVER (ORDER BY Value)
            FROM    DummyData
        ) d
WHERE   rn % @x_thValues = 0                    

Upvotes: 6

Jānis
Jānis

Reputation: 2266

One more option to consider:

Select Top 1000 * 
From dbo.SomeTable 
Where ....
Order By NewID()

but to be honest- like the previous answer more than this one. The question could be about performance..

Upvotes: 0

Related Questions