lanes123
lanes123

Reputation: 147

Select 50% of rows. evenly splitting between the text in a column

Suppose I have the below table called tblOrder (i included 20 rows but contains a lot more): I want to create a select query that would return me half the rows (select top 50 percent) but randomly spread the rows out between a, b, c and d. So in this case it will return 2 with a, 2 with b, 2 with c and 2 with d - the remaining 2 will then be 1 a and 1 b

enter image description here

Upvotes: 0

Views: 734

Answers (3)

Ardalan Shahgholi
Ardalan Shahgholi

Reputation: 12555

If you have just 4 Letter type (A,B,C,D) you can use below query. If you run it several times you will have different result becauce of NewId() function.

SELECT TOP(SELECT COUNT(*)/8 FROM tblOrder ) * FROM tblOrder WHERE tblOrder.letter = 'A' ORDER BY newid()
UNION ALL
SELECT TOP(SELECT COUNT(*)/8 FROM tblOrder ) * FROM tblOrder WHERE tblOrder.letter = 'B' ORDER BY newid()
UNION ALL
SELECT TOP(SELECT COUNT(*)/8 FROM tblOrder ) * FROM tblOrder WHERE tblOrder.letter = 'C' ORDER BY newid()
UNION ALL
SELECT TOP(SELECT COUNT(*)/8 FROM tblOrder ) * FROM tblOrder WHERE tblOrder.letter = 'D' ORDER BY newid()

But if you have more or less than 4 letter type or you do not know how many letter type you may have, you can use below query. If you run it several times you will have different result becauce of NewId() function.

WITH
CTE_A
AS
SELECT 
    *,
    ROW_NUMBER() OVER(PARTITION BY tblOrder.letter, ORDER BY newid()) AS RowLetter
FROM tblOrder   

SELECT 
    *
FROM CTE_A 
WHERE RowLetter / 2 = 0 

Upvotes: 1

Armin
Armin

Reputation: 127

The 50 % can be achieved by using a sub query like this.

select top (select count(*) / 2 from tblOrder ) * from tblOrder 

This would give you the first 50 % of entries it can get, note that this is not random but is depending on the sorting of your query.

Regarding the a, b, c and d. Did you mean evenly split between those letters or randomly?

Evenly could be achieved by the query every letter seperate and union those like so.

select top ((select count(*) / 2) / 4 from tblOrder ) * from tblOrder where tblOrder.letter = 'A'
union
select top ((select count(*) / 2) / 4 from tblOrder ) * from tblOrder where tblOrder.letter = 'B'
union
select top ((select count(*) / 2) / 4 from tblOrder ) * from tblOrder where tblOrder.letter = 'C'
union
select top ((select count(*) / 2) / 4 from tblOrder ) * from tblOrder where tblOrder.letter = 'D'

But note that this also wont get you random entries. If you run the query multiple times you get the same result.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269593

If you want a truly random sample, you can use:

select o.*
from (select o.*,
             ntile(2) over (order by newid()) as tile
      from tblOrder o
     ) o
where tile = 2;

If you want a stratified sample, which attempts to maintain the original distribution of the four values, you can use:

select o.*
from (select o.*,
             ntile(2) over (partition by col2 order by newid()) as tile
      from tblOrder o
     ) o
where tile = 1;

Upvotes: 2

Related Questions