Reputation: 8890
I'm having to return ~70,000 rows of 4 columns of INTs in a specific order and can only use very shallow caching as the data involved is highly volatile and has to be up to date. One property of the data is that it is often highly repetitive when it is in order.
I've started to look at various methods of reducing the row count in order to reduce network bandwidth and client side processing time/resources, but have not managed to find any kind of technique in T-SQL where I can 'compress' repetative rows down into a single row and a 'count' column. e.g.
prop1 prop2 prop3 prop4
--------------------------------
0 0 1 53
0 0 2 55
1 1 1 8
1 1 1 8
1 1 1 8
1 1 1 8
0 0 2 55
0 0 2 55
0 0 1 53
Into:
prop1 prop2 prop3 prop4 count
-----------------------------------------
0 0 1 53 1
0 0 2 55 1
1 1 1 8 4
0 0 2 55 2
0 0 1 53 1
I'd estimate that if this was possible, in many cases what would be a 70,000 row result set would be down to a few thousand at most.
Am I barking up the wrong tree here (is there implicit compression as part of the SQL Server protocol)?
Is there a way to do this (SQL Server 2005)?
Is there a reason I shouldn't do this?
Thanks.
Upvotes: 1
Views: 434
Reputation: 5421
70,000 rows of four integer columns is not really a worry for bandwidth on a modern LAN, unless you have many workstations executing this query concurrently; and on a WAN with more restricted bandwidth you could use DISTINCT to eliminate duplicate rows, an approach which would be frugal with your bandwidth but consume some server CPU. Again, however, unless you have a really overloaded server that is always performing at or near peak loads, this additional consumption would be a mere blip. 70,000 rows is next to nothing.
Upvotes: 0
Reputation: 31781
This will work, though it is painful to look at:
;WITH Ordering
AS
(
SELECT Prop1,
Prop2,
Prop3,
Prop4,
ROW_NUMBER() OVER (ORDER BY Y, X) RN
FROM Props
)
SELECT
CurrentRow.Prop1,
CurrentRow.Prop2,
CurrentRow.Prop3,
CurrentRow.Prop4,
CurrentRow.RN -
ISNULL((SELECT TOP 1 RN FROM Ordering O3 WHERE RN < CurrentRow.RN AND (CurrentRow.Prop1 <> O3.Prop1 OR CurrentRow.Prop2 <> O3.Prop2 OR CurrentRow.Prop3 <> O3.Prop3 OR CurrentRow.Prop4 <> O3.Prop4) ORDER BY RN DESC), 0) Repetitions
FROM Ordering CurrentRow
LEFT JOIN Ordering O2 ON CurrentRow.RN + 1 = O2.RN
WHERE O2.RN IS NULL OR (CurrentRow.Prop1 <> O2.Prop1 OR CurrentRow.Prop2 <> O2.Prop2 OR CurrentRow.Prop3 <> O2.Prop3 OR CurrentRow.Prop4 <> O2.Prop4)
ORDER BY CurrentRow.RN
The gist is the following:
Upvotes: 1
Reputation: 95133
You can use the count
function! This will require you to use the group by
clause, where you tell count
how to break up, or group
, itself. Gropu by
is used for any aggregate function in SQL.
select
prop1,
prop2,
prop3,
prop4,
count(*) as count
from
tbl
group by
prop1,
prop2,
prop3,
prop4,
y,
x
order by y, x
Update: The OP mentioned these are ordered by y
and x
, not part of the result set. In this case, you can still use y
and x
as part of the group by
.
Keep in mind that order means nothing if it doesn't have ordering columns, so in this case, we have to respect that with y
and x
in the group by
.
Upvotes: 2