Reputation: 5242
Sorry for bad topic but I wasn't sure what to call it..
I have a table looking like this:
+-----++-----+
| Id ||Count|
+-----++-----+
| 1 || 1 |
+-----++-----+
| 2 || 5 |
+-----++-----+
| 3 || 8 |
+-----++-----+
| 4 || 3 |
+-----++-----+
| 5 || 6 |
+-----++-----+
| 6 || 8 |
+-----++-----+
| 7 || 3 |
+-----++-----+
| 8 || 1 |
+-----++-----+
I'm trying to make a select from this table where every time the SUM of row1 + row2 + row3 (etc) reaches 10, then it's a "HIT", and the count starts over again.
Requested output:
+-----++-----++-----+
| Id ||Count|| HIT |
+-----++-----++-----+
| 1 || 1 || N | Count = 1
+-----++-----++-----+
| 2 || 5 || N | Count = 6
+-----++-----++-----+
| 3 || 8 || Y | Count = 14 (over 10)
+-----++-----++-----+
| 4 || 3 || N | Count = 3
+-----++-----++-----+
| 5 || 6 || N | Count = 9
+-----++-----++-----+
| 6 || 8 || Y | Count = 17 (over 10..)
+-----++-----++-----+
| 7 || 3 || N | Count = 3
+-----++-----++-----+
| 8 || 1 || N | Count = 4
+-----++-----++-----+
How do I do this, and with best performance? I have no idea..
Upvotes: 5
Views: 196
Reputation: 1271003
You can't do this using window/analytic functions, because the breakpoints are not known in advance. Sometimes, it is possible to calculate the breakpoints. However, in this case, the breakpoints depend on a non-linear function of the previous values (I can't think of a better word than "non-linear" right now). That is, sometimes adding "1" to an earlier value has zero effect on the calculation for the current row. Sometimes it has a big effect. The implication is that the calculation has to start at the beginning and iterate through the data.
A minor modification to the problem would be solvable using such functions. If the problem were, instead, to carry over the excess amount for each group (instead of restarting the sum), the problem would be solvable using cumulative sums (and some other trickery).
Recursive queries (which others have provided) or a sequential operation is the best way to approach this problem. Unfortunately, it doesn't have a set-based method for solving it.
Upvotes: 5
Reputation: 7097
How about this using Running Totals:
DECLARE @Data TABLE(
Id INT
,SubTotal INT
)
INSERT INTO @Data
VALUES(1, 5)
INSERT INTO @Data
VALUES(2, 3)
INSERT INTO @Data
VALUES(3, 4)
INSERT INTO @Data
VALUES(4, 4)
INSERT INTO @Data
VALUES(5, 7)
DECLARE @RunningTotal INT = 0
DECLARE @HitCount INT = 0
SELECT
@RunningTotal = CASE WHEN @RunningTotal < 10 THEN @RunningTotal + SubTotal ELSE SubTotal END
,@HitCount = @HitCount + CASE WHEN @RunningTotal >= 10 THEN 1 ELSE 0 END
FROM @Data ORDER BY Id
SELECT @HitCount -- Outputs 2
Having re-read the question I see this does not meet the required output - I'll leave the answer as it may be of some use to someone looking for an example of a running total solution to this type of problem that doesn't need each row tagged with a Y or an N.
Upvotes: 1
Reputation: 36523
I'm really hoping someone can show us if it's possible to do this using straight-forward window functions. That's the real challenge.
In the meantime, here is how I would do it using recursion. This handles gaps in the sequence, and handles the edge case of the first row already being >= 10
.
I also added the maxrecursion
hint to remove the default recursion limit. But I honestly don't know how well it will run with larger amounts of data.
with NumberedRows as (
select Id, Cnt,
row_number() over(order by id) as rn
from CountTable
), RecursiveCTE as (
select Id, Cnt, rn,
case when Cnt >= 10 then 0 else Cnt end as CumulativeSum,
case when Cnt >= 10 then 'Y' else 'N' end as hit
from NumberedRows
where rn = 1
union all
select n.Id, n.Cnt, n.rn,
case when (n.Cnt + r.CumulativeSum) >= 10 then 0 else n.Cnt + r.CumulativeSum end as CumulativeSum,
case when (n.Cnt + r.CumulativeSum) >= 10 then 'Y' else 'N' end as hit
from RecursiveCTE r
join NumberedRows n
on n.rn = r.rn + 1
)
select Id, Cnt, hit
from RecursiveCTE
order by Id
option (maxrecursion 0)
Upvotes: 1
Reputation: 6205
You could use Recursive Queries
Please note the following query assuming the id value are all in sequence, otherwise, please use ROW_NUMBER()
to create a new id
WITH cte AS (
SELECT id, [Count], [Count] AS total_count
FROM Table1
WHERE id = 1
UNION ALL
SELECT t2.id,t2.[Count], CASE WHEN t1.total_count >= 10 THEN t2.[Count] ELSE t1.total_count + t2.[Count] END
FROM Table1 t2
INNER JOIN cte t1
ON t2.id = t1.id + 1
)
SELECT *
FROM cte
ORDER BY id
Upvotes: 3