J.Doe
J.Doe

Reputation: 13

Sum up following rows with same status

I am trying to sum up following rows if they have the same id and status.

The DB is running on a Windows Server 2016 and is a Microsoft SQL Server 14. I was thinking about using a self join, but that would only sum up 2 rows, or somehow use lead/lag.

Here is how the table looks like (Duration is the days between this row and the next, sorted by mod_Date, if they have the same id):

+-----+--------------+-------------------------+----------+
| ID  |    Status    |        mod_Date         | Duration |
+-----+--------------+-------------------------+----------+
|   1 | In Inventory | 2015-04-10 09:11:37.000 | 12       |
|   1 | Deployed     | 2015-04-22 10:13:35.000 | 354      |
|   1 | Deployed     | 2016-04-10 09:11:37.000 | 30       |
|   1 | In Inventory | 2016-05-10 09:11:37.000 | Null     |
|   2 | In Inventory | 2013-04-10 09:11:37.000 | 12       |
| ... | ...          | ...                     | ...      |
+-----+--------------+-------------------------+----------+

There can be several rows with the same status and id following each other not only two.

And what I want to get is:


+-----+--------------+-------------------------+----------+
| ID  |    Status    |        mod_Date         | Duration |
+-----+--------------+-------------------------+----------+
|   1 | In Inventory | 2015-04-10 09:11:37.000 | 12       |
|   1 | Deployed     | 2015-04-22 10:13:35.000 | 384      |
|   1 | In Inventory | 2016-05-10 09:11:37.000 | Null     |
|   2 | In Inventory | 2013-04-10 09:11:37.000 | 12       |
| ... | ...          | ...                     | ...      |
+-----+--------------+-------------------------+----------+

Upvotes: 0

Views: 67

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

This is an example of gaps and islands. In this case, I think the difference of row numbers suffices:

select id, status, max(mod_date) as mod_date, sum(duration) as duration
from (select t.*,
             row_number() over (partition by id, status order by mod_date) as seqnum_is,
             row_number() over (partition by id order by mod_date) as seqnum_i
      from t
     ) t
group by id, status, seqnum_i - seqnum_is;

The trick here is that the difference of two increasing sequences identifies "islands" of where the values are the same. This is rather mysterious the first time you see it. But if you run the subquery, you'll probably quickly see how this works.

Upvotes: 2

Related Questions