Valentin Grigorovici
Valentin Grigorovici

Reputation: 31

group consecutive rows

I have the following table called Phases. It describes DRAM operations:

Transact    PhaseName   TransactionBegin
1           REQ         0   
1           RESP        25  
2           REQ         5
2           RESP        30
10          REQ         50
10          RESP        105
11          REQ         55
11          RESP        115
21          REQ         60
21          RESP        120
22          REQ         65
22          RESP        125
23          REQ         70
23          RESP        130
24          REQ         75
24          RESP        140
37          REQ         200
37          RESP        240
38          REQ         205
38          RESP        245
...

I need to find the time between the first REQ and last RESP for each group. A group is where all Transact are consecutive.

TransactGroup   Period
(1..2)          30
(10..11)        65
(21..24)        80
(37..38)        45

I would then be great if I could find the average of Period for: 1) all groups that count 2 transactions, 2) all groups that count 6 transactions.

Upvotes: 3

Views: 361

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269623

I would approach this differently. First I would summarize the groups by TransAct and add an enumeration column. The difference between this column and Transact provides a grouping that you are looking for:

with p as (
      select Transact,
             max(case when PhaseName = 'REQ' then TransactionBegin end) as req,
             max(case when PhaseName = 'RESP' then TransactionBegin end) as resp
      from phases
      group by Transact
     ),
     pn as (
      select pn.*, (select count(*) from p p2 where p2.Transact <= p.Transact) as seqnum
      from p
     )
select min(Transact), max(Transact), max(resp) - min(resp)
from pn
group by (Transact - seqnum);

EDIT:

Without a with clause, the query loses a bit of its elegance. Here is what it looks like:

select min(Transact), max(Transact), max(resp) - min(resp)
from (select pn.*,
             (select count(distinct p2.Transact)
              from phases p2
              where p2.Transact <= p.Transact
             ) as seqnum
      from (select Transact,
                   max(case when PhaseName = 'REQ' then TransactionBegin end) as req,
                   max(case when PhaseName = 'RESP' then TransactionBegin end) as resp
            from phases p
            group by Transact
           ) p
     ) p
group by (Transact - seqnum);

Note that I changed the subquery slightly to use count(distinct). The subquery now operates on the main table, and it needs to count distinct ids rather than all rows to get the proper enumeration.

Upvotes: 4

Jayvee
Jayvee

Reputation: 10875

this should work as long as the resp begin of lower groups are lower than the resp begin of later groups, which seems to be the case according to the example:

select
    t.transact groupstart, 
    min(tend) groupend, 
    min(respend)-transactionBegin Period
from t
join 
    (
    select transact tend, transactionbegin respend from t 
    where t.phasename='RESP' 
    and not exists 
    (select 1 from t t1 where t1.transact=t.transact+1) 
    ) t2 
on t.transact<t2.tend 
where t.phasename='REQ'
and not exists
(select 1 from t t1 where t1.transact=t.transact-1) 
group by transact

t is your table; t1 and t2 are aliases on the subqueries

SQLFiddle

with this output as a subquery the counts and averages will be a trivial sql.

Upvotes: 0

CL.
CL.

Reputation: 180030

It would be possible to compute the transaction group dynamically, but that would make the queries insanely complex. Better add this as a new column:

ALTER TABLE Phases
ADD COLUMN TransactGroup;

UPDATE Phases
SET TransactGroup = (SELECT Transact
                     FROM Phases AS First
                     WHERE First.Transact <= Phases.Transact
                       AND NOT EXISTS (SELECT 1
                                       FROM Phases AS Previous
                                       WHERE Transact = First.Transact - 1)
                     ORDER BY Transact DESC
                     LIMIT 1)

As group identifer, we use the first Transact in the group. A row is the first row in a group if there is no row with the previous Transact number. To find the group's first row from some arbitrary row, we search for the latest row that is a first row, but is not after this row.


The queries can then be done with simple GROUP BYs (the CASE expressions make unwanted values NULL, which is ignored by MIN/MAX):

SELECT TransactGroup,
       MAX(CASE PhaseName WHEN 'RESP' THEN TransactionBegin END) -
       MIN(CASE PhaseName WHEN 'REQ'  THEN TransactionBegin END) AS Period,
       MAX(Transact) - MIN(Transact) + 1 AS TransactCount
FROM Phases
GROUP BY TransactGroup
SELECT TransactCount,
       AVG(Period)
FROM (... the previous query ...)
WHERE TransactCount IN (2, 6)
GROUP BY TransactCount

Upvotes: 0

Related Questions