Manoj
Manoj

Reputation: 83

How to count consecutive duplicates in a table?

I have below question:
Want to find the consecutive duplicates

SLNO   NAME     PG   
1       A1      NO                   
2       A2      YES              
3       A3      NO           
4       A4      YES          
6       A5      YES          
7       A6      YES          
8       A7      YES      
9       A8      YES  
10      A9      YES
11      A10     NO 
12      A11     YES 
13      A12     NO 
14      A14     NO

We will consider the value of PG column and I need the output as 6 which is the count of maximum consecutive duplicates.

Upvotes: 6

Views: 1766

Answers (6)

Boneist
Boneist

Reputation: 23578

Just for completeness, here's the actual Tabibitosan method:

with sample_data as (select 1 slno, 'A1' name, 'NO' pg from dual union all 
                     select 2 slno, 'A2' name, 'YES' pg from dual union all
                     select 3 slno, 'A3' name, 'NO' pg from dual union all
                     select 4 slno, 'A4' name, 'YES' pg from dual union all
                     select 6 slno, 'A5' name, 'YES' pg from dual union all
                     select 7 slno, 'A6' name, 'YES' pg from dual union all
                     select 8 slno, 'A7' name, 'YES' pg from dual union all
                     select 9 slno, 'A8' name, 'YES' pg from dual union all
                     select 10 slno, 'A9' name, 'YES' pg from dual union all
                     select 11 slno, 'A10' name, 'NO' pg from dual union all
                     select 12 slno, 'A11' name, 'YES' pg from dual union all
                     select 13 slno, 'A12' name, 'NO' pg from dual union all
                     select 14 slno, 'A14' name, 'NO' pg from dual)
-- end of mimicking a table called "sample_data" containing your data; see SQL below:
select max(cnt) max_pg_in_queue
from   (select   count(*) cnt
        from     (select slno,
                         name,
                         pg,
                         row_number() over (order by slno)
                           - row_number() over (partition by pg
                                                order by slno) grp
                  from   sample_data)
        where    pg = 'YES'
        group by grp);

MAX_PG_IN_QUEUE
---------------
              6

Upvotes: 1

MT0
MT0

Reputation: 167822

Only requiring a single aggregation query and no joins (the rest of the calculation can be done with ROW_NUMBER, LAG and LAST_VALUE):

SELECT MAX( num_before_in_queue ) AS max_sequential_in_queue
FROM   (
  SELECT rn - LAST_VALUE( has_changed ) IGNORE NULL OVER ( ORDER BY ROWNUM ) + 1
           AS num_before_in_queue
  FROM   (
    SELECT pg,
           ROW_NUMBER() OVER ( ORDER BY slno ) AS rn,
           CASE pg WHEN LAG( pg ) OVER ( ORDER BY slno )
                   THEN NULL
                   ELSE ROW_NUMBER() OVER ( ORDER BY sl_no )
                   END AS change
    FROM   table_name
  )
  WHERE  pg = 'Y'
);

Upvotes: 0

Florin Ghita
Florin Ghita

Reputation: 17643

It can be done with Tabibitosan method. Run this, to understand it:

with a as(
select 1 slno, 'A' pg from dual union all
select 2 slno, 'A' pg from dual union all
select 3 slno, 'B' pg from dual union all
select 4 slno, 'A' pg from dual union all
select 5 slno, 'A' pg from dual union all
select 6 slno, 'A' pg from dual 
)
select slno, pg, newgrp, sum(newgrp) over (order by slno) grp
from( 
    select slno, 
           pg, 
           case when pg <> nvl(lag(pg) over (order by slno),1) then 1 else 0 end newgrp
    from a
    );

Newgrp means a new group is found.

Result:

SLNO PG NEWGRP GRP
1    A  1      1
2    A  0      1
3    B  1      2
4    A  1      3
5    A  0      3
6    A  0      3

Now, just use a group by with count, to find the group with maximum number of occurrences:

with a as(
select 1 slno, 'A' pg from dual union all
select 2 slno, 'A' pg from dual union all
select 3 slno, 'B' pg from dual union all
select 4 slno, 'A' pg from dual union all
select 5 slno, 'A' pg from dual union all
select 6 slno, 'A' pg from dual 
),
b as(
select slno, pg, newgrp, sum(newgrp) over (order by slno) grp
from( 
    select slno, pg, case when pg <> nvl(lag(pg) over (order by slno),1) then 1 else 0 end newgrp
    from a
    )
)
select max(cnt)
from (
    select grp, count(*) cnt
    from b
    group by grp
    );

Upvotes: 6

Arkadiusz Łukasiewicz
Arkadiusz Łukasiewicz

Reputation: 6346

with test as (
select 1 slno,'A1' name ,'NO' pg from dual union all 
select 2,'A2','YES' from dual union all
select 3,'A3','NO' from dual union all
select 4,'A4','YES' from dual union all
select 6,'A5','YES' from dual union all
select 7,'A6','YES' from dual union all
select 8,'A7','YES' from dual union all
select 9,'A8','YES' from dual union all
select 10,'A9','YES' from dual union all
select 11,'A10','NO' from dual union all
select 12,'A11','YES' from dual union all
select 13,'A12','NO' from dual union all
select 14,'A14','NO' from dual),
consecutive as (select row_number() over(order by slno) rr, x.* 
              from test x)
select x.* from Consecutive x
  left join Consecutive y on x.rr = y.rr+1 and x.pg = y.pg
  where y.rr is not null
  order by x.slno 

And you can control output with condition in where.

where y.rr is not null query returns duplicates

where y.rr is null query returns "distinct" values.

Upvotes: 1

Carlo
Carlo

Reputation: 1579

SELECT MAX(consecutives) -- Block 1
FROM (
    SELECT t1.pg, t1.slno, COUNT(*) AS consecutives -- Block 2
    FROM test t1 INNER JOIN test t2 ON t1.pg = t2.pg
    WHERE t1.slno <= t2.slno
      AND NOT EXISTS (
        SELECT *  -- Block 3
        FROM test t3 
        WHERE t3.slno > t1.slno
          AND t3.slno < t2.slno
          AND t3.pg  != t1.pg
    )    
    GROUP BY t1.pg, t1.slno
);

The query calculates the result in following way:

  • Extract all couples of records that don't have a record with different value of PG in between (blocks 2 and 3)
  • Group them by PG value and starting SLNO value -> this counts the consecutive values for any [PG, (starting) SLNO] couple (block 2);
  • Extract Maximum value from query 2 (block 1)

Note that the query may be simplified if the slno field in table contains consecutive values, but this seems not your case (in your example record with SLNO = 5 is missing)

Upvotes: 0

Sweetspot
Sweetspot

Reputation: 91

Try to use row_number()

select
    SLNO,
    Name,
    PG,
    row_number() over (partition by PG order by PG) as 'Consecutive'
from
    <table>
order by
    SLNO,
    NAME,
    PG

This is should work with minor tweaking.

--EDIT--

Sorry, partiton by PG. The partitioning tells the row_number when to start a new sequence.

Upvotes: -3

Related Questions