Jacob Eckel
Jacob Eckel

Reputation: 1641

PostgreSQL - How to select the first consecutive group having same value

I have a table with pk and dept columns:

pk dept
-------
27  A
29  A
30  B
31  B
33  A

I need to select the first consecutive group, that is the first successive set of rows all having the same dept value when the table is ordered by pk, i.e. the expected result is:

pk dept
-------
27  A
29  A

In my example there are 3 consecutive groups (AA, BB and A). The size of a group is unlimited (can be more than 2).

Upvotes: 1

Views: 1260

Answers (4)

cmoron
cmoron

Reputation: 34

Its a little bit complex and probably, the permformance poor, but you can achieve what you want with the code below. There are four operations:

  1. The first one is where we obtain the base order and base group ids for the next operation.
  2. In the sencond operation we make the trick computing an unique group id for each group
  3. In the third operation, where are spreading the unique group id over the rows of each group.
  4. Finally, we compute a consecutive group id for each group to allow the discretionary selection of groups, so we only have to filter by the group number we want to obtain.

Hope this helps.

SELECT fourthOperation.pk,
       fourthOperation.dept 
 FROM (SELECT thirdOperation.pk,
              thirdOperation.dept,
              DENSE_RANK() OVER (ORDER BY thirdOperation.spreadedIdGroup) denseIdGroup
         FROM (SELECT secondOperation.*, 
                      NVL(idGroup, LAG(secondOperation.idGroup IGNORE NULLS) OVER (ORDER BY secondOperation.numRow)) spreadedIdGroup
              FROM (SELECT firstOperation.*,
                           CASE WHEN LAG(firstOperation.rankRow) OVER (ORDER BY firstOperation.numRow) = firstOperation.rankRow
                                THEN NULL
                                ELSE firstOperation.numRow
                                 END idGroup
                       FROM (SELECT yourTable.*, 
                                    ROW_NUMBER() OVER (ORDER BY PK)   AS numRow, 
                                    DENSE_RANK() OVER (ORDER BY DEPT) AS rankRow
                               FROM ABORRAR yourTable) firstOperation) secondOperation ) thirdOperation) fourthOperation
 WHERE fourthOperation.denseIdGroup = 1                                   

Upvotes: 0

Fabian Pijcke
Fabian Pijcke

Reputation: 3210

The following query should do what you want (I named your table tx):

SELECT *
FROM tx t1
WHERE NOT EXISTS (
  SELECT *
  FROM tx t2
  WHERE t2.dept <> t1.dept
    AND t2.pk < t1.pk);

The idea is to look for tuples such that no tuple with a lesser pk and a different department exists.

  • The first two A tuples are kept;
  • The B tuples are dropped because of the first two A tuples;
  • The last A tuple is dropped because of the B tuples.

Upvotes: 4

Abelisto
Abelisto

Reputation: 15614

Remember about stored functions. Unlike to using window functions its allows to avoid the reading of the whole table:

--drop function if exists foo();
--drop table if exists t;
create table t(pk int, dep text);
insert into t values(27,'A'),(29,'A'),(30,'B'),(31,'B'),(33,'A');

create function foo() returns setof t language plpgsql as $$
declare
  r t;
  p t;
begin
  for r in (select * from t order by pk) loop
    if p is null then
      p := r;
    end if;
    exit when p.dep is distinct from r.dep;
    return next r;
  end loop;
  return;
end $$;

select * from foo();

Upvotes: 0

wind39
wind39

Reputation: 451

I'm not sure if I understand your question, but for the first pk of each dept you can try this:

select min(pk) as pk,
       dept
from your_table
group by dept

Upvotes: -2

Related Questions