Anna Mamonova
Anna Mamonova

Reputation: 49

Oracle SQL. What statement should I use

DATA given:

 inventory_num_id        inventory_group_id             num   code
         9681066                 100003894             211      E
         9679839                 100003894             212      E
         9687165                 100003894             213      E
         9680883                 100003894             214      I
         9710863                 100003894             515      E
         9681246                 100003894             516      E
         9682695                 100003894             517      E
         9681239                 100003894             518      E
         9685409                 100003894             519      E
         9679843                 100003894             520      C
         9679844                 100003894             521      C
         9714882                 100003894             522      E
         9679845                 100003894             523      I
         9681211                 100003894             524      E
         9681216                 100003894             525      E
         9682696                 100003894             526      E
         9681227                 100003894             527      E

Result examples should be like:

inventory_group_id   code start  end 
------------------   ---- -----  ---- 
         100003894      E   211   213
         100003894      I   214
         100003894      E   515   519
         100003894      C   520   521
         100003894      E   522
         100003894      I   523
         100003894      E   524   527

What operator should I use to make start as minimum and end as max value? And could you please explain what I should do when the end(maximum)is not supposed to present?

Can I use GROUP BY clause there somehow?

Upvotes: 3

Views: 288

Answers (3)

Alex Poole
Alex Poole

Reputation: 191235

As gelonsoft noted, there are many ways to do this. I'd prefer to only hit the table once though. This is my current favourite, based on a method I first found on this site, probably on this answer (and you might find many more approaches among these questions:

select inventory_group_id, code, start_num,
    case when end_num = start_num then null else end_num end as end_num
from (
    select inventory_group_id, code, min(num) as start_num, max(num) as end_num
    from (
        select inventory_group_id, num, code,
            row_number() over (order by num)
              - row_number() over (partition by code order by num) as chain
        from inventory_num
    )
    group by inventory_group_id, code, chain
)
order by 1,3;

INVENTORY_GROUP_ID C  START_NUM    END_NUM
------------------ - ---------- ----------
         100003894 E        211        213
         100003894 I        214
         100003894 E        515        519
         100003894 C        520        521
         100003894 E        522
         100003894 I        523
         100003894 E        524        527

The inner select is doing all the work, by creating artificial groupings based on the code and num values - run that on its own to see what it's doing. The next query out is collapsing the groups to find the lowest and highest num for each of the artificial groups. The final outer query is purely to make the end value null if the chain only has one link - i.e. the start and end are the same - which you mentioned you wanted.

What you haven't said is whether the num values have to be contiguous. If I add a record with code='I' and num=216, I get the same number of outputs, but 214-216 is treated as one chain, even though there is no 215 value in between:

INVENTORY_GROUP_ID C  START_NUM    END_NUM
------------------ - ---------- ----------
         100003894 E        211        213
         100003894 I        214        216
...

I haven't figured out how to adapt this row_number() method to take account of that and treat them as separate chains - I'd be interested to see if it can be done while keeping it simple. The same thing happens with the other answers given; but I'm not sure if it matters to you.

If it does, here's another version which only hits the table once; rather more convoluted and in this form has the same potential problem with non-contiguous num values:

select distinct inventory_group_id, code,
    case
        when prev_code = code then lag(num) over (order by rn)
        else num
    end as start_num,
    case
        when next_code != code and prev_code != code then null
        when next_code = code then lead(num) over (order by rn)
        else num
    end as end_num
from (
    select inventory_group_id, num, code, prev_code, next_code,
        rownum as rn
    from (
        select inventory_group_id, num, code,
            lag(code) over (partition by inventory_group_id
                order by num) as prev_code,
            lead(code) over (partition by inventory_group_id
                order by num) as next_code
        from inventory_num
    )
    where (prev_code is null or code != prev_code)
        or (next_code is null or code != next_code)
    order by 1,2,3
)
order by 1,3,2;

INVENTORY_GROUP_ID C  START_NUM    END_NUM
------------------ - ---------- ----------
         100003894 E        211        213
         100003894 I        214
         100003894 E        515        519
         100003894 C        520        521
         100003894 E        522
         100003894 I        523
         100003894 E        524        527

The inner query selects from the table, and uses the lead and lag analytic functions to find the code either side of each row.

The next query out excludes any rows that have the same code as both the next and previous rows; that is, anything that's in the middle of an uninterrupted chain. That means each chain is collapsed to at most two rows, with the start and end num values for that chain.

The case statement in the outer query make both rows for each chain look the same, by using lead and lag again; if there is only one row (e.g. for 214) the first when clause of the second case makes the end value null, which you said you wanted. The distinct then removes the duplicates the case created.

I suggest you run each level of the query separately to see what it's doing, and understand what it's doing to the previous one.

As I said, this has the same potential problem if I introduce a row with code='I' and num=216:

INVENTORY_GROUP_ID C  START_NUM    END_NUM
------------------ - ---------- ----------
         100003894 E        211        213
         100003894 I        214        216
...

This can be split into two chains by adapting this method, but it's a little more complicated as you have to track and compare the num values as well as the code values:

select distinct inventory_group_id, code,
    case
        when prev_num is null then num
        when prev_code = code then lag(num) over (order by rn)
        else num
    end as start_num,
    case
        when next_code != code and prev_code != code then null
        when next_code is null then num
        when next_num is null then null
        when next_code = code then lead(num) over (order by rn)
        else num
    end as end_num
from (
    select inventory_group_id, num, code, prev_code, next_code,
        case
            when prev_num != num - 1 then null
            else prev_num
        end as prev_num,
        case
            when next_num != num + 1 then null
            else next_num
        end as next_num,
        rownum as rn
    from (
        select inventory_group_id, num, code,
            lag(code) over (partition by inventory_group_id
                order by num) as prev_code,
            lead(code) over (partition by inventory_group_id
                order by num) as next_code,
            lag(num) over (partition by inventory_group_id
                order by num) as prev_num,
            lead(num) over (partition by inventory_group_id
                order by num) as next_num
        from inventory_num
    )
    where (prev_code is null or code != prev_code)
        or (next_code is null or code != next_code)
        or (prev_num is null or num != prev_num + 1)
        or (next_num is null or num != next_num - 1)
    order by 1,2,3
)
order by 1,3,2;

INVENTORY_GROUP_ID C  START_NUM    END_NUM
------------------ - ---------- ----------
         100003894 E        211        213
         100003894 I        214
         100003894 I        216
         100003894 E        515        519
         100003894 C        520        521
         100003894 E        522
         100003894 I        523
         100003894 E        524        527

Upvotes: 1

valex
valex

Reputation: 24134

I've tested this under MS Sql server and I bet it will work under oracle too:

select max(inventory_group_id) inventory_group_id,max(code) Code ,min(num) minNum,max(num) maxNum

from
(
select inventory_group_id,inventory_num_id,code,num,
      (select min(num) from DATA where num>
             (select max(num) from DATA where DATA.num<a.num and code<>a.code) 
      ) as minNum
from DATA a
) A

group by minNum
order by 3

Upvotes: 1

gelonsoft
gelonsoft

Reputation: 26

Ann, be careful on the dark side of sql. There are more than one way to do it. Here is the answer:

SELECT a.inventory_group_id,
   a.code,
  a.num        AS "start",
  decode(b.num,a.num,null,b.num) AS "end" FROM
  ( SELECT inventory_num_id,inventory_group_id,code,num
         , ROW_NUMBER() OVER (PARTITION BY inventory_group_id,code ORDER BY num) AS rn
    FROM inventory_num  a
    WHERE NOT EXISTS
          ( SELECT * 
            FROM inventory_num  prev
            WHERE prev.inventory_group_id = a.inventory_group_id
            and  PREV.CODE = a.code
              AND prev.num = a.num - 1
          ) 
  )  a
JOIN
  ( SELECT  inventory_num_id,inventory_group_id,code, num 
         , ROW_NUMBER() OVER (PARTITION BY inventory_group_id,code ORDER BY num) AS rn
    FROM inventory_num  a
    WHERE NOT EXISTS
          ( SELECT * 
            FROM inventory_num  next
            WHERE next.inventory_group_id = a.inventory_group_id
            and  next.CODE = a.code
              AND next.num = a.num + 1
          )
  )  b
ON  b.inventory_group_id = a.inventory_group_id and b.code = a.code
AND b.rn  = a.rn
order by 3;

Upvotes: 1

Related Questions