Reputation: 49
DATA given:
inventory_num_id inventory_group_id num code
9681066 100003894 211 E
9679839 100003894 212 E
9687165 100003894 213 E
9680883 100003894 214 I
9710863 100003894 515 E
9681246 100003894 516 E
9682695 100003894 517 E
9681239 100003894 518 E
9685409 100003894 519 E
9679843 100003894 520 C
9679844 100003894 521 C
9714882 100003894 522 E
9679845 100003894 523 I
9681211 100003894 524 E
9681216 100003894 525 E
9682696 100003894 526 E
9681227 100003894 527 E
Result examples should be like:
inventory_group_id code start end
------------------ ---- ----- ----
100003894 E 211 213
100003894 I 214
100003894 E 515 519
100003894 C 520 521
100003894 E 522
100003894 I 523
100003894 E 524 527
What operator should I use to make start as minimum and end as max value? And could you please explain what I should do when the end(maximum)is not supposed to present?
Can I use GROUP BY clause there somehow?
Upvotes: 3
Views: 288
Reputation: 191235
As gelonsoft noted, there are many ways to do this. I'd prefer to only hit the table once though. This is my current favourite, based on a method I first found on this site, probably on this answer (and you might find many more approaches among these questions:
select inventory_group_id, code, start_num,
case when end_num = start_num then null else end_num end as end_num
from (
select inventory_group_id, code, min(num) as start_num, max(num) as end_num
from (
select inventory_group_id, num, code,
row_number() over (order by num)
- row_number() over (partition by code order by num) as chain
from inventory_num
)
group by inventory_group_id, code, chain
)
order by 1,3;
INVENTORY_GROUP_ID C START_NUM END_NUM
------------------ - ---------- ----------
100003894 E 211 213
100003894 I 214
100003894 E 515 519
100003894 C 520 521
100003894 E 522
100003894 I 523
100003894 E 524 527
The inner select is doing all the work, by creating artificial groupings based on the code
and num
values - run that on its own to see what it's doing. The next query out is collapsing the groups to find the lowest and highest num
for each of the artificial groups. The final outer query is purely to make the end
value null
if the chain only has one link - i.e. the start
and end
are the same - which you mentioned you wanted.
What you haven't said is whether the num
values have to be contiguous. If I add a record with code='I'
and num=216
, I get the same number of outputs, but 214-216
is treated as one chain, even though there is no 215
value in between:
INVENTORY_GROUP_ID C START_NUM END_NUM
------------------ - ---------- ----------
100003894 E 211 213
100003894 I 214 216
...
I haven't figured out how to adapt this row_number()
method to take account of that and treat them as separate chains - I'd be interested to see if it can be done while keeping it simple. The same thing happens with the other answers given; but I'm not sure if it matters to you.
If it does, here's another version which only hits the table once; rather more convoluted and in this form has the same potential problem with non-contiguous num
values:
select distinct inventory_group_id, code,
case
when prev_code = code then lag(num) over (order by rn)
else num
end as start_num,
case
when next_code != code and prev_code != code then null
when next_code = code then lead(num) over (order by rn)
else num
end as end_num
from (
select inventory_group_id, num, code, prev_code, next_code,
rownum as rn
from (
select inventory_group_id, num, code,
lag(code) over (partition by inventory_group_id
order by num) as prev_code,
lead(code) over (partition by inventory_group_id
order by num) as next_code
from inventory_num
)
where (prev_code is null or code != prev_code)
or (next_code is null or code != next_code)
order by 1,2,3
)
order by 1,3,2;
INVENTORY_GROUP_ID C START_NUM END_NUM
------------------ - ---------- ----------
100003894 E 211 213
100003894 I 214
100003894 E 515 519
100003894 C 520 521
100003894 E 522
100003894 I 523
100003894 E 524 527
The inner query selects from the table, and uses the lead
and lag
analytic functions to find the code either side of each row.
The next query out excludes any rows that have the same code as both the next and previous rows; that is, anything that's in the middle of an uninterrupted chain. That means each chain is collapsed to at most two rows, with the start and end num
values for that chain.
The case
statement in the outer query make both rows for each chain look the same, by using lead
and lag
again; if there is only one row (e.g. for 214
) the first when
clause of the second case
makes the end value null
, which you said you wanted. The distinct
then removes the duplicates the case
created.
I suggest you run each level of the query separately to see what it's doing, and understand what it's doing to the previous one.
As I said, this has the same potential problem if I introduce a row with code='I'
and num=216
:
INVENTORY_GROUP_ID C START_NUM END_NUM
------------------ - ---------- ----------
100003894 E 211 213
100003894 I 214 216
...
This can be split into two chains by adapting this method, but it's a little more complicated as you have to track and compare the num
values as well as the code
values:
select distinct inventory_group_id, code,
case
when prev_num is null then num
when prev_code = code then lag(num) over (order by rn)
else num
end as start_num,
case
when next_code != code and prev_code != code then null
when next_code is null then num
when next_num is null then null
when next_code = code then lead(num) over (order by rn)
else num
end as end_num
from (
select inventory_group_id, num, code, prev_code, next_code,
case
when prev_num != num - 1 then null
else prev_num
end as prev_num,
case
when next_num != num + 1 then null
else next_num
end as next_num,
rownum as rn
from (
select inventory_group_id, num, code,
lag(code) over (partition by inventory_group_id
order by num) as prev_code,
lead(code) over (partition by inventory_group_id
order by num) as next_code,
lag(num) over (partition by inventory_group_id
order by num) as prev_num,
lead(num) over (partition by inventory_group_id
order by num) as next_num
from inventory_num
)
where (prev_code is null or code != prev_code)
or (next_code is null or code != next_code)
or (prev_num is null or num != prev_num + 1)
or (next_num is null or num != next_num - 1)
order by 1,2,3
)
order by 1,3,2;
INVENTORY_GROUP_ID C START_NUM END_NUM
------------------ - ---------- ----------
100003894 E 211 213
100003894 I 214
100003894 I 216
100003894 E 515 519
100003894 C 520 521
100003894 E 522
100003894 I 523
100003894 E 524 527
Upvotes: 1
Reputation: 24134
I've tested this under MS Sql server and I bet it will work under oracle too:
select max(inventory_group_id) inventory_group_id,max(code) Code ,min(num) minNum,max(num) maxNum
from
(
select inventory_group_id,inventory_num_id,code,num,
(select min(num) from DATA where num>
(select max(num) from DATA where DATA.num<a.num and code<>a.code)
) as minNum
from DATA a
) A
group by minNum
order by 3
Upvotes: 1
Reputation: 26
Ann, be careful on the dark side of sql. There are more than one way to do it. Here is the answer:
SELECT a.inventory_group_id,
a.code,
a.num AS "start",
decode(b.num,a.num,null,b.num) AS "end" FROM
( SELECT inventory_num_id,inventory_group_id,code,num
, ROW_NUMBER() OVER (PARTITION BY inventory_group_id,code ORDER BY num) AS rn
FROM inventory_num a
WHERE NOT EXISTS
( SELECT *
FROM inventory_num prev
WHERE prev.inventory_group_id = a.inventory_group_id
and PREV.CODE = a.code
AND prev.num = a.num - 1
)
) a
JOIN
( SELECT inventory_num_id,inventory_group_id,code, num
, ROW_NUMBER() OVER (PARTITION BY inventory_group_id,code ORDER BY num) AS rn
FROM inventory_num a
WHERE NOT EXISTS
( SELECT *
FROM inventory_num next
WHERE next.inventory_group_id = a.inventory_group_id
and next.CODE = a.code
AND next.num = a.num + 1
)
) b
ON b.inventory_group_id = a.inventory_group_id and b.code = a.code
AND b.rn = a.rn
order by 3;
Upvotes: 1