Reputation: 31
I am trying to pair left right parts in a table. The table contains columns Unique Part IDs, Identifiers, and Left/Right Indicator. I received an answer for a similar question assuming the left part always come before the right but I have found that this is not the case.
Currently the only way to identify a pair is if they have the same identifier and the matching part is one part before or one part after, if a right hand part has the same identifier as the left hand subsequent part then that is the correct pair but if the left hand part has the same identifier as the right subsequent part then that is the correct pair. For example (Currently ID, Identifier, are VARCHAR2)
(ID 4 will be Left and ID 5 will be Right). However, there are many Unique IDs between each Left Right Pair that do not have a pair. I'm trying to write a query to find all Left and Right pairs within the table.
For example (currently Both ID and LR are VARCHAR2)
ID LR Identifier
1 L B15A
2 R A15C
3 L A15C
4 R A15C
5 L A15C
6 R D5A2
9 R D5A2
10 L E5A6
11 R E5A6
12 L E5A6
13 R E5A6
14 R H9S5
17 L EE5A
18 R EE5A
and I need the query to return
ID LR Identifier
2 R A15C
3 L A15C
4 R A15C
5 L A15C
10 L E5A6
11 R E5A6
12 L E5A6
13 R E5A6
17 L EE5A
18 R EE5A
The link to the previous question is Here. The problem is if I search by Left before Right (or vice versa) I will incorrectly match pairs. Therefore the query must find the first identifier in the chain and then pair the subsequent part if it exists regardless of if it is a right or left hand For example an incorrect example is:
ID LR Identifier
3 L A15C
4 R A15C
10 L E5A6
11 R E5A6
12 L E5A6
13 R E5A6
17 L EE5A
18 R EE5A
Any help is greatly appreciated! Thank you in advance.
Upvotes: 0
Views: 222
Reputation:
Here is a solution that works more generally, even if the pairs are not necessarily found right next to each other. (If that is in fact REQUIRED, if parts cannot be paired if their ID's are not consecutive, that condition can be added to the query.)
with
test_data ( id, lr, identifier ) as (
select '001', 'L', 'B15A' from dual union all
select '002', 'R', 'A15C' from dual union all
select '003', 'L', 'A15C' from dual union all
select '004', 'R', 'A15C' from dual union all
select '005', 'L', 'A15C' from dual union all
select '006', 'R', 'D5A2' from dual union all
select '009', 'R', 'D5A2' from dual union all
select '010', 'L', 'E5A6' from dual union all
select '011', 'R', 'E5A6' from dual union all
select '012', 'L', 'E5A6' from dual union all
select '013', 'R', 'E5A6' from dual union all
select '014', 'R', 'H9S5' from dual union all
select '017', 'L', 'EE5A' from dual union all
select '018', 'R', 'EE5A' from dual
)
-- end of test data, the solution (SQL query) begins below this line
select id, lr, identifier
from ( select id, lr, identifier,
row_number() over (partition by identifier, lr order by id) as rn,
least( count(case when lr = 'L' then 1 end) over (partition by identifier),
count(case when lr = 'R' then 1 end) over (partition by identifier)
) as least_count
from test_data
)
where rn <= least_count
order by id -- ORDER BY is optional
;
Output:
ID LR IDENTIFIER
--- -- ----------
002 R A15C
003 L A15C
004 R A15C
005 L A15C
010 L E5A6
011 R E5A6
012 L E5A6
013 R E5A6
017 L EE5A
018 R EE5A
10 rows selected
Explanation: In the inner query, I add two more columns to the initial data. One, rn
, counts separately (starting from 1 and incrementing by 1) for each identifier, separately for 'L' and for 'R'. This will be used to form the pairs. And, ct
gives the least of the total counts for 'L' and 'R' for each identifier. In the outer query, I just filter out all rows where rn > ct
- those are the rows without a pair in the initial table. What's left are the pairs.
ADDED: With the additional condition that a pair must be formed from "consecutive" rows (as measured by the id
column), this becomes a more interesting question. It's a gaps-and-islands problem (identify groups of consecutive rows with the same characteristic), but with a twist: the LR
value must be alternating within the group, rather than constant. The very efficient "tabibitosan" method can't be applied here (I think); the "start of group" method, which is more general, does work. This is what I used here. Note that in the end I leave out the very last row in a group, if the count for the group is an odd number. (We may find two, or four, or six consecutive rows that form one or two or three pairs, but not an odd number of rows with alternating LR). Note also that if two rows have the same identifier AND LR, the second row will always start a NEW group, so if it is in fact part of a pair (with the row AFTER it), that will be caught correctly by this solution.
Compare this to the MATCH_RECOGNIZE solution for Oracle 12 and above I posted separately - and appreciate how much simpler it is!
with
prep ( id, lr, identifier, flag ) as (
select id, lr, identifier,
case when identifier = lag(identifier) over (order by id)
and lr != lag(lr) over (order by id)
then null else 1 end
from test_data -- replace "test_data" with actual table name
),
with_groups ( id, lr, identifier, gp ) as (
select id, lr, identifier,
sum(flag) over (order by id)
from prep
),
with_rn ( id, lr, identifier, rn, ct ) as (
select id, lr, identifier,
row_number() over (partition by identifier, gp order by id),
count(*) over (partition by identifier, gp)
from with_groups
)
select id, lr, identifier
from with_rn
where rn < ct or mod(rn, 2) = 0
order by id -- ORDER BY is optional
;
Upvotes: 2
Reputation:
For completeness, here is an answer that uses the Oracle 12.1 and above MATCH_RECOGNIZE clause. (It will not help the OP at this time, he is on 11.2, but it may help others in the future.) It assumes the additional condition that a pair can only be formed if the rows forming the pair are consecutive in the original table.
select id, lr, identifier
from test_data -- replace "test_data" with the actual table name
match_recognize (
order by id
all rows per match
pattern ( A B )
define B as B.identifier = A.identifier and B.lr != A.lr
)
order by id -- ORDER BY is optional
;
Upvotes: 0