Tom J Muthirenthi
Tom J Muthirenthi

Reputation: 3340

Duplicate rows oracle based on Column

I want to duplicate a row based on the Parameter(Parameter is 999 by default) Column and ID. For example in the below sample, we have a threshold value 999, If an ID has a row with ParamComp = 999 and another row with ParamComp <>999, then for the row with ParamComp <>999 we must create a new record with the ColVal of ParamComp = 999.

If an ID has rows with only ParamComp = 999, Just load it to the target directly (No duplication logic is needed).

Also If an ID has rows with only ParamComp <> 999, Just load it to the target directly (No duplication logic is needed)

Input Data

id  ParamComp   ColVal 
1   999         a
1   80          b
2   999         c
3   85          d

Target Data

id  ParamComp   ColVal  
1   999         a
1   80          b
1   80          a
2   999         c
3   85          d

Upvotes: 1

Views: 164

Answers (2)

Boneist
Boneist

Reputation: 23578

An alternative to Gordon's answer (which may or may not be faster) is to do a partial cross join on a two-row dummy "table", like so:

WITH your_table AS (SELECT 1 ID, 999 paramcomp, 'a' colval FROM dual UNION ALL
                    SELECT 1 ID, 80 paramcomp, 'b' colval FROM dual UNION ALL
                    SELECT 2 ID, 999 paramcomp, 'c' colval FROM dual UNION ALL
                    SELECT 3 ID, 85 paramcomp, 'd' colval FROM dual UNION ALL
                    SELECT 4 ID, 999 paramcomp, 'e' colval FROM dual UNION ALL
                    SELECT 4 ID, 75 paramcomp, 'f' colval FROM dual UNION ALL
                    SELECT 4 ID, 70 paramcomp, 'g' colval FROM dual)
-- end of mimicking your table; see SQL below:
SELECT yt.ID,
       yt.paramcomp,
       case WHEN dummy.id = 1 THEN yt.colval
            WHEN dummy.id = 2 THEN yt.paramcomp_999_colval
       END colval
FROM   (SELECT ID,
               paramcomp,
               colval,
               MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER (PARTITION BY ID) paramcomp_999_colval
        FROM   your_table) yt
       INNER JOIN (SELECT 1 ID FROM dual UNION ALL
                   SELECT 2 ID FROM dual) dummy ON dummy.id = 1 -- ensures every yt row is returned
                                                   OR (dummy.id = 2
                                                       AND paramcomp_999_colval IS NOT NULL
                                                       AND yt.paramcomp != 999) -- returns an extra row if the 999 paramcomp row exists but the current row isn't 999
ORDER BY yt.ID, yt.paramcomp DESC, yt.colval;

        ID  PARAMCOMP COLVAL
---------- ---------- ------
         1        999 a
         1         80 b
         1         80 a
         2        999 c
         3         85 d
         4        999 e
         4         75 e
         4         75 f
         4         70 g
         4         70 e

This assumes that there is only ever one 999 paramcomp row per id (e.g. a unique constraint on (id, paramcomp) exists).

You'd have to test this and Gordon's answer to see which is most performant against your data.


ETA: here's a fixed version of Gordon's answer for you to compare with:

select id, paramcomp, colval
from your_table
union all
select id, paramcomp, paramcomp_999_colval colval
from (select yt.*, MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER (PARTITION BY ID) paramcomp_999_colval
      from your_table yt
     ) t
where paramcomp_999_colval IS NOT NULL and paramcomp <> 999
ORDER BY ID, paramcomp DESC, colval;

ETA2: More explanation of the use of the dummy table:

If you wanted to duplicate all rows in your table, you would do a cross join to a table/subquery that has two rows, like so:

SELECT *
FROM   your_table yt
CROSS JOIN (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual) dummy;

        ID  PARAMCOMP COLVAL         ID
---------- ---------- ------ ----------
         1        999 a               1
         1         80 b               1
         2        999 c               1
         3         85 d               1
         4        999 e               1
         4         75 f               1
         4         70 g               1
         1        999 a               2
         1         80 b               2
         2        999 c               2
         3         85 d               2
         4        999 e               2
         4         75 f               2
         4         70 g               2

However, you don't always want the duplicate row to appear, so you need to do an inner join that's a bit selective. I'll break down the inner join in my initial answer so you can hopefully see what it's doing a bit better.

First, here's the part of the join that ensures that each row in your_table is returned:

SELECT *
FROM   your_table yt
INNER JOIN (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual) dummy ON dummy.id = 1;

        ID  PARAMCOMP COLVAL         ID
---------- ---------- ------ ----------
         1        999 a               1
         1         80 b               1
         2        999 c               1
         3         85 d               1
         4        999 e               1
         4         75 f               1
         4         70 g               1

Next, here's the part of the join that ensures the selective joining

SELECT *
FROM   your_table yt
INNER JOIN (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual) dummy ON dummy.id = 2
                                            AND yt.paramcomp != 999;

        ID  PARAMCOMP COLVAL         ID
---------- ---------- ------ ----------
         1         80 b               2
         3         85 d               2
         4         75 f               2
         4         70 g               2

You can see with this second part that we still get the id = 3 row, which we don't want. So, in my final answer above, I found out what the colval of the paramcomp = 999 row was and returned that for all rows, using a conditional max analytic function. Then, I added that into the 2nd join condition part to only return rows that had a 999 colval (if they don't have a value, then we assume that the 999 row doesn't exist). This does assume that the colval will always be present for the 999 row.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269583

If I assume that 999 is the maximum value for paramcomp, then an analytic function and union all can solve the problem. Following the rules you specify in the text this would be:

select id, paramcomp, colval
from t
union all
select id, 999 as paramcomp, colval
from (select t.*, max(paramcomp) over (partition by id) as max_paramcomp
      from t
     ) t
where max_paramcomp = 999 and paramcomp <> 999;

This is easily modified for a simple variant of the rules.

Upvotes: 0

Related Questions