Lyndey
Lyndey

Reputation: 77

How to remove redundant data from SQL query

Hi: I'm pulling data from a table that has a main id code linked to a user, but each time the user changes his/her name, an extra record is added. I'm trying to pull a list of current users and any old names they might have used in the past. I'm using an outer join to pick up at least one former name and one extra former name.

Here's the query:

select
PrimaryName.PM_PDM,
PrimaryName.PM_ID,
PrimaryName.PM_AltID,
PrimaryName.PM_Change,
PrimaryName.PM_FName,
PrimaryName.PM_LName,
OldNames1.ON_PDM,
OldNames1.ON_Change,
OldNames1.ON_ID,
OldNames1.ON_AltID,
OldNames1.ON_FName,
OldNames1.ON_LName,
OldNames2.O2_PDM,
OldNames2.O2_Change,
OldNames2.O2_ID,
OldNames2.O2_AltID,
OldNames2.O2_FName,
OldNames2.O2_LName

from
(select
S_PDM as PM_PDM,
S_ID as PM_ID,
S_FIRST_NAME as PM_FName,
S_LAST_NAME as PM_LName,
S_CHANGE_IND as PM_Change,
S_SURROGATE_ID as PM_AltID
from S
WHERE S_CHANGE_IND is null) PrimaryName,

(select
S_PDM as ON_PDM,
S_ID  as ON_ID,
S_FIRST_NAME as ON_FName,
S_LAST_NAME as ON_LName,
S_CHANGE_IND as ON_Change,
S_SURROGATE_ID as ON_AltID
from S
where S_CHANGE_IND = 'N') OldNames1,

(select
S_PDM as O2_PDM,
S_ID  as O2_ID,
S_FIRST_NAME as O2_FName,
S_LAST_NAME as O2_LName,
S_CHANGE_IND as O2_Change,
S_SURROGATE_ID as O2_AltID
from S
where S_CHANGE_IND = 'N') OldNames2


where (OldNames1.ON_PDM = PrimaryName.pm_pdm)
and
  (OldNames1.ON_PDM = OldNames2.O2_PDM (+)
   and
   OldNames1.ON_AltID <> OldNames2.O2_AltID (+))

order by 2

here is a sample of my result:

PM_PDM  |PM_ID  |PM_ID2 |PM_CHANGE  |PM_FNAME   |PM_LNAME   |ON_PDM |ON_CHANGE  |ON_ID  |ON_ID2 |ON_FNAME   |ON_LNAME   |O2_PDM     |O2_CHANGE  |O2_ID2 |O2_ID  |O2_FNAME   |O2_LNAME
1111    |2222   |3333   |           |Betty      |Boop       |1111   |N          |2222   |4444   |Betty      |Smith      |1111       |N          |5555   |2222   |Betty      |Jones
1111    |2222   |3333   |           |Betty      |Boop       |1111   |N          |2222   |5555   |Betty      |Jones      |1111       |N          |4444   |2222   |Betty      |Smith

I just one line returned for the three names:

  1. Betty Boop 2.Betty Smith 3. Betty Jones

Right now, it's returning

  1. Betty Boop 2.Betty Smith 3.Betty Jones
  2. Betty Boop 2.Betty Jones 3.Betty Smith

I know it's the last join, but I'm not sure how to limit it to only one line. The query is working the way it's supposed to, but I need to edit it to only return one line.

Upvotes: 1

Views: 652

Answers (2)

Ponder Stibbons
Ponder Stibbons

Reputation: 14848

Assign numbers to each change using row_number() and join these data twice using only rows where RN = 1 and RN = 2:

with c as (select s.*, row_number() over (partition by pdm order by id desc) rn
             from s where chg = 'N')
select s.pdm, s.id, s.name, c1.id id1, c1.name name1, c2.id id2, c2.name name2
  from (select * from s where chg is null) s
  left join c c1 on c1.pdm = s.pdm and c1.rn = 1
  left join c c2 on c2.pdm = s.pdm and c2.rn = 2

Test:

with s(pdm, id, chg, name) as (select 1, 1, 'N',  'Smith' from dual union all
                               select 1, 2, 'N',  'Jones' from dual union all
                               select 1, 3, null, 'Brown' from dual),
     c as (select s.*, row_number() over (partition by pdm order by id desc) rn
             from s where chg = 'N')
select s.pdm, s.id, s.name, c1.id id1, c1.name name1, c2.id id2, c2.name name2
  from      (select * from s where chg is null) s
  left join c c1 on c1.pdm = s.pdm and c1.rn = 1
  left join c c2 on c2.pdm = s.pdm and c2.rn = 2


PDM  ID   NAME    ID1  NAME1   ID2  NAME2
---  ---  ------  ---  ------  ---  ------
  1    3  Brown     2  Jones     1  Smith

Upvotes: 1

APC
APC

Reputation: 146229

Change the last join condition to this:

 OldNames1.ON_AltID < OldNames2.O2_AltID (+)

Explanation: you have two AltID values 4444 and 5555. 4444 <> 5555 is true and so is 5555 <> 4444. Hence your existing condition ....

 OldNames1.ON_AltID <> OldNames2.O2_AltID (+)

.... produces a cross join and so you get two records. Changing the condition to join on less than eliminates the cross join because 5555 < 4444 is false.

Upvotes: 1

Related Questions