Reputation: 71
Please see below a table where I would like to create a distinct list of telephone numbers, the ranking GROUP numbers need to take priority descending 1-4. The example should result in ID 4, 9,10 (highlighted as red x) being removed as they share duplicate telephone numbers but in a lower priority GROUP.
I have managed to do this a long way round in many steps and creating a surrogate key, but I would like it if someone could show me how this could be done in a single step, for efficiency (and to sharpen up my SQL).
In SAS9 I had a coalescec function where I was able to create a new variable and instruct the priority - but I know of no such function in SQL.
https://documentation.sas.com/doc/en/vdmmlcdc/8.1/ds2ref/n0crpo0xd76wb3n1poba9wmu1a6q.htm
please see mock data in SQL below
CREATE TABLE MT_TEMP_TEL2 (
ID INT NOT NULL,
GROUP INT NOT NULL,
TEL1 VARCHAR(11),
TEL2 VARCHAR(11),
TEL3 VARCHAR(11),
TEL4 VARCHAR(11)
);
INSERT INTO MT_TEMP_TEL2
VALUES
(1,4,"79000000000","","",""),
(2,1,"12111111111","79999999999","",""),
(3,1,"","13842222222","",""),
(4,2,"","","78888888888","12111111111"),
(5,3,"","73333333333","",""),
(6,1,"15278888888","","",""),
(7,2,"","","71111111111",""),
(8,4,"","13843333333","","72222222222"),
(9,4,"","73333333333","",""),
(10,3,"75555555555","","12155555555","13842222222");
Here is the desired results as stated above, we have now removed rows 4, 9 and 10
ID | GROUP | TEL1 | TEL2 | TEL3 | TEL4 |
---|---|---|---|---|---|
1 | 4 | 79000000000 | |||
2 | 1 | 12111111111 | 79999999999 | ||
3 | 1 | 13842222222 | |||
5 | 3 | 73333333333 | |||
6 | 1 | 15278888888 | |||
7 | 2 | 71111111111 | |||
8 | 4 | 13843333333 | 72222222222 |
Upvotes: 0
Views: 73
Reputation: 32619
I had assumed with the mention of coalesce
that the goal was to have the 4 columns condensed into a single list of values, the expected results example shows now that's not the case.
There are a few ways to tackle this, one is to use window functions
to assign a sequential number to each reccuring value; to do that, the values must be combined into a single list for which you can use union
, the results of which are then partitioned
by the combined values and each duplicate gets a sequential row number. It's then a matter of keeping just those with a value of 1.
with Alltel as (
select id, [group], tel1 from MT_TEMP_TEL2 union all
select id, [group], tel2 from MT_TEMP_TEL2 union all
select id, [group], tel3 from MT_TEMP_TEL2 union all
select id, [group], tel4 from MT_TEMP_TEL2
), rn as (
select id, tel1, Row_Number() over (partition by tel1 order by [group]) n
from alltel
where tel1 !=''
)
select *
from MT_TEMP_TEL2 t
where not exists (select * from rn where rn.id=t.id and rn.n>1)
Upvotes: 1