CCP
CCP

Reputation: 129

Over Partition to find duplicates and remove them based on criteria SQL

I hope everyone is doing well. I have a dilemma that i can not quite figure out. I am trying to find a unique value for a field that is not a duplicate.

For example:

Table 1

|Col1  | Col2| Col3 |     
| 123  |  A  |  1   |     
| 123  |  A  |  2   |     
|  12  |  B  |  1   |     
|  12  |  B  |  2   |     
|  12  |  C  |  3   |     
|  12  |  D  |  4   |     
|  1   |  A  |  1   |     
|  2   |  D  |  1   |      
|  3   |  D  |  1   |  

Col 1 is the field that would have the duplicate values. Col2 would be the owner of the value in Col 1. Col 3 uses the row number() Over Partition syntax to get the numbers in ascending order.

The goal i am trying to accomplish is to remove the value in col 1 if it is not truly unique when looking at col2.

Example:

Col1 has the value 123, Col2 has the value A. Although there are two instances of 123 being owned by A, i can determine that it is indeed unique.

Now look at Col1 that has the value 12 with values in Col2 of B,C,D.

Value 12 is associated with three different owners thus eliminating 12 from our result list.

So in the end i would like to see a result table such as this :

|Col1  | Col2| 
| 123  |  A  | 
|  1   |  A  |
|  2   |  D  | 
|  3   |  D  |

To summarize, i would like to first use the partition numbers to identify if the value in col1 is repeated. From there i want to verify that the values in col 2 are the same. If so the value in col 1 and col 2 remains as one single entry. However if the values in col 2 do not match, all records for the col1 value are removed.

I will provide the syntax code for my query if needed.

Update**

I failed to mention that table 1 is the result of inner joining two tables.

So Col1 comes from table a and Col2 comes from table b.

The values in table a for col2 are hard to interpret so i had to make sense of them and assigned it proper name values.

The join query i used to combine the two are:

Select a.Col1, B.Col2 FROM Table a INNER JOIN Table b on a.Colx = b.Colx

Update**

Table a:

|Col1  | Colx| Col3 |     
| 123  | SMS |  1   |     
| 123  | S9W |  2   |     
|  12  | NAV |  1   |     
|  12  | NFR |  2   |     
|  12  | ABC |  3   |     
|  12  | DEF |  4   |     
|  1   | SMS |  1   |     
|  2   | DEF |  1   |      
|  3   | DES |  1   |  

Table b:

|Colx  | Col2| 
| SMS  |  A  | 
| S9W  |  A  |
| DEF  |  D  | 
| DES  |  D  |
| NAV  |  B  |
| NFR  |  B  |
| ABC  |  C  |

Above are sample data for both tables that get joined in order to create the first table displayed in this body.

Thank you all so much!

Upvotes: 2

Views: 192

Answers (2)

krokodilko
krokodilko

Reputation: 36107

NOT EXISTS operator can be used to do this task:

SELECT distinct Col1 ,  Col2
FROM table t
WHERE NOT EXISTS(
  SELECT 1 FROM table t1
  WHERE t.col1=t1.col1 AND t.col2 <> t1.col2
)

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269973

If I understand correctly, you want:

select col1, min(col2)
from t
group by col1
where min(col2) <> max(col2);

I think the third column is confusing you. It doesn't seem to play any role in the logic you want.

Upvotes: 0

Related Questions