Reputation: 1
I have a table which has multiple values in a particular column, I am trying to write two sql query which will identify the entire set of duplicates and entire set of unique records, I wrote few queries but in the unique set I am getting one of the record from the duplicate set.
sample data,
pay_id, pay_ratio, pay_type, cor_id
123, 12, C, Annual
123, 12, C, Annual
456, 13, A, Semi-Annual
476, 43, B, Monthly
987, 32, H, Daily
987, 32, H, Daily
I am trying to separate the above data set as below.
Unique Data Set
pay_id, pay_ratio, pay_type, cor_id
456, 13, A, Semi-Annual
476, 43, B, Monthly
Duplicate Data set
pay_id, pay_ratio, pay_type, cor_id
123, 12, C, Annual
123, 12, C, Annual
987, 32, H, Daily
987, 32, H, Daily
can someone suggest me how can I achieve this using sql query.
Regards, Sai
Upvotes: 0
Views: 289
Reputation: 362
Group and count:
SELECT
pay_id, pay_ratio, pay_type, cor_id,
COUNT(*) AS [Duplicates]
FROM table
GROUP BY pay_id, pay_ratio, pay_type, cor_id
HAVING COUNT(*) = 0 /* unique; > 0 - duplicates */
ORDER BY COUNT(*) ASC
Upvotes: 0
Reputation: 38023
using a common table expression with row_number()
Unique:
;with cte as (
select *
, rn = row_number() over (partition by pay_id order by pay_type)
from t
)
select *
from cte
where not exists (
select 1
from cte i
where i.pay_id = cte.pay_id
and i.rn > 1
)
rextester demo: http://rextester.com/LHIGTY81886
returns:
+--------+-----------+----------+-------------+----+
| pay_id | pay_ratio | pay_type | cor_id | rn |
+--------+-----------+----------+-------------+----+
| 456 | 13 | A | Semi-Annual | 1 |
| 476 | 43 | B | Monthly | 1 |
+--------+-----------+----------+-------------+----+
Duplicates:
;with cte as (
select *
, rn = row_number() over (partition by pay_id order by pay_type)
from t
)
select *
from cte
where exists (
select 1
from cte i
where i.pay_id = cte.pay_id
and i.rn > 1
)
returns:
+--------+-----------+----------+--------+----+
| pay_id | pay_ratio | pay_type | cor_id | rn |
+--------+-----------+----------+--------+----+
| 123 | 12 | C | Annual | 1 |
| 123 | 12 | C | Annual | 2 |
| 987 | 32 | H | Daily | 1 |
| 987 | 32 | H | Daily | 2 |
+--------+-----------+----------+--------+----+
Upvotes: 0
Reputation: 72175
You can do this using COUNT() OVER()
:
SELECT pay_id, pay_ratio, pay_type, cor_id,
CASE
WHEN COUNT(*) OVER (PARTITION BY pay_id, pay_ratio, pay_type, cor_id) = 1
THEN 'unique'
ELSE 'dupl'
END AS type
FROM mytable
The above query returns 'unique'
for unique records and 'dupl'
for duplicates. You can wrap the query in a CTE
or subquery and filter it as you like.
Note: The above query is based on the assumption that all 4 fields of the table determine a duplicate record. You may change the PARTITION BY
clause as you like to address some other duplicate 'logic'.
Upvotes: 3