Reputation: 471
I have a query to check the duplicates from a table:-
(SELECT assignment_name
FROM (SELECT xx.supervisor_stg.*,
Row_number()
over (
PARTITION BY assignment_name
ORDER BY assignment_name) AS c
FROM xx.supervisor_stg)
WHERE c > 1)
I want to delete the duplicate data from these data but such that if one value like assignment_name 'Smith' is written thrice then 1 value should remain and the other 2 should be deleted.
When i am using the following query i am getting an error :
Delete From supervisor_stg
where asg_name <>(SELECT Asg_Name
FROM (SELECT xx.supervisor_stg.*,
Row_number()
over (
PARTITION BY assignment_name
ORDER BY assignment_name) AS c
FROM xx.supervisor_stg)
WHERE c > 1)
SQL Error: ORA-01427: single-row subquery returns more than one row
01427. 00000 - "single-row subquery returns more than one row"
*Cause:
*Action:
Upvotes: 0
Views: 125
Reputation: 94939
In Oracle you can delete from a subquery, e.g. delete from (select * from emp where empno < 10)
. Well, the example is stupid, but you get the idea. In your case this would be:
delete from
(
select *
from
(
select
s.*,
row_number() over (partition by assignment_name order by assignment_name) as rn
from xx.supervisor_stg s
)
where rn > 1
);
Although this should work, it may happen that Oracle doesn't except this for some reason. Update: I've tried this in Oracle 11.2 and Oracle gives me ORA-01752: "cannot delete from view without exactly one key-preserved table". So although we are selecting from one table only, Oracle gets confused somehow. Read on and pick another solution :-)
An alternative would be to delete every record for which doesn't exist another entry with the same assignment_name and a lower (or higher) rowid.
delete from xx.supervisor_stg s
where not exists
(
select *
from xx.supervisor_stg other
where other.assignment_name = s.assignment_name
and other.rowid < s.rowid
);
Instead of a correlated subquery with NOT EXISTS
you can use a non-correlated one with NOT IN
, where you have all minimum (or maximum) rowids in a set and you delete all others.
delete from xx.supervisor_stg
where rowid not in
(
select min(rowid)
from xx.supervisor_stg
group by assignment_name
);
As to your own delete statement:
<>
with NOT IN
for this reason.<>
makes no sense in combination with WHERE c > 1
, because the latter gives you the records to delete not those to keep. Either make this IN
(rather than NOT IN
) or make it WHERE c = 1
.You'd get:
Delete From supervisor_stg
Where rowid In
(
SELECT rowid
FROM
(
SELECT
xx.supervisor_stg.*,
Row_number() over (PARTITION BY assignment_name ORDER BY assignment_name) AS c
FROM xx.supervisor_stg
)
WHERE c > 1
);
which is overly complicated. As shown you can use simple aggregation (min(rowid) / group by assignment_name
). You'd use a window function such as row_number()
to avoid to have to read a table twice like in my first query. In your query you are still reading the table twice, once to find the records to keep, and then another time to find the records to delete.
Upvotes: 2
Reputation: 1269973
In Oracle, you can use rowid
for this purpose:
delete from xx.SUPERVISOR_STG
where rowid <> (select min(s2.rowid)
from xx.SUPERVISOR_STG
where s2.assignment_name = SUPERVISOR_STG.assignment_name
);
Upvotes: 0
Reputation: 3299
;WITH numbered AS (
SELECT ROW_NUMBER() OVER(PARTITION BY [dupe-column-list] ORDER BY [dupe-column-list]) AS _dupe_num
FROM [table-name]
WHERE 1=1 -- any where clause if required
)
DELETE FROM numbered WHERE _dupe_num > 1;
This query will assign a row number to your table, based on the [dupe-column-list] fields you'll add there. You can provide an order for those records. The delete statement will remove all records showing up more than once for that [dupe-column-list] (except the first occurrance).
Edit: just noticed this is oracle. Not sure if the above will work for you then. The above is MSSQL.
Upvotes: 0