Reputation: 19617
I have a table that is supposed to keep a trace of visitors to a given profile (user id to user id pair). It turns out my SQL query was a bit off and is producing multiple pairs instead of single ones as intended. With hindsight I should have enforced a unique constraint on each id+id pair.
Now, how could I go about cleaning up the table? What I want to do is delete all duplicate pairs and leave just one.
So for example change this:
23515 -> 52525 date_visited
23515 -> 52525 date_visited
23515 -> 52525 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
23515 -> 52525 date_visited
...
Into this:
23515 -> 52525 date_visited
12345 -> 54321 date_visited
Update: Here is the table structure as requested:
id int(10) UNSIGNED Non Aucun AUTO_INCREMENT
profile_id int(10) UNSIGNED Non 0
visitor_id int(10) UNSIGNED Non 0
date_visited timestamp Non CURRENT_TIMESTAMP
Upvotes: 51
Views: 58256
Reputation: 1
None of the above answers works..... the requirement is to delete all the duplicates and except one from each set.. but based on more than one column data..
try this,
SET SERVEROUTPUT ON
declare
name integer := 1;
begin
for test in (
select tablecolumn1 , tablecolumn2, tablecolumn3, count(1) from
yourtable group by tablecolumn1, tablecolumn2, tablecolumn3
having count(1) > 1
)
loop
if name <= 1000 then -- for incremental update
name := name+1;
delete from test b where
b.tablecolumn1 = test.tablecolumn1
and b.tablecolumn2 = test.tablecolumn2
and b.tablecolumn3 = test.tablecolumn3
and rownum = 1;
end if;
end loop;
DBMS_OUTPUT.PUT_LINE(name);
end;
Upvotes: 0
Reputation: 1
If you are using SQL you can manually delete the duplicate rows keeping one entry just follow this procedure:
It's a long procedure but you can see the results immediately in real-time.
Hope this solution worked for you!!
Upvotes: -4
Reputation: 19617
Here's Frank Schmitt's solution with a small workaround utilizing a temporary table to allow his solution to work on MySQL:
delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
)
Upvotes: 17
Reputation: 30775
ANSI SQL Solution
Use group by in a subquery:
delete from my_tab where id not in
(select min(id) from my_tab group by profile_id, visitor_id);
You need some kind of unique identifier(here, I'm using id).
MySQL Solution
As pointed out by @JamesPoulson, this causes a syntax error in MySQL; the correct solution is (as shown in James' answer):
delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);
Upvotes: 86
Reputation: 171
This will work:
With NewCTE
AS
(
Select *, Row_number() over(partition by ID order by ID)as RowNumber from
table_name
)
Delete from NewCTE where RowNumber > 1
Upvotes: 17
Reputation: 1156
Select all unique rows
Copy them to a new temp table
Truncate original table
Copy temp table data to original table
That's what I'd do. I'm not sure if there's 1 query that would do all this for you.
Upvotes: 3