Sebastianb
Sebastianb

Reputation: 2060

Limit ammount of rows deleted in table without PK

I'm working on a system where, for some reason, someone created a "categories" table without PK'ing the category's ID. I just found out that now the table has a duplicate of all the rows and is causing issues with some of my queries.

My question is: Is there a way to delete duplicate rows, leaving the 'originals' in the table, without relying on indexes?

Here's the SELECT * FROM categories:

+-----------+-------------+
| categoria | descripcion |
+-----------+-------------+
|         1 | Int.Cons.   |
|         2 | Delegado    |
|         3 | Personal    |
|         4 | Comun       |
|         5 | Proveedor   |
|         6 | Menor/Inc   |
|        11 | N/Categoria |
|         1 | Int.Cons.   |
|         2 | Delegado    |
|         3 | Personal    |
|         4 | Comun       |
|         5 | Proveedor   |
|         6 | Menor/Inc   |
|        11 | N/Categoria |
+-----------+-------------+

There's no index defined on the table.

Upvotes: 0

Views: 34

Answers (4)

Gordon Linoff
Gordon Linoff

Reputation: 1271241

You can delete the records in a couple of ways. I would probably recommend using a temporary table and then re-populating the table correctly:

create temporary table temp_categories as 
    select c.id, c.descripcion
    from categories c
    group by c.id;

truncate table categories;

insert into categories(id, descripcion)
    select id, descripcion
    from temp_categories;


alter table add constraint primary key (id);
alter table add constraint unq_categories_descripcion unique (descripcion);

Upvotes: 1

Uueerdo
Uueerdo

Reputation: 15971

If/since creating a new table is not an option, you can run something like this; but you'll need to run it repeatedly until it all duplicates are gone.

DELETE FROM categories 
WHERE (categoria, descripcion) IN (
   SELECT categoria, descripcion 
   FROM categories 
   GROUP BY categoria, descripcion 
   HAVING COUNT(*) > 1
)
LIMIT 1
;

Edit: actually, this should have the benefit of the temp table solution without the risk (since true temp tables are session/connection based, if your connection fails you would lose the data).

INSERT INTO categories (categoria, descripcion)
SELECT DISTINCT categoria, CONCAT('NEWCOPY---', descripcion)
FROM categories
;

DELETE FROM categories
WHERE descripcion NOT LIKE 'NEWCOPY---%'
;

UPDATE categories
SET descripcion = REPLACE(descripcion, 'NEWCOPY---', '')
;

Obviously, 'NEWCOPY---' can/should be replaced with a prefix you KNOW is not present in the table.

Upvotes: 1

user1599615
user1599615

Reputation:

test this

 delete a from yourtable a1 , yourtable a2 where a1.categoria=a2.categoria

Upvotes: 0

Norbert
Norbert

Reputation: 6084

In this example you gave you can do this pretty easy:

 CREATE TABLE xyz AS SELECT DISTINCT FROM categories;
 DROP TABLE categories;
 ALTER TABLE xyz RENAME categories;
 CREATE UNIQUE INDEX idx_nn_c_1 ON categories(id);

(Check the SQL before running else you will loose a table at the DROP line)

Upvotes: 0

Related Questions