Reputation: 14544
I'm trying to clean up some incorrect data:
id | name
-------------
1 | C
2 | A
3 | A
4 | B
5 | B
6 | B
7 | B
8 | X
9 | X
10 | A
11 | A
12 | A
13 | X
14 | X
15 | B
16 | C
17 | C
18 | X
19 | A
20 | A
What has happened is when the data has been entered, if the name
field was NULL, the value from the previous iteration of a loop has not been cleared so it has been entered into the next row.
The data should look like this:
id | name
-------------
1 | C
2 | A
3 | NULL
4 | B
5 | NULL
6 | NULL
7 | NULL
8 | X
9 | NULL
10 | A
11 | NULL
12 | NULL
13 | X
14 | NULL
15 | B
16 | C
17 | NULL
18 | X
19 | A
20 | NULL
Is there a way I can update the entire table in one swoop by setting all duplicates like this to NULL, while preserving the values where the column had an intended value?
Upvotes: 1
Views: 215
Reputation: 26784
User variables to mimic a row number per each name.Tested on my machine
SELECT @var:=name,@no:=0 FROM t ORDER BY id;
UPDATE t join
(select ID,NAME,(CASE WHEN NAME=@var THEN @no:=@no+1 ELSE @no:=1 AND @var:=NAME END) BLAH
from T
order by ID)X
on T.ID=X.ID
SET T.NAME= NULL
where X.BLAH<>0
Upvotes: 0
Reputation: 17061
We need to join our duplicates into chain, it easy to reach by:
select *
from updateTable t1
left join updateTable t2 on t1.name = t2.name and t1.id+1 = t2.id
;
+----+------+------+------+
| id | name | id | name |
+----+------+------+------+
| 1 | C | NULL | NULL |
| 2 | A | 3 | A |
| 3 | A | NULL | NULL |
| 4 | B | 5 | B |
| 5 | B | 6 | B |
| 6 | B | 7 | B |
| 7 | B | NULL | NULL |
| 8 | X | 9 | X |
| 9 | X | NULL | NULL |
| 10 | A | 11 | A |
| 11 | A | 12 | A |
| 12 | A | NULL | NULL |
| 13 | X | 14 | X |
| 14 | X | NULL | NULL |
| 15 | B | NULL | NULL |
| 16 | C | 17 | C |
| 17 | C | NULL | NULL |
| 18 | X | NULL | NULL |
| 19 | A | 20 | A |
| 20 | A | NULL | NULL |
+----+------+------+------+
Now we know our ids that should be updated.
But we cannot run:
update updateTable set name = null where id in (
select t2.id
from updateTable t1
left join updateTable t2 on t1.name = t2.name and t1.id+1 = t2.id
where t2.id is not null
);
because we will receive error:
ERROR 1093 (HY000): You can't specify target table 'updateTable' for update in FROM clause
but we can avoid this bug by using temporary table for ids:
create temporary table updateTableTmp (
id int,
primary key (id)
) engine=innodb;
insert into updateTableTmp
select t2.id
from updateTable t1
left join updateTable t2 on t1.name = t2.name and t1.id+1 = t2.id
where t2.id is not null
;
update updateTable set name = null where id in (
select id from updateTableTmp
);
select * from updateTable;
+----+------+
| id | name |
+----+------+
| 1 | C |
| 2 | A |
| 3 | NULL |
| 4 | B |
| 5 | NULL |
| 6 | NULL |
| 7 | NULL |
| 8 | X |
| 9 | NULL |
| 10 | A |
| 11 | NULL |
| 12 | NULL |
| 13 | X |
| 14 | NULL |
| 15 | B |
| 16 | C |
| 17 | NULL |
| 18 | X |
| 19 | A |
| 20 | NULL |
+----+------+
Upvotes: 1
Reputation: 124
I have very simple and straight forward solution.
Just get the first record of each, ignore it in update. The query will looks like this:
UPDATE table SET name = NULL WHERE id NOT IN ( SELECT id FROM table WHERE name = 'A' LIMIT 1 )
Upvotes: 0