BadHorsie
BadHorsie

Reputation: 14544

MySQL - Update rows where value in column is the same as previous row

I'm trying to clean up some incorrect data:

id   |   name
-------------
1    |   C
2    |   A
3    |   A
4    |   B
5    |   B
6    |   B
7    |   B
8    |   X
9    |   X
10   |   A   
11   |   A      
12   |   A
13   |   X
14   |   X
15   |   B
16   |   C
17   |   C
18   |   X
19   |   A
20   |   A

What has happened is when the data has been entered, if the name field was NULL, the value from the previous iteration of a loop has not been cleared so it has been entered into the next row.

The data should look like this:

id   |   name
-------------
1    |   C
2    |   A
3    |   NULL
4    |   B
5    |   NULL
6    |   NULL
7    |   NULL
8    |   X
9    |   NULL
10   |   A   
11   |   NULL     
12   |   NULL
13   |   X
14   |   NULL
15   |   B
16   |   C
17   |   NULL
18   |   X
19   |   A
20   |   NULL

Is there a way I can update the entire table in one swoop by setting all duplicates like this to NULL, while preserving the values where the column had an intended value?

Upvotes: 1

Views: 215

Answers (3)

Mihai
Mihai

Reputation: 26784

User variables to mimic a row number per each name.Tested on my machine

SELECT @var:=name,@no:=0 FROM t ORDER BY id;
UPDATE t join
(select ID,NAME,(CASE WHEN NAME=@var THEN @no:=@no+1 ELSE @no:=1 AND @var:=NAME END) BLAH
from T
order by ID)X
on T.ID=X.ID
SET T.NAME= NULL
where X.BLAH<>0

Upvotes: 0

cn0047
cn0047

Reputation: 17061

We need to join our duplicates into chain, it easy to reach by:

select *
from updateTable t1
left join updateTable t2 on t1.name = t2.name and t1.id+1 = t2.id
;
+----+------+------+------+
| id | name | id   | name |
+----+------+------+------+
|  1 | C    | NULL | NULL |
|  2 | A    |    3 | A    |
|  3 | A    | NULL | NULL |
|  4 | B    |    5 | B    |
|  5 | B    |    6 | B    |
|  6 | B    |    7 | B    |
|  7 | B    | NULL | NULL |
|  8 | X    |    9 | X    |
|  9 | X    | NULL | NULL |
| 10 | A    |   11 | A    |
| 11 | A    |   12 | A    |
| 12 | A    | NULL | NULL |
| 13 | X    |   14 | X    |
| 14 | X    | NULL | NULL |
| 15 | B    | NULL | NULL |
| 16 | C    |   17 | C    |
| 17 | C    | NULL | NULL |
| 18 | X    | NULL | NULL |
| 19 | A    |   20 | A    |
| 20 | A    | NULL | NULL |
+----+------+------+------+

Now we know our ids that should be updated.
But we cannot run:

update updateTable set name = null where id in (
    select t2.id
    from updateTable t1
    left join updateTable t2 on t1.name = t2.name and t1.id+1 = t2.id
    where t2.id is not null
);

because we will receive error:

ERROR 1093 (HY000): You can't specify target table 'updateTable' for update in FROM clause

but we can avoid this bug by using temporary table for ids:

create temporary table updateTableTmp (
    id int,
    primary key (id)
) engine=innodb;

insert into updateTableTmp
select t2.id
from updateTable t1
left join updateTable t2 on t1.name = t2.name and t1.id+1 = t2.id
where t2.id is not null
;

update updateTable set name = null where id in (
    select id from updateTableTmp
);

select * from updateTable;
+----+------+
| id | name |
+----+------+
|  1 | C    |
|  2 | A    |
|  3 | NULL |
|  4 | B    |
|  5 | NULL |
|  6 | NULL |
|  7 | NULL |
|  8 | X    |
|  9 | NULL |
| 10 | A    |
| 11 | NULL |
| 12 | NULL |
| 13 | X    |
| 14 | NULL |
| 15 | B    |
| 16 | C    |
| 17 | NULL |
| 18 | X    |
| 19 | A    |
| 20 | NULL |
+----+------+

Upvotes: 1

Daksh Mehta
Daksh Mehta

Reputation: 124

I have very simple and straight forward solution.

Just get the first record of each, ignore it in update. The query will looks like this:

UPDATE table SET name = NULL WHERE id NOT IN ( SELECT id FROM table WHERE name = 'A' LIMIT 1 )

Upvotes: 0

Related Questions