PepperoniPizza
PepperoniPizza

Reputation: 9102

Postgres lowercase column and delete duplicates

I have the following table:

Customers
---------
name          text
object_id     integer
created_time  timestamp with time zone 

Indexes:
    "my_index" UNIQUE CONSTRAINT, btree (name, object_id, created_time)

The unique index works fine but then I ended up with duplicate data like:

Name  |  object_id  |  created_time
------------------------------------
john  | 1           | 2018-02-28 15:42:14.30573+00
JOHN  | 1           | 2018-02-28 15:42:14.30573+00

So I tried to lowercase all my data in the name column with:

UPDATE customers SET name=lower(name) WHERE name != LOWER(name);

But this procedure generated an error because now I would be violating the index:

ERROR:  duplicate key value violates unique constraint "my_index"
DETAIL:  Key (name, object_id, created_time)=(john, 1, 2018-02-28 15:42:14.30573+00) already exists.

What kind of procedure could I use to delete rows that after casting to lowercase generate an index violation ?

Upvotes: 2

Views: 429

Answers (1)

Jasen
Jasen

Reputation: 12392

If you have 'JOHN' and 'John' in you table but not 'john' it gets messy. here's one solution.

insert into customers
  select distinct lower("name") ,object_id,created_time from customers
    where name <> lower(name)
      and not (lower("name") ,object_id,created_time)
       in (select * from customers);

delete from customers where name <> lower(name);

after that consider:

alter table customers alter column name type citext;

Upvotes: 3

Related Questions