Reputation: 1727
I have a table with 692256 number of rows ,which has data similar to below
customer_type_id data
1 G.M (1)
1 GM (1)
1 FORD (K.G)
1 FORD(K.G)
1 Honda
2 GM (1)
2 OTHER
2 OTHER2
in the above case I have to remove duplicates based on customer_type_id and data columns. and the duplicates are not exactly duplicates . For example first two records are duplicates and the next two records are also duplicates . There could be some space or period or braces in data . So To get the duplicates, I have to run the below functions on data column
trim( replace(replace(replace(replace(replace(data,'.',''),'(',''),')',''),' ','') ,' ','') )
Now I want to remove one of the rows from the above case and keep the formatting of the other row .
In the above case I want to get
customer_type_id Data
1 G.M (1)
1 FORD(K.G)
1 Honda
2 GM (1)
2 OTHER
2 OTHER2
In fact any record of the duplicate value will be fine . But should not be like below ( in the below case data column value spaces and braces are removed from the existing value)
customer_type_id Data
1 GM1
1 FORDKG
1 Honda
2 GM (1)
2 OTHER
2 OTHER2
any help on this ?
Thanks a lot
Regards
Kiran
Upvotes: 1
Views: 160
Reputation: 7116
Add an AutoIncrement Key "ID"
delete
from table
where table.ID NOT IN (
select MAX(ID)
from table
group by customer_id, your_function(data) )
I guess it should work, please check/confirm syntax, I am not sure about that. Idea is to delete all the records except one distinct combination.
Upvotes: 1
Reputation: 3922
Maybe you could use LIKE in your query.
SELECT * FROM table WHERE data LIKE "%G%M%1%"
Where the % stands for litterally anything (nothing, a space, a character, etc).
I'm not sure this will be reliable though, as the above would also match "Grand Master 7331", but if you're sure those kinds of collisions would not happen it might work.
Upvotes: 0