Reputation: 1078
I am not a databases guy,but I have been given the "fun" job of cleaning up someone else's database. We have many duplicate record in our databases and some of customers are getting double or triple billed every month.
Given the following Database example :
Table: Customers
ID Name Phone DoNotBill
1 Acme Inc 5125551212 No
2 ABC LLC 7138221661 No
3 Big Inc 4132229807 No
4 Acme 5125551212 No
5 Tree Top 2127657654 No
Is it possible to write a query that Identifies the all duplicate phone numbers (in this case records 1 and 4) and then marks and duplicate records yes by updating the DoNotBill column. But leaves the first record unmarked.
In this example case we would be left with:
ID Name Phone DoNotBill
1 Acme Inc 5125551212 No
2 ABC LLC 7138221661 No
3 Big Inc 4132229807 No
4 Acme 5125551212 Yes
5 Tree Top 2127657654 No
Upvotes: 1
Views: 61
Reputation: 413
UPDATE customers c SET c.DoNotBill="Yes";
UPDATE customers c
JOIN (
SELECT MIN( ID ) ID, Phone
FROM customers
GROUP BY Phone
) u ON c.ID = u.ID AND c.Phone = u.Phone
SET c.DoNotBill="No";
That way not only duplicates are eliminated, but all multiple entries are dealt with.
Upvotes: -1
Reputation: 25842
something like this?
UPDATE
customers cust,
(SELECT
c1.ID,
c1.name,
c1.phone,
c1.DoNotBill
FROM customers c
LEFT JOIN
(SELECT
cc.ID
FROM customers cc
) as c1 on c1.phone = c.phone
) dup
SET cust.DoNotBill = 'Yes' WHERE cust.id=dup.id ;
Upvotes: 1
Reputation: 6082
This depends on ur data amount You can do it in steps and make use some tools like excel...
This qrt
SELECT a.id,b.id,a.phone FROM clients a , clients b WHERE
A.phone =b.phone
And a.id!=b.id
The result is all duplicated records. Add
Group by a.phone
And u will get 1 record for each 2 duplicates. if you like the records and they are whT u need. ChNge select to select a.id and Use this qry as subqry to an update sql statement
UPDATE clients SET billing='no' WHERE id IN ( sql goes here)
Upvotes: 0
Reputation: 605
To begin with I assume that the DoNotBill column only has two possible values; yes and no. In that case it should be bool instead of varchar, meaning it would be either true or false.
Furthermore I don't get the meaning of the DoNotBill column. Why wouldn't you just use something like this?
select distinct phone from customers
That would give you the phone numbers without duplicates and without the need for an extra column.
Upvotes: 1