Reputation: 906
I'm not sure if I worded the title properly so I apologize. I feel this is best explained by showing my data.
Address 1 Address 2 City State AddressInfo#
-------------------------------- ------------------ ------------ ----- --------------
1 Main St #100 Burbville, CA, 99999 1 Main St #100 Burbville CA 1001
1 Main St #100 Burbville, CA, 99999 1 Main St Burbville CA 1001
1 Main St #100 Burbville, CA, 99999 1 Main st Burbville CA 1001
...
4 Old Ave Ste 401 Southtown, OH, 44444 4 Old Ave Ste 401 Southtown OH 1004
4 Old Ave Ste 401 Southtown, OH, 44444 4 Old Ave Ste 401 Southtown OH 1004
...
8 New Blvd #800 NewCity, MT, 88888 8 New Blvd #800 NewCity MT 1008
8 New Blvd #800 NewCity, MT, 88888 8 New Blvd NewCity MT 1008
8 New Blvd #800 NewCity, MT, 88888 8 New Blvd NewCity MT 1008
I would like to find a way to remove all records where Address 2 is missing the full street address or simply contains an exact duplicate like AddressInfo# 1004.
Expected Output:
Address 1 Address 2 City State AddressInfo#
-------------------------------- ------------------ ------------ ----- --------------
1 Main St #100 Burbville, CA, 99999 1 Main St #100 Burbville CA 1001
...
4 Old Ave Ste 401 Southtown, OH, 44444 4 Old Ave Ste 401 Southtown OH 1004
...
8 New Blvd #800 NewCity, MT, 88888 8 New Blvd #800 NewCity MT 1008
Upvotes: 0
Views: 97
Reputation: 12857
Some form of:
UPDATE A
SET Address2 = CASE WHEN Address1 = Address2 THEN NULL ELSE
CASE WHEN CHARINDEX(',',Address2,CHARINDEX(',',Address2)) = 0 THEN NULL ELSE Address2 END
END
FROM Address AS A
Upvotes: 0
Reputation: 21
SELECT DISTINCT
Address1
, Address2
, [AddressInfo#]
, City
, State
-- + any other fields
FROM dbo.Table1 AS t
WHERE NOT EXISTS (
SELECT *
FROM dbo.Table1 AS x
WHERE x.Address1 = t.Address1
-- + any other criteria for "uniqueness"
AND LEFT( x.Address2, LEN( t.Address2 ) ) = t.Address2
AND LEN( x.Address2 ) > LEN( t.Address2 )
);
This query will first get all the rows where there is not another row with the same Address1 and an Address2 matching the current value up to the length of the field, but at least one character longer. The DISTINCT is then applied to eliminate exact duplicates. (This assumes no NULL values.)
A similar query could use the LIKE operator, but this would need to account for special characters in the data, such as "%", "_", or brackets.
Upvotes: 0
Reputation: 45096
This may have syntax errors but this is a valid approach
with cte as
(
select address1, address2, city, state, ROW_NUMBER() OVER(partition by AddressInfo# order by len(address2) desc) as 'alen'
)
select * from cte
where alen = 1
Upvotes: 0
Reputation: 9201
You could rebuild your data into a new table using
select
address_1,max(address_2) as address_2, addressinfo
from
table1
group by address_1,addressinfo
http://sqlfiddle.com/#!6/3d22c/2
Edit 1:
To select city
and state
as well you need to include it as a group by
expression:
select
address_1,max(address_2) as address_2, addressinfo,
city, state
from
table1
group by address_1,addressinfo, city, state
http://sqlfiddle.com/#!6/4527c/1
Edit 2:
The max
function does deliver the longest value here as needed. This works if the shorter values are true starts of the longer values.
Here is an example of this: http://sqlfiddle.com/#!6/3fba8/1
Upvotes: 1