Eduardo Rascon

Reputation: 703

Select records with a substring from another table

I have these two tables:

data
id | email
---+------------------
1  | [email protected]
2  | [email protected]
3  | zzzgimail.com

errors
error      | correct
-----------+-----------
@gmial.com | @gmail.com
gimail.com | @gmail.com

How can I select from data all the records with an email error? Thanks.

Upvotes: 3

Views: 4146

Answers (4)

Zachary Scott

Reputation: 21172

select * from 
(select 1 as id, '[email protected]' as email union
 select 2 as id, '[email protected]' as email union
 select 3 as id, 'zzzgimail.com' as email) data join

(select '@gmial.com' as error, '@gmail.com' as correct union
 select 'gimail.com' as error, '@gmail.com' as correct ) errors

 on data.email like '%' + error + '%' 

I think that if the pattern had no wildcard at the beginning, only somewhere after it, the query could benefit from an index. A full-text search could benefit from one too.

Upvotes: 0

AdaTheDev

Reputation: 147234

SELECT d.id, d.email
FROM data d
    INNER JOIN errors e ON d.email LIKE '%' + e.error

This would do it. However, a LIKE with a wildcard at the start of the pattern prevents an index seek on the column, so you may see poor performance.

A better approach would be to define a computed column on the data table that holds the REVERSE of the email field, and index it. The query above then becomes a LIKE condition with the wildcard at the end, like so:

SELECT d.id, d.email
FROM data d
    INNER JOIN errors e ON d.emailreversed LIKE REVERSE(e.error) + '%'

In this case, performance would be better as it would allow an index to be used.
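
The computed column this relies on could be set up roughly as follows. This is only a sketch, assuming SQL Server and an email column of bounded length (for example varchar(100)) so that it fits within the index key size limit. The column name emailreversed matches the query above; the index name IX_data_emailreversed is made up for illustration.

```sql
-- Assumption: SQL Server, and data.email is a bounded-length varchar.
-- REVERSE is deterministic, so the computed column can be persisted and indexed.
ALTER TABLE data
    ADD emailreversed AS REVERSE(email) PERSISTED;

-- Hypothetical index name; any valid name would do.
CREATE INDEX IX_data_emailreversed
    ON data (emailreversed);
```

With the index in place, the trailing-wildcard pattern REVERSE(e.error) + '%' gives the optimizer a prefix it can seek on, which the leading-wildcard version does not.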

I blogged a full write up on this approach a while ago here.

Upvotes: 1

Joe Stefanelli

Reputation: 135808

Assuming the error is always at the end of the string:

declare @data table (
    id int,
    email varchar(100)
)

insert into @data
    (id, email)
    select 1, '[email protected]' union all
    select 2, '[email protected]' union all
    select 3, 'zzzgimail.com'

declare @errors table (
    error varchar(100),
    correct varchar(100)
)

insert into @errors
    (error, correct)
    select '@gmial.com', '@gmail.com' union all
    select 'gimail.com', '@gmail.com'   

select d.id, 
       d.email, 
       isnull(replace(d.email, e.error, e.correct), d.email) as CorrectedEmail
    from @data d
        left join @errors e
            on right(d.email, LEN(e.error)) = e.error

Upvotes: 1

Dustin Laine

Reputation: 38503

Well, in reality you can't with the info you have provided.

In SQL you would need to maintain a table of "correct" domains. With that you could do a simple query to find non-matches.
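
As a rough illustration of that idea, assuming SQL Server: the valid_domains table and its domain_name column below are hypothetical and not part of the question, and the single gmail.com row is only sample content.

```sql
-- Hypothetical lookup table of known-good domains (not in the question).
CREATE TABLE valid_domains (
    domain_name varchar(100) NOT NULL PRIMARY KEY
);

INSERT INTO valid_domains (domain_name) VALUES ('gmail.com');

-- Rows in data whose email does not end in '@' + a known-good domain.
SELECT d.id, d.email
FROM data d
WHERE NOT EXISTS (
    SELECT 1
    FROM valid_domains v
    WHERE d.email LIKE '%@' + v.domain_name
);
```

Note that LIKE treats %, _ and [ in domain_name as wildcards, so a domain containing an underscore would need escaping in a real implementation.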

You could use some "non-SQL" functionality in SQL Server to do a regular expression check; however, that kind of logic does not belong in SQL (IMO).

Upvotes: 0
