Aniruddha Ghosh
Aniruddha Ghosh

Reputation: 83

Search for an email in an nvarchar column in SQL Server

I am looking to search for a particular email (e.g., [email protected]) in an nvarchar column in SQL Server. It should have been as easy as -

Select * from Discussion where Comments like '%[email protected]%'

However the above query is not covering one case. If the result set contains a Comment like "Comment from a user with email [email protected]". In this case the above query will return both records containing '[email protected]' and '[email protected]'.

I am looking for a sql query where the email address is matched exactly and the rest of text can be anything.

I am trying to search for an email in multiple tables. The column where we are searching for an email could be an xml, nvarchar(max) for comments and nvarchar(50) for just an email. We need to do perform another job on the record set.

Result set: If I am searching for '[email protected]', the result set should contain the below:

  1. lorem ipsum [email protected]
  2. [email protected] lorem ipsum
  3. Lorem ipsum [email protected] dolor sit amet

The result set should not contain something like:

  1. [email protected] dolor sit amet
  2. Lorem ipsum [email protected] dolor sit amet

Thanks in advance.

Upvotes: 0

Views: 357

Answers (4)

Aniruddha Ghosh
Aniruddha Ghosh

Reputation: 83

I came up with the below query.

declare @searchText nvarchar(50) = '[email protected]'

select * 
from Discussion 
where
    Comments = @searchText
    or (
        CHARINDEX(@searchText, Comments) > 0
        and SUBSTRING(Comments, CHARINDEX(@searchText, Comments) - 1, 1) in (' ', '<', '>', '"', '''', ',', ';', '=', '(', ')', '*') 
        and SUBSTRING(Comments, CHARINDEX(@searchText, Comments) + LEN(@searchText), 1) in (' ', '<', '>', '"', '''', ',', ';', '=', '(', ')', '*'))

I am looking for either an exact match or otherwise checking the previous and next characters to assess if the match is meeting the criteria.

Upvotes: 0

Dhrumil shah
Dhrumil shah

Reputation: 629

First of all, we need to find the email address contains in the text, so for that, we need to identify '@' so for that, we need to write a query like below

    SELECT Id,Comments as Text,        
                CASE
                    WHEN CHARINDEX('@',Comments) = 0 THEN NULL
                    ELSE SUBSTRING(Comments,beginningOfEmail,endOfEmail-beginningOfEmail)
                END email
         INTO #Temp1 FROM Discussion 
        CROSS APPLY (SELECT CHARINDEX(' ',Comments + ' ',CHARINDEX('@',Comments ))) AS A(endOfEmail)
        CROSS APPLY (SELECT DATALENGTH(Comments )/2 - CHARINDEX(' ',REVERSE(' ' + Comments),CHARINDEX('@',REVERSE(' ' + Comments ))) + 2) AS B(beginningOfEmail)
 -- stored data in Temp table , you can use alternative 

SELECT id,email  FROM #Temp1 
DROP table #Temp1

OUTPUT

id  email
1   [email protected]
2   [email protected]
3   [email protected]
4   [email protected]
5   [email protected]
6   [email protected]

Then we need to do one more filter in the #Temp table like

SELECT id,email  FROM #Temp1 
where LEFT(email, CHARINDEX('@', email + '@') -1) like 'fname'

OR

SELECT id,email  FROM #Temp1 
    where LEFT(email, CHARINDEX('@', email + '@') -1) = 'fname'

OUTPUT

id  email
1   [email protected]
2   [email protected]
4   [email protected]
5   [email protected]

You can copy and paste this query and check its working from my end,

Upvotes: 1

Antonio Balsamo
Antonio Balsamo

Reputation: 155

Maybe it could be useful to check for a non-alphanumeric character after or before the email (as described here: LIKE Transact-SQL) and distinguish all possible cases:

  1. Email is in the middle

  2. Email is at beginning

  3. Email is at the end

  4. Column only contains the email

    SELECT * 
      FROM Discussion 
     WHERE Comments LIKE '%[^a-z0-9][email protected][^a-z0-9]%' --CASE 1
        OR Comments LIKE '[email protected][^a-z0-9]%'           --CASE 2
        OR Comments LIKE '%[^a-z0-9][email protected]'           --CASE 3
        OR Comments = '[email protected]'                        --CASE 4
    

In addittion, in this way you can cover cases like "Lorem ipsum:[email protected]" that you probably would like to detect and it should work ad you described.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271231

If you can assume that the email is at the beginning, end, or surrounded by spaces, then you can use pad the pattern and the comments with spaces:

where concat(' ', Comments, ' ') like '% [email protected] %'

If there can be other separators -- such as parentheses, syntax, and so on -- then this is tricker. One method is to replace these with spaces using translate():

where concat(' ', translate(Comments, '(),:;', '     '), ' ') like '% [email protected] %'

Unfortunately, . is part of the pattern you are looking for, so it is tricky to include that as well.

Upvotes: 0

Related Questions