MISNole
MISNole

Reputation: 1062

Find Records Where One Column Contains Text of Another

I have a table that contains a county name and another column for email address. I want to find all records where the county name is contained in the email address. I've tried LIKE and CHARINDEX but I have not been able to get this to work as expected yet - even using other examples I've found here on this site.

Sample Data:

UserID  email                            County
15854   [email protected]                    POLK
15018   [email protected]             ADAIR
15020   [email protected]             ADAIR
15022   [email protected]             ADAIR
15024   [email protected]  ADAMS
15026   [email protected]                ADAMS
15028   [email protected]             ALLAMAKEE
15030   [email protected]        ALLAMAKEE
15032   [email protected]             ALLAMAKEE
15034   [email protected]         APPANOOSE

Different variations I have tried include:

SELECT UserID , email , County
FROM   #CountyRecorders
--WHERE email LIKE '%' + County + '%';
--WHERE County LIKE '%' + email + '%';
--WHERE CHARINDEX(email,County) > 0;
--WHERE CHARINDEX(County,email) > 0;

How can I accomplish this desired result set? Thanks,

Upvotes: 0

Views: 37

Answers (3)

Alan Burstein
Alan Burstein

Reputation: 7918

Both answers here will work - I have actually seen better performance using CHARINDEX in the WHERE clause for this type of thing but it's close. The way I would approach this, for performance reasons, is to use a computed column and index it. This way you only perform the calculation once. Using @Shnugo 's sample data (turned into a perm temp table for testing):

-- Table DDL and sample data
CREATE TABLE #tbl 
(
  UserID INT,
  email  VARCHAR(100),
  County VARCHAR(100),
  CountyInEmail AS CAST(SIGN(CHARINDEX(County,email)) AS BIT) PERSISTED
);
INSERT INTO #tbl VALUES
 (15854,'[email protected]                    ','POLK')
,(15018,'[email protected]             ','ADAIR')
,(15020,'[email protected]             ','ADAIR')
,(15022,'[email protected]             ','ADAIR')
,(15024,'[email protected]  ','ADAMS')
,(15026,'[email protected]                ','ADAMS')
,(15028,'[email protected]             ','ALLAMAKEE')
,(15030,'[email protected]        ','ALLAMAKEE')
,(15032,'[email protected]             ','ALLAMAKEE')
,(15034,'[email protected]         ','APPANOOSE');

-- Index on the "CountyInEmail" field
CREATE NONCLUSTERED INDEX nc_County ON #tbl(CountyInEmail)
INCLUDE (UserID, email, County)
;

-- Note the execution plan here (nonclustered index seek = good)
SELECT t.UserID, t.email, t.County, t.CountyInEmail
FROM   #tbl AS t
WHERE  t.CountyInEmail = 1;

Upvotes: 1

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67291

You did not state the expected result, but - if I get this right - one of your approaches should return correctly. This works for me:

DECLARE @tbl TABLE(UserID INT,  email VARCHAR(100),County VARCHAR(100));
INSERT INTO @tbl VALUES
 (15854,'[email protected]                    ','POLK')
,(15018,'[email protected]             ','ADAIR')
,(15020,'[email protected]             ','ADAIR')
,(15022,'[email protected]             ','ADAIR')
,(15024,'[email protected]  ','ADAMS')
,(15026,'[email protected]                ','ADAMS')
,(15028,'[email protected]             ','ALLAMAKEE')
,(15030,'[email protected]        ','ALLAMAKEE')
,(15032,'[email protected]             ','ALLAMAKEE')
,(15034,'[email protected]         ','APPANOOSE');

SELECT *
FROM @tbl 
WHERE eMail LIKE '%' + County + '%';

All rows are returned - except the "test" and the "frontier.com" row.

What might disturb: If your columns are under a case sensitive collation the capitalized ADAIR is not the same as adair. You can try either to use LOWER on both sides or to apply a insensitive collation with COLLATE

Upvotes: 1

jkramer
jkramer

Reputation: 91

You should just be able to use CHARINDEX for this Do

Select UserId, email, County
FROM #CountyRecorders
WHERE CHARINDEX(County,email) > 0

your casing matters here though

Upvotes: 1

Related Questions