Mike Marks
Mike Marks

Reputation: 10139

T-SQL selecting records where a field contains a certain keyword from another table

I'm trying to run the below SQL to return records where the "a.PatientCommentText" contains a "bad keyword" as indicated by "b.Keyword" below. I do not think this is returning everything because the below query only returns 253 records, and the inverse of this (where charindex(b.Keyword, a.PatientCommentText) = 0 returns 7,378 records, but the total count of PES_Data is 9,821. I would expect the sum of the two scenarios to equal the record count of PES_Data. What am I doing wrong?

Returns 253 records

select a.* from PES_Data a
inner join Bad_Keywords b on 1=1
where charindex(b.Keyword, a.PatientCommentText) <> 0

Returns 7,378 records

select a.* from PES_Data a
inner join Bad_Keywords b on 1=1
where charindex(b.Keyword, a.PatientCommentText) = 0

But below returns 9,821 records

select a.* from PES_Data a

Upvotes: 0

Views: 43

Answers (2)

user3623480
user3623480

Reputation:

+1 to Gordon Linoff's answer.

DECLARE @tblA TABLE(Base VARCHAR(100));
DECLARE @tblB TABLE(Reference VARCHAR(100));

-- Total 7 rows
INSERT INTO @tblA VALUES   
('abcdaa')   
,('bcdeab') 
,('cdefbb')   
,('defgbc')  
,('efghcc')
,('fghddd')
,(NULL)

select count(*) from @tblA where CHARINDEX('aa',Base) <> 0        -- Returns 1  
select count(*) from @tblA where CHARINDEX('aa',Base) = 0         -- Returns 5
select count(*) from @tblA where CHARINDEX('aa',Base) IS NULL     -- Returns 1

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269683

You would appear to have some PatientCommentText values that are NULL.

Try adding this to your results:

select pd.*
from PES_Data pd
where pd.PatientCommentText is null;

In addition, you may be getting duplicates, for comments that contain more than one "bad word". I wouldn't expect the numbers to add up, unless you know there are no duplicates.

Upvotes: 3

Related Questions