Reputation: 10139
I'm trying to run the below SQL to return records where the "a.PatientCommentText" contains a "bad keyword" as indicated by "b.Keyword" below. I do not think this is returning everything because the below query only returns 253 records, and the inverse of this (where charindex(b.Keyword, a.PatientCommentText) = 0 returns 7,378 records, but the total count of PES_Data is 9,821. I would expect the sum of the two scenarios to equal the record count of PES_Data. What am I doing wrong?
Returns 253 records
select a.* from PES_Data a
inner join Bad_Keywords b on 1=1
where charindex(b.Keyword, a.PatientCommentText) <> 0
Returns 7,378 records
select a.* from PES_Data a
inner join Bad_Keywords b on 1=1
where charindex(b.Keyword, a.PatientCommentText) = 0
But below returns 9,821 records
select a.* from PES_Data a
Upvotes: 0
Views: 43
Reputation:
+1 to Gordon Linoff's answer.
DECLARE @tblA TABLE(Base VARCHAR(100));
DECLARE @tblB TABLE(Reference VARCHAR(100));
-- Total 7 rows
INSERT INTO @tblA VALUES
('abcdaa')
,('bcdeab')
,('cdefbb')
,('defgbc')
,('efghcc')
,('fghddd')
,(NULL)
select count(*) from @tblA where CHARINDEX('aa',Base) <> 0 -- Returns 1
select count(*) from @tblA where CHARINDEX('aa',Base) = 0 -- Returns 5
select count(*) from @tblA where CHARINDEX('aa',Base) IS NULL -- Returns 1
Upvotes: 0
Reputation: 1269683
You would appear to have some PatientCommentText
values that are NULL
.
Try adding this to your results:
select pd.*
from PES_Data pd
where pd.PatientCommentText is null;
In addition, you may be getting duplicates, for comments that contain more than one "bad word". I wouldn't expect the numbers to add up, unless you know there are no duplicates.
Upvotes: 3