Reputation: 139
I have a search engine that build query dynamically to an SQL Server database and I'm having different results with these queries:
CONTAINS('goal AND of AND Chelsea');
AND
CONTAINS('goal') AND CONTAINS('of') AND CONTAINS('Chelsea')
The first one returns me results and the second nothing.
I used these commands for Full-Text Search engine accept noise words:
sp_configure 'show advanced options', 1
RECONFIGURE
GO
sp_configure 'transform noise words', 1
RECONFIGURE
GO
Now, the first sql command is working fine, but the second, that is the command my search engine generates, doesn't work.
Upvotes: 1
Views: 110
Reputation: 453807
"of" is a noise word.
I can reproduce your results with "transform noise words" on and inserting the following into a full text indexed table
INSERT INTO [dbo].[testtable]
([id],
[text])
VALUES (1,'goal'),
(2,'Chelsea'),
(3,'of'),
(4,'Drogba of Chelsea scored a goal')
After the index is populated
SELECT *
FROM sys.dm_fts_index_keywords(db_id('fts'), object_id('testtable'))
Returns
+--------------------------------+--------------+-----------+----------------+
| keyword | display_term | column_id | document_count |
+--------------------------------+--------------+-----------+----------------+
| 0x006300680065006C007300650061 | chelsea | 2 | 2 |
| 0x00640072006F006700620061 | drogba | 2 | 1 |
| 0x0067006F0061006C | goal | 2 | 2 |
| 0x00730063006F007200650064 | scored | 2 | 1 |
| 0xFF | END OF FILE | 2 | 4 |
+--------------------------------+--------------+-----------+----------------+
From which I conclude that this option does not affect what is indexed - simply how the queries are treated.
CONTAINS('goal AND of AND Chelsea');
becomes
CONTAINS('goal AND Chelsea');
after ignoring the noise word.
CONTAINS('of')
cannot be answered from the index and SQL Server clearly decides it is better to return false
than return true
for a load of rows that don't event contain that text.
I'm not sure why it doesn't evaluate as unknown
rather than false
though
WHERE NOT CONTAINS([text],'of')
returns all rows - including those containing the term!
If you can't alter your query to use the pattern that works the way you want then you will need to get around the issue by including the noise words in the full text index.
After running
ALTER FULLTEXT INDEX ON [dbo].[testtable] SET STOPLIST = OFF
and rebuilding the index it works the way you want.
Upvotes: 2