Radu Gheorghiu
Radu Gheorghiu

Reputation: 20509

Fulltext index search has large number of page reads

I have a full-text index on a column in a table that contains data like this:

searchColumn
90210 Brooks Diana Miami FL [email protected] 5612233395

The column is an aggregate of Zip, last name, first name, city, state, e-mail and phone number.

I use this column to search for a customer based on any of this possible information.

The issue I am worried about is with the high number of reads that occurs when doing a query on this column. The query I am using is:

declare @searchTerm varchar(100) = ' "FL" AND "90210*" AND "Diana*" AND "Brooks*" '

select *
from CustomerInformation c
where contains(c.searchColumn, @searchTerm)

Now, when running Profiler I can see that this search has about 50.000 page reads to return a single row, as opposed to when using a different approach using regular indexes and multiple variables, broken down like @firstName, @LastName, like below:

WHERE C.FirstName like coalesce(@FirstName + '%' , C.FirstName)
    AND C.LastName like coalesce(@LastName + '%' , C.LastName)
    etc.

Using this approach I get only around 140 page reads. I know the approaches are quite different, but I'm trying to understand why the full-text version has so much more reads and if there is any way I can bring that down to something closer to the numbers I get when using regular indexes.

Upvotes: 0

Views: 102

Answers (1)

bwilliamson
bwilliamson

Reputation: 391

I have a couple of thoughts on this. First the Select * will generate a great number of page reads because it has to pull all columns which may or may not be indexed. When you pull every column it most likely will not make use of the best Index plan out there.

As to your Where clauses, when using the @searchTerm and the value of "FL" AND "90210*" AND "Diana*" AND "Brooks*" it has to check the datapages multiple times each time it is run. Think of how you would look up this information if you had to do it. You look at a piece of paper with the info on it and see if the search column contains FL. Now does it contain FL and 90210*. Now does it contain both of those plus Diana...etc.

You can see why it would keep having to go back to the page to read over and over again. The second query only has to look at 2 columns narrowly defined.

If you want more information on this, I would suggest a class by Brent Ozar that is free right now. How to think like the SQL Server Engine

I hope that helps.

Upvotes: 1

Related Questions