mrt
mrt

Reputation: 1739

Fast SQL Server search on 40M text records

I have a SQL Server 2005 database with a table containing 40 million records. Each record contains a column that stores a comma separated list of keywords. Each keyword is a combination of letters and numbers. The keywords are up to 7 characters long and on average there are 15 keywords per record. The keywords are not unique across rows.

I want to search on full or part of the keyword.

I've created Full text index which shows 328,245,708 unique key count. The search efficiency is fine for queries of 4 or more characters (around 100ms on the test machine), but too slow for queries that have 3 or less characters (up to 3s on the test machine).

I've been trying both CONTAINSTABLE and CONTAINS queries of a sort '[query]*' with similar result.

I believe the performance of the short queries is slower, because short words repeat across different records more frequently.

Sorting the results is not crucial, and I've been trying to return TOP X results sorted on Rank from CONTAINSTABLE. This doesn't provide the desired performance.

How can I make this search faster for short queries?

Upvotes: 2

Views: 4521

Answers (2)

Nathan
Nathan

Reputation: 503

Another option would be to consider normalizing the table structure so that you would have your data records in one table, your tags in another table, and an associative table to relate the tags to the data records. This would allow you to perform the text search against the tags and simply join on the associative table to retrieve the relevant records.

Upvotes: 3

hd1
hd1

Reputation: 34657

You might want to gateway it to something like SOLR or Sphinx and let a dedicated text-search engine handle the search functionality.

Upvotes: 1

Related Questions