Reputation: 3961
I have a table that I've created a Full Text Catalog on. The table has just over 6000 rows. I've added two columns to the index. The first could be considered a unique identifier of sorts and the second could be considered the content for that item (there are 11 other columns in my table that aren't part of the Full Text Catalog). Here is an example of a couple of rows:
TABLE: data_variables
ROW unique_id label
1 A100d1 Personal preference of online shopping sites
2 A100d2 Shopping behaviors for adults in household
In my web application on the front end, I have a text box that the user can type into to get a list of items that match whatever terms they're searching for in the UNIQUE ID
or LABEL
columns. So, for example, if the user typed in sho
or a100
then a list would be populated with both of the rows above. If they typed in behav
then a list would be populated with only row 2 above.
This is done via an Ajax request on each keyup
. PHP calls a Stored Procedure on the SQL server that looks like:
SELECT TOP 50 dv.id, dv.id + ': ' + dv.label,
dv.type_id, dv.grouping, dv.friendly_label
FROM data_variables dv
WHERE (CONTAINS((dv.unique_id, dv.label), @search))
(@search
is the text from the user that is passed into the Stored Procedure.)
I've noticed that this gets pretty sluggish, especially when I wasn't using TOP 50
in the query.
What I'm looking for is a way to speed this up either directly on the SQL Server or by abandoning the full-text indexing idea and using jQuery to search through an array of the searchable items on the client-side. I've looked a bit into the jQuery AutoComplete stuff and some other jQuery plugins for AutoComplete, but haven't yet tried to mock up anything. That would be my next step, but I wanted to check here first to see what advice I would get.
Thanks in advance.
Upvotes: 11
Views: 3543
Reputation: 1
If you really need performance ..you may want to look at; FTS3 and FTS4 ...
snip... from another forum...
For example, if each of the 517430 documents in the "Enron E-Mail Dataset" is inserted into both an FTS table and an ordinary SQLite table created using the following SQL script:
Code: CREATE VIRTUAL TABLE enrondata1 USING fts3(content TEXT); /* FTS3 table / CREATE TABLE enrondata2(content TEXT); / Ordinary table */ Then either of the two queries below may be executed to find the number of documents in the database that contain the word "linux" (351). Using one desktop PC hardware configuration, the query on the FTS3 table returns in approximately 0.03 seconds, versus 22.5 for querying the ordinary table.
see...
http://www.sqlite.org/fts3.html
Upvotes: 0
Reputation: 14128
I would advise against a LIKE, unless you're using a linear index (left-to-right) and you're doing queries like LIKE 'work%'
. If you're doing something like LIKE '%word%'
a regular index isn't going to help you. You typically want to use a Full-Text index when you want to search for words inside a paragraph.
With a lot of data, typically the built-in Full-Text engines in databases aren't very stealer. For the best performance you typically have to go with an external solution that is built specifically for Full-Text.
Some options are Sphinx, Solr, and elasticsearch, just to name a few. I wouldn't say that any of these options are better than the other. There are definitely pros and cons to consider:
The best thing you can do is benchmark these solutions against your existing data. Testing each and every individual component (unit testing) can help you identify the real problems and help you find good solutions.
Upvotes: 6
Reputation: 568
I had the same problem and went for the LIKE solution. I found too that the or operator to be too taxing and divide the query into two selects with an union all (fastest, and in my scenario it was impossible to find the same text in the index column and the data).
Yours will be like
SELECT TOP 50 from (
select dv.id, dv.id + ': ' + dv.label,
dv.type_id, dv.grouping, dv.friendly_label
FROM data_variables dv
WHERE dv.unique_id like '%'+@search+'%'
UNION ALL
select dv.id, dv.id + ': ' + dv.label,
dv.type_id, dv.grouping, dv.friendly_label
FROM data_variables dv
WHERE dv.label like '%'+@search+'%'
)
Oh!! And test the performance in SQL Server, not the web!
Upvotes: 5
Reputation: 101
If You plan to increase amount of data it will be best way to use reverse index for full-text searching.
Look at Apache Solr - best fulltext search engine at this moment.
You can simply periodically index Your database data and use solr as search-engine, it provide simple ajax api and can be queried directly from frontend.
Upvotes: 2
Reputation: 1714
Several suggestions, based around the fact that you have only 6000 rows, so the database should eat this alive.
A. Try using Like operator, just in case it helps. Not expecting it too, but pretty trivial to try. There is something else going on here overall for you to detect this is slow given these small volumes.
B. can you cache queries in advance? With 6000 rows, there are probably only 36*36 combinations of 2 character queries, which should take virtually no memory and save the database any work.
C. Moving the selection out to the client is a good idea, depends on how big the 6000 rows are overall, vs network latency for individual lookups.
D. Combining b and c will give you really good performance I suspect, but with some coding effort required. If the server maintains a list of all single character results in cache, and clients download the letter cache set after first keystroke, then they potentially have a subset of all rows, but won't need to do more network IO for additional keystrokes.
Upvotes: 6