Reputation: 362
Optimize for speed.
For a catalog, I have about 150K of items: Each item record has 8 search fields and one Json data string. (No: int; Search fields are max 30 characters, JSON min 200 : max 2000 characters)
Item No is a unique field and is PK all others fields are not unique.
No Search1 Search2 ... Search8 JSON_datastring
1 a1 a2 ... a8 {...json...}
2 b1 b2 ... b8 {...json...}
..
x a1 b2 ... c8 {...json...}
A user can search on each of the eight fields. He can search on the whole field,
WHERE Search1 = 'x'
the start of the field
WHERE Search1 LIKE 'x%'
or any part of the field
WHERE Search1 LIKE '%x%'
My first approach is to keep all the data in on table and put an index on each of the search colomns; but I'm wondering if there are faster methods.
The goal here is to get all matching JSON_datastrings as fast as possible. 'Select' speed is the only concern here (users will not make any update/insert/delete commands). Memory or disk space are hardly relevant for this project. The catalog needs updating only once a week, so import and update time is also not a big cocern.
I'm working with a MYSQL database. The data in the table is imported from a csv file each week. I have full control over the MYSQL database.
I expect about 60% of the searches to be on the whole field. Current n° of records 150K with about 20K growth per year. Concurrent users: we don't have an estimate at the moment.
Are there any suggestions on how to improve the search speed? I'm still in the design phase of the project.
Upvotes: 1
Views: 236
Reputation: 26699
LIKE %x%
will always require a full table scan. No matter the index used, you can't optimize this. Instead, you should set a FULLTEXT
search for the field - requiring FULLTEXT
index and modifying the query to use MATCH ... AGAINST
instead of LIKE
. Without knowing the exact data is hard to offer further improvements, but generally you may want to offload MySQL from the search, moving the search to a separate service (Lucene/Solr
or Sphinx
). They may be more suitable not only in terms of performance, but in terms of functionality, as they are built for searching.
Upvotes: 1