Reputation: 131
I have the following simplified table 'places'
, which contains 200,000+ rows:
placeId INT(10)
placeName VARCHAR (30)
placeNameEnglish VARCHAR (30)
placeName
is a place name stored in the original language e.g. RhôneplaceNameEnglish
is a place name translated into english e.g. RhoneCurrently I have two single column indexes - one for placeName
and one for placeNameEnglish
and am conducting these LIKE
pattern queries:
$testStr = 'rho';
SELECT placeId
FROM places
WHERE (placeName LIKE '$testStr%' OR placeNameEnglish LIKE '$testStr%')
Done some research but can't quite get my head around multi-column indexes when used in this scenario. Question is, should I combine placeName
and placeNameEnglish
into a multi-column index or leave them as separate indexes?
Working on implementing the last suggested by @Gordon Linoff.
Considering adding a table named translations
instead of placeNames
so that the same index can be used for multiple tables i.e a persons
table that requires the same LIKE 'abc%'
matching.
So far:
transId INT
parentId INT
- either placeId
or personId
parentTypeId TINYINT
- either 1
to identify the places
table or 2
for the persons
table, etc (more tables could use this system at a later date)
languageId INT
transName VARCHAR
Should I also index the parentTypeId
to accommodate the extra WHERE
condition required to identify the correct parent table?
e.g. WHERE transName LIKE 'abc%' AND parentTypeId = 1
I imagine mysql works like this: it first uses the index for transName
to match against transName LIKE 'abc%'
, then it filters the results using parentTypeId = 1
Upvotes: 0
Views: 87
Reputation: 142208
For your original question: Two separate INDEXes. But... You are working too hard:
For European place names, you don't need to search both columns. The case folding and accent insensitivity of utf8_unicode_ci
(or virtually any collation other than utf8_bin
) will do what you need:
mysql> SELECT 'Rhône' LIKE '%rho%', 'Rhône' LIKE '%xyz%';
+-----------------------+-----------------------+
| 'Rhône' LIKE '%rho%' | 'Rhône' LIKE '%xyz%' |
+-----------------------+-----------------------+
| 1 | 0 |
+-----------------------+-----------------------+
Edit Based on OP's comment, this is not a complete solution.
Upvotes: 0
Reputation: 1269443
For this query:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%' OR placeNameEnglish LIKE '$testStr%';
MySQL could use two indexes, one on places(placeName)
and one on places(placeNameEnglish)
. The operation is a called index merge (see here). I wouldn't count on it. This query cannot fully use a composite index.
You can rephrase the query as:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%'
UNION
SELECT placeId
FROM places
WHERE placeNameEnglish LIKE '$testStr%';
or:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%'
UNION ALL
SELECT placeId
FROM places
WHERE placeId NOT IN (SELECT placeId FROM places WHERE placename LIKE '$testStr%') AND
placeNameEnglish LIKE '$testStr%';
These can take advantage of the two indexes.
My recommendation, though, is to change the structure of your data. Have a table called PlaceNames
(or something like that) with these columns:
placeNameId INT
placeId INT,
languageId INT,
placeName VARCHAR(255)
That is, have a separate row for each language. Your query can then easily take advantage of an index on placeName(placeName)
.
Upvotes: 1