Reputation: 4325
Let's say I have a table with millions of rows in which I have 3 integral variables: x,y and z against which I do my searching in a SELECT... WHERE x=a and y=b and z=c
Which would be faster / more efficient?
combining the 3 fields into a separate string column "x_y_z" (e.g. 1231_3242_6864) and indexing it
Making a 3 column index against the three integers?
Upvotes: 3
Views: 2163
Reputation: 699
There are a few other questions that are relevant to the discussion at hand:
Indexing Performance BigInt vs VarChar
Is there a REAL performance difference between INT and VARCHAR primary keys?
When should I use a composite index?
From the question about a composite index, one of the answers seems to indicate that if your x, y, and z tend to be unique among themselves (i.e. most x values are different from each other, and so on), you won't get much of a performance boost anyway. But if they do, the composite index seems to be the way to go.
Upvotes: 0
Reputation: 89661
If you could have covering indexes, and all three numbers are always provided so there is no concern about the order in the index (note this is also an issue in the stringified version), I would use a composite index of the three ints.
The three ints will take up less space, making more rows fit per page, which generally makes the index more efficient when being read. You also have a lot more headroom in that integer compared to a five bytes string for 99999 (four bytes for +/- ~2^31).
The magnitude of efficiency is going to be hard to judge in the database query, but remember you are also going to have to assemble and populate these. Not sure how or where you are planning to do that in MySQL - SQL Server has persisted computed columns which might be a good design choice if you were to commit to the string version.
Certainly one wouldn't want to convert integers into strings and concatenate them on the fly before performing a join.
Upvotes: 1
Reputation: 6911
I think that you would really need to benchmark it yourself (there are surely other factors that will determine the performance of your specific queries on your specific data), but in addition to the idea of using a composite 3 column index proposed by @mb14. You could try this:
Create a new column on your table like this:
xyzcomposite BINARY(16)
Then, create your index on this column.
On inserts, you would need to do an additional step to concat to your string "x_y_z" and then insert like this:
INSERT INTO yourtable (...,xyzcomposite) VALUES (...,UNHEX(MD5('the_xyz_concat')));
Of course you would also need to run the hash algorithm before your select statement as well.
set @xyz = UNHEX(MD5('x_y_z'));
select * from yourtable where xyzcomposite = @xyz
I'm not sure if the sum total of this overhead is worth the performance gain you may see in having to index only this one column vs. a three column index. Like I said you would have to benchmark it against your table and your data.
EDIT: the advantage of this approach is that it would work for x, y and z numbers of any size.
Upvotes: 0
Reputation: 22596
No it would be worst, string comparaison are much slower. You could eventually (if really needed, I wouldn't recommend it) combine the 3 integers in one integer but only IF THEY FIT.
However, to solve your index, problem the easiest would be to create a composite index on x,y and z.
Upvotes: 2