Reputation: 1
I have a table of q-grams (2 letter strings). I need to join another table to this on the CHAR(2) field.
Would it be faster to convert these 2-grams into a number (e.g. aa=1, ab=2, ac=3) and convert the fields in both this table and the referencing table to an INT and join using an INT rather?
Kind regards
Peter
Upvotes: 0
Views: 1903
Reputation: 2270
In my experience, changing the key for one join from a string (especially CHAR(2)) to an INT will not make a measurable difference. This is because most of your overhead is I/O instead of CPU.
Once you get into longer strings, (especially Unicode strings), you can start measuring a difference. But even then, I think it has more to do with increased I/O due to record size in the data and the indexes.
Upvotes: 0
Reputation: 11744
Will that be your bottle neck? Will it be worth the additional complexity?
That said, you might as well make it CHAR(2)
.
Upvotes: 0
Reputation: 166406
NO, this will probably not be more performant, and in addition, you will have to have an additional table to store these mappings.
This will be an additional JOIN aswell.
Just remember to apply the correct indexes for your join columns.
Upvotes: 1
Reputation: 48432
Probably not. And if you were, it would be better to use a smallint instead of an int. Also, I'm not sure it makes any sense to have a varchar(2) column. There is extra overhead in having varchar columns, and at most you are saving only a single byte.
How many rows of data will you have?
Upvotes: 1