Peter
Peter

Reputation: 1

SQL Server VARCHAR(2) join performance vs. INT

I have a table of q-grams (2 letter strings). I need to join another table to this on the CHAR(2) field.

Would it be faster to convert these 2-grams into a number (e.g. aa=1, ab=2, ac=3) and convert the fields in both this table and the referencing table to an INT and join using an INT rather?

Kind regards

Peter

Upvotes: 0

Views: 1903

Answers (4)

Darryl Peterson
Darryl Peterson

Reputation: 2270

In my experience, changing the key for one join from a string (especially CHAR(2)) to an INT will not make a measurable difference. This is because most of your overhead is I/O instead of CPU.

Once you get into longer strings, (especially Unicode strings), you can start measuring a difference. But even then, I think it has more to do with increased I/O due to record size in the data and the indexes.

Upvotes: 0

Alex Brasetvik
Alex Brasetvik

Reputation: 11744

Will that be your bottle neck? Will it be worth the additional complexity?

That said, you might as well make it CHAR(2).

Upvotes: 0

Adriaan Stander
Adriaan Stander

Reputation: 166406

NO, this will probably not be more performant, and in addition, you will have to have an additional table to store these mappings.

This will be an additional JOIN aswell.

Just remember to apply the correct indexes for your join columns.

Upvotes: 1

Randy Minder
Randy Minder

Reputation: 48432

Probably not. And if you were, it would be better to use a smallint instead of an int. Also, I'm not sure it makes any sense to have a varchar(2) column. There is extra overhead in having varchar columns, and at most you are saving only a single byte.

How many rows of data will you have?

Upvotes: 1

Related Questions