Näbil Y
Näbil Y

Reputation: 1650

MYSQL - int or short string?

I'm going to create a table which will have an amount of rows between 1000-20000, and I'm having fields that might repeat a lot... about 60% of the rows will have this value, where about each 50-100 have a shared value.

I've been concerned about efficiency lately and I'm wondering whether it would be better to store this string on each row (it would be between 8-20 characters) or to create another table and link them with its representative ID instead... So having ~1-50 rows in this table replacing about 300-5000 strings with ints?

Is this a good approach, or at all even neccessary?

Upvotes: 1

Views: 580

Answers (4)

Guffa
Guffa

Reputation: 700342

Yes, it's a good approach in most circumstances. It's called normalisation, and is mainly done for two reasons:

  • Removing repeated data
  • Avoiding repeating entities

I can't tell from your question what the reason would be in your case.

The difference between the two is that the first reuses values that just happen to look the same, while the second connects values that have the same meaning. The practical difference is in what should happen if a value changes, i.e. if the value changes for one record, should the value itself change so that it changes for all other records also using it, or should that record be connected to a new value so that the other records are left unchanged.

If it's for the first reason then you will save space in the database, but it will be more complicated to update records. If it's for the second reason you will not only save space, but you will also reduce the risk of inconsistency, as a value is only stored in one place.

Upvotes: 2

Markus Deibel
Markus Deibel

Reputation: 1329

Avitus is right, that it's generally a good practice to create lookups.

Think about the JOINS you will use this table in. 1000-20000 rows are not a lot to be handled by MySQL. If you don't have any, I would not bother about the lookups, just index the column.

BUT as soon as you start joining the table with others (of the same size) that's where the performance loss comes in, which you can (most likely) compensate by introducing lookups.

Upvotes: 1

Adam Plocher
Adam Plocher

Reputation: 14233

I would recommend using an int with a foreign key to a lookup table (like you describe in your second scenario). This will cause the index to be much smaller than indexing a VARCHAR so the storage required would be smaller. It should perform better, too.

Upvotes: 1

Avitus
Avitus

Reputation: 15958

That is a good approach to have a look-up table for the strings. That way you can build more efficient indexes on the integer values. It wouldn't be absolutely necessary but as a good practice I would do that.

Upvotes: 1

Related Questions