Reputation: 13121
Let's say you have a database with a single table like...
---------------------------------------------
| Name | FavoriteFood |
---------------------------------------------
| Alice | Pizza |
| Mark | Sushi |
| Jack | Pizza |
---------------------------------------------
Would it be more space-efficient to have an additional table called "Strings" that stores strings, and change the FavoriteFood column to an index in the string table. In the above example, "Pizza" looks like it is stored twice, but with the additional table, it would appear to be stored only once. Of course, please assume there are 1,000,000 rows and 1,000 unique strings instead of just 3 rows and 2 unique strings.
Edit: We don't know what the FavoriteFoods are beforehand: they are user-supplied. The programmatic interface to the string table would be something like...
String GetString(int ID) { return String at with Row-ID == ID }
int GetID(String s) {
if s exists, return row-id;
else {
Create new row;
return new row id;
}
}
So the string-table seems more efficient, but do modern databases already do that in the background, so I can just do the simple one table approach and be efficient?
Upvotes: 4
Views: 2194
Reputation: 6678
Pros for having a separate "Strings" table:
Cons:
But actually this is really effecient. E.g. most of full-text search engines use nearly this approach to store document-word maps.
Upvotes: 1
Reputation: 300559
You should be thinking in terms of what makes a good design in terms of your problem domain rather than efficiency (unless you expect to have tens of millions+ rows).
A well designed database should be in 3NF (third normal form). Only denormalise when you have identified a performance problem by measuring.
Upvotes: 4
Reputation: 717
In case you have another table to store the strings, it will be easier when you want to update the descriptions, for example, if u need to update all Pizzas to Italian Pizza, then u can do with one row update if u use a separate table. Another advantage would be translations, u can use the other table to store translations of the string in different languages and select the one based on the current language.
But the problem with that approach would be for inserts. U need to insert in both tables and also need to maintain the foreign key constraints, so it adds a bit of complexity to a simple table.
Upvotes: 2
Reputation: 76670
What are you measuring efficiency by? Assuming there is no other data associated with each FavoriteFood (in which case obviously you want two tables), a one-table approach is probably more time efficient, as the unnecessary join would incur an extra processing cost. On the other hand, a two-table approach may be more space-efficient, since it takes less space to store an index than a string, but that depends on how the particular database that you're using optimizes storage of repeated strings.
Upvotes: 5