Reputation: 993
Okay, I'm going to be dealing with about 1,000,000 items which will be shared by a limited number of stores. The number of stores are limited to about 5, but that could potentially change. The products table has about 60 to 70 fields already to manage.
These are three methods I am considering:
Implode the values into a small varchar, say about 20 characters long, with a delimiter can store about 10 stores to associate the product with. The MySQL select can use LIKE '%|2|%' to match any products with store ID 2.
There are numerous discussions on using MySQL Procedures, too.
Pros: One field, easy to accommodate more stores, less memory used?
Cons: Text search, takes longer?
It would be easiest and probably best to just create a TINYINT (1) and a new field for each. The problem is, if new stores are added, new fields will have to be added, altering millions of products.
Pros: Easy to select
Cons: Adding stores requires altering the table structure, more fields to manage - what if they have 20 stores down the line (future-proofing)
The standard response I would expect is to use a linking table to associate the products with the stores, just a two field table. My concern is if most products are used by most stores, then suddenly, there could be about 5 million rows in that table.
I could break the link tables out to reduce their row count, for example by first letter of the product name and have 26 link tables.
Pros: LEFT JOIN is easy to use
Cons: Almost every query would need to search through the 5 million link tables, unless broken out
I really should but together some tests to figure out the best response/processing times, but that would take some time in itsef. I'm interested to hear what your best solution would be for keeping gobs of data future-proof for more stores and stored with efficiently.
Upvotes: 2
Views: 4180
Reputation: 341
Hey have you considered storing the value and fetching them using bitwise operation... like assigning every store an id this way: 1, 2, 4, 8, 16, 32, 64, 128... just add one column in the item table itself to store single value like: if item exists in stores with id 1, 4, 32 the field will store 37 in and while retrieving you can use & operation straight in the query to see if item exists in a particular store or not like this:
select * from item where store_id&1;
select * from item where store_id&4;
select * from item where store_id&32;
Upvotes: -1
Reputation: 434585
You left out one very important consideration in your pro and con lists: referential integrity. A fast database that lets you do easy queries is useless if it is full of broken data, that just lets you make mistakes faster and making mistakes is the last thing that humans need help with. The only option that allows foreign keys (i.e. referential integrity) is option (3).
A linking table is also the standard way to deal with this sort of thing and databases are usually designed to handle the standard use cases quite well.
As far as (1) goes, doing a LIKE '%x%'
is going to do a table scan almost every time and a table scan is the last thing you ever want. You'd also have to make sure you have the |
delimiter at the beginning and end of the strings or you'll need three LIKEs (or a regular expression) instead of just one. Some databases can use indexes for LIKE 'x%'
but that doesn't apply to your case.
Approach (2) uses too many columns, your queries will be a mess and your tables will be too wide. You'd also have to worry about make sure each row in one table has a corresponding column in another table.
Upvotes: 7