Reputation: 3222
Current Design
Previously, my co-worker designed a database with tables customer_0, customer_1, ... up to customer_9, whereby all the customer IDs are split across 10 different tables based on the last digit of the ID.
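Roughly, the existing design looks something like this (column names here are just placeholders to illustrate the layout):

    CREATE TABLE customer_0 (
        id         BIGINT UNSIGNED NOT NULL,   -- ids ending in 0
        name       VARCHAR(255)    NOT NULL,
        country_id INT UNSIGNED    NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=InnoDB;
    -- customer_1 through customer_9 are identical apart from the name;
    -- the application picks the table using id MOD 10.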
The Problem with this design:
My Design
While trying to figure out a workaround, I came to know about table partitioning, which seemed to solve my problem exactly. (Referring to this question.)
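If I understand partitioning correctly, it would let me keep a single logical customer table, something like this sketch (hypothetical columns; HASH with 10 partitions is effectively id MOD 10, i.e. the last digit):

    CREATE TABLE customer (
        id         BIGINT UNSIGNED NOT NULL,
        name       VARCHAR(255)    NOT NULL,
        country_id INT UNSIGNED    NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=InnoDB
    PARTITION BY HASH (id)
    PARTITIONS 10;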
The Problem with the partitioning approach
Now the problem with this approach is that you cannot have foreign key constraints on partitioned tables at all, so this doesn't solve the problem.
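For example, trying to add a foreign key to the partitioned table sketched above fails (country here is a hypothetical parent table), and referencing the partitioned table from another table fails as well:

    ALTER TABLE customer
        ADD CONSTRAINT fk_customer_country
        FOREIGN KEY (country_id) REFERENCES country (id);
    -- MySQL rejects this: foreign keys are not supported on
    -- partitioned InnoDB tables, in either direction.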
DB Sharding or "Shared Nothing"
Then I came across this, which uses schema replication. As I understand it, you replicate the schema across different physical locations, and the appropriate schema is then queried based on the specific shard key.
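So, as far as I can tell, the sharded setup would mean creating the same schema on each shard and routing in the application, roughly like this (server and database names are made up):

    -- On shard 0 (server db-shard-0):
    CREATE DATABASE app_shard_0;
    -- On shard 1 (server db-shard-1):
    CREATE DATABASE app_shard_1;
    -- Each shard gets an identical copy of the full schema (customer, orders, ...),
    -- the application routes each query with e.g. shard_no = customer_id MOD 2,
    -- and foreign keys only ever point at rows within the same shard.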
My Question
What should I do now? I cannot give up the foreign key constraints, which rules out table partitioning. Should I drop partitioning and sharding altogether, focus on a conventional schema, and leave the sharding part to the DBA?
Note: max expected customer base is 10 million.
Upvotes: 1
Views: 2974
Reputation: 41478
Yes, let go of partitioning and sharding for now and stick with a conventional, simple schema. You've probably got plenty of lower-hanging fruit that can get you the performance you need, while keeping FK constraints, at the data size you mention.
All the 'sharding' you're doing looks like a swing at premature optimization for a future that isn't even expected, if all you're growing to is 10 million customers/records.
Also, I really wouldn't classify your situation as 'Big Data', despite how the term gets thrown around everywhere.
Assuming a table with a reasonable number of columns, say fewer than 30 columns of less than 32 bytes each (char(32)), 10 million rows is nothing for MySQL to handle when properly indexed and given adequate memory to hold the InnoDB tables in memory (I'm assuming you're using InnoDB). I'm currently working with tables roughly 10x that size on an AWS xlarge RDS instance without any issues beyond the occasional time it takes to do a SQL dump or run table alterations.
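If it helps, a rough way to sanity-check your data size against the buffer pool (the database name is a placeholder):

    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024) AS total_mb
    FROM information_schema.tables
    WHERE table_schema = 'your_db'
    ORDER BY total_mb DESC;

    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';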
I'd combine all the various customer tables into a single table and take a long look at the queries that hit it. Run EXPLAIN on them to see where you really need indexes. Keep the FK constraints you need and make sure you have suitable covering indexes.
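For example (table and columns are hypothetical), the workflow is basically:

    EXPLAIN
    SELECT id, name
    FROM customer
    WHERE country_id = 42;

    -- If the plan shows a full table scan, add an index that covers the query;
    -- InnoDB secondary indexes include the primary key, so (country_id, name)
    -- lets MySQL answer this query entirely from the index.
    ALTER TABLE customer ADD INDEX idx_country_name (country_id, name);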
I doubt you need table partitioning to get good performance on the size of data you've indicated.
Upvotes: 3