Reputation: 33
Currently I am building quite big web system and I need strong SQL database solution. I chose Mysql over Postgres because some of tasks needs to be read-only (MyISAM engine) and other are massive-writes (InnoDB).
I have a question about this read-only feature. It has to be extremely fast. User must get answer a lot less than one second.
Let say we have one well-indexed table named "object
" with not more than 10 millions of rows and another one named "element
" with around 150 millions of rows.
We also have table named "element_object
" containing information connecting objects from table "element
" with table "object
" (hundreds of millions of rows)
So we're going to do partitioning on tables "element
" and "element_object
" and have 8192 tables "element_hash_n{0..8191}a
" and 24576 of tables "element_object_hash_n{0..8191}_m{0..2}
".
An Answer on user's question would be a 2-step searching:
I wonder about first step: What would be better:
I'm sorry for my English. Its not my native language.
Upvotes: 3
Views: 130
Reputation: 1467
I think you make way to many partitions. If you have more than 32000 partitions you have a tremendous overhead of management. Given the name element_hash_* it seams as if you want to make a hash of your element and partition it this way. But a hash will give you a (most likely) even distribution of the data over all partitions. I can't see how this should improve performance. If your data is accessed over all those partitions you don't gain anything by having partitions in size of your memory - you will need to load for every query data from another partition.
We used partitions on a transaction systems where more than 90% of the queries used the current day as criteria. In such a case the partition based on days worked very well. But we only had 8 partitions and moved the data then off to another database for long time storage.
My advice: Try to find out what data will be needed that fast and try to group it together. And you will need to make your own performance tests. If it is so important to deliver data that fast there should be enough management support to build a decent test environment. Maybe your test result will show that you simply can't deliver the data fast enough with a relational database system. If so you should look at NoSQL (as in Not only SQL) solutions.
In what technology do you build your web system? You should test this part as well. A super fast database will not help you much if you lose the time in a poorly performing web application.
Upvotes: 2