Reputation: 1514
I have a question about the performance of my MySQL database design.
Table A has a lot of records, say a million, and table B also has a million. There is another table C in which every record id of A is connected to every row of B, and this connection carries an additional value, 1 or 0. So functionally speaking, every record in A has a boolean vector, where B contains the 'variables' of the vector and 1 or 0 is the value. It's shown more graphically in the image at the bottom.
Table C will see a lot of write and read actions (e.g. selecting all values for one record of A), so the table is used very actively. And table C is really long: a million times a million rows.
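Schematically the design looks roughly like this (just my sketch: InnoDB, integer IDs, and the column names a_id, b_id and value are illustrative, not the real schema):

-- illustrative sketch only; actual names and types may differ
CREATE TABLE a (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE b (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- one row per (A, B) pair, carrying the boolean value
CREATE TABLE c (
  a_id  INT UNSIGNED NOT NULL,
  b_id  INT UNSIGNED NOT NULL,
  value TINYINT(1) NOT NULL,
  PRIMARY KEY (a_id, b_id),
  FOREIGN KEY (a_id) REFERENCES a (id),
  FOREIGN KEY (b_id) REFERENCES b (id)
) ENGINE=InnoDB;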
Upvotes: 2
Views: 161
Reputation: 95072
The table design is fine and shouldn't be a problem, because you access records via IDs, which should be indexed. Depending on your typical queries you should also consider adding composite indexes (c(a_id,b_id), c(a_id,value), c(b_id,value), c(a_id,b_id,value)).
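A sketch of what adding those indexes could look like (the index names are placeholders; assuming the primary key is (a_id, b_id), a plain (a_id, b_id) index is normally already covered):

ALTER TABLE c
  ADD INDEX idx_a_value   (a_id, value),
  ADD INDEX idx_b_value   (b_id, value),
  ADD INDEX idx_a_b_value (a_id, b_id, value);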
However, as there are only two states, 0 and 1, you may decide to store only one of them. That is, if you store only the state-1 records, all pairs not in the table implicitly have state 0. This pays off especially when the states are unevenly distributed (say 90% of the records have state 0 and only 10% have state 1) or when you usually access only one of the states (e.g. you always look for 1s).
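A rough sketch of that variant, reusing the names from the sketch above (the table name c_ones and the example IDs are placeholders): a row's presence means 1, its absence means 0, so the value column disappears entirely.

CREATE TABLE c_ones (
  a_id INT UNSIGNED NOT NULL,
  b_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (a_id, b_id)
) ENGINE=InnoDB;

-- "Is the flag set for this pair?" -> the row either exists or it does not
SELECT EXISTS (
  SELECT 1 FROM c_ones WHERE a_id = 42 AND b_id = 7
) AS value;

-- full boolean vector for one record of A, with the 0s reconstructed via LEFT JOIN
SELECT b.id AS b_id,
       IF(c_ones.a_id IS NULL, 0, 1) AS value
FROM b
LEFT JOIN c_ones
       ON c_ones.b_id = b.id AND c_ones.a_id = 42;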
Upvotes: 2
Reputation: 597
Millions of records in a table with many reads and writes won't be a bottleneck if you follow MySQL best practices:
Your storage engine should be InnoDB.
Your SELECT queries should not involve a full table scan (see the EXPLAIN sketch below).
Your tables should have the appropriate indexes.
You should look at all your possible use cases, because either way is a good idea if a use case supports it.
If you split your data across multiple tables, then a join operation has to be performed when needed.
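As a quick check for the full-table-scan point, EXPLAIN shows whether a query can use an index; the type column should say ref, eq_ref or range rather than ALL. The query below assumes the c(a_id, b_id, value) layout discussed in the question, with a_id = 42 as a placeholder ID.

EXPLAIN
SELECT b_id, value
FROM c
WHERE a_id = 42;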
Upvotes: 1