Reputation: 49817
i was wondering if to use 2 tables is better then using 1 single table .
Scenario:
i have a simple user table and a simple user_details table. i can JOIN tables and select both records.
But i was wondering if to merge 2 table into 1 single table.
what if i have 2milions users records in both tables?
in terms of speed and exec time is better to have a single table when selecting records?
Upvotes: 2
Views: 207
Reputation: 75095
To denormalize or not to denormalize, that is the question...
There is no simple, one-size-fits all response to this question. It is a case by case decision.
In this instance, it appears that there is exactly one user_detail record per record in the user table (or possibly either 1 or 0 detail record per user record), so shy of subtle caching concerns, there is really little no penalty for "denormalizing". (indeed in the 1:1 cardinality case, this would effectively be a normalization).
The difficulty in giving a "definitive" recommendation depends on many factors. In particular (format: I provide a list of questions/parameters to consider and general considerations relevant to these):
If there is a relative little amount of writes compared with reads, it would be possible to create many indexes, some covering the most common queries, and hence logically re-creating of sort, in a more flexible fashion the two (indeed multiple) table setting. The downside of too many covering indices could of course be to occupy too much disk space (not a big issue these days) but also to possibly impede (to some extent) the cache. Also too many indices may put undue burden on write operations...
If any one of the record average size is "unusually" long, say above 400 bytes, a multi-table may be appropriate. After all, an somewhat depending on the type of filtering done by the queries, the JOIN operation are typically very efficiently done by MySQL, and there is therefore little penalty in keeping separate table.
If it isn't the case i.e if we have user records with more than one user_details, given the relatively small number or records (2 millions) (Yes, 2M is small, not tiny, but small, in modern DBMS contexts), denormalization would probably be a bad idea. (possible exception with cases where we query several dozens of time per second the same 4 or 5 fields, some from the user table, some from the user_detail table.
Bottom lines:
Upvotes: 2
Reputation: 28016
You should easily be able to make either scenario perform well with proper indexing. Two million rows is not that many for any modern RDBMS.
However, one table is a better design if rows in the two tables represent the same logical entity. If the user table has a 1:1 relationship with the user_detail table, you should (probably) combine them.
Edit: A few other answers have mentioned de-normalizing--this assumes the relationship between the tables is 1:n (I read your question to mean the relationship was 1:1). If the relationship is indeed 1:n, you absolutely want to keep them as two tables.
Upvotes: 4
Reputation: 26910
The current trend is denormalize (i.e. put them in the same table). It usually give better performance, but easier to make inconsistent (programming mistake, that is).
Plan: determine your workload type.
Benchmark: See if the performance gain worth the risk.
Upvotes: 1
Reputation: 23972
Joins themselves are not inherently bad; RDBMS are designed to perform joins very efficiently—even with millions or hundreds of millions of records. Normalize first before you start to de-normalize, especially if you're new to DB design. You may ultimately end up incurring more overhead maintaining a de-normalized database than you would to use the appropriate joins.
As to your specific question, it's very difficult to advise because we don't really know what's in the tables. I'll throw out some scenarios, and if one matches yours, then great, otherwise, please give us more details.
If there is, and will always be a one-to-one relationship between user and user_details, then user details likely contains attributes of the same entity and you can consider combining them.
If the relationship is 1-to-1 and the user_details contains LOTS of data for each user that you don't routinely need when querying, it may be faster to keep that in a separate table. I've seen this often as an optimization to reduce the cost of table scans.
If the relationship is 1-to-many, I'd strongly advice against combining them, you'll soon wish you hadn't (as will those who come after you)
If the schema of user_details changes, I've seen this too where there is a core table and an additional attribute table with variable schema. If this is the case, proceed with caution.
Upvotes: 2
Reputation: 52107
Denormalization will generally use-up more space while affording better query performance.
Be careful though - cache also matters and having more data effectively "shrinks" your cache! This may or may not wipe-out the theoretical performance benefit of merging two tables into one. As always, benchmark with representative data.
Of course, the more denormalized your data model is, the harder it will be to enforce data consistency. Performance does not matter if data is incorrect!
So, the answer to your question is: "it depends" ;)
Upvotes: 1