Reputation: 99
I have a mySQL table which represents a company's products. The table shows whether or not two products are compatible with each other.
The table looks somewhat like
Product1 Product2 Compatible?
A A Yes
A B ?
A C No
A D ?
B A ?
B B Yes
B C ?
B D Maybe
C A No
C B ?
C C Yes
C D ?
D A ?
D B Maybe
D C ?
D D Yes
Note that while none of the rows are duplicates, some data is redundant. If A is not compatible with C, then C is obviously not compatible with A. This renders the one row redundant. The reason why I have these rows to begin with is that I built the table using a nested for loop. Would you recommend deleting the rows with duplicate meaning for the sake of saving space? Or leave them there for (possibly?) easier maintenance?
Upvotes: 1
Views: 358
Reputation: 14336
It is almost always worthwhile to properly normalize your data. Lower volume of data and no chance of inconsistency are just the two most obvious reasons why. So if your compatibility is in fact guaranteed to be symmetric, and remain symmetric forever (and not some kind of upward- vs. backward-compatibility...), then yes, you should delete the redundant rows.
The only caveat is that in the future you must either query the compatibility in the canonical order (with the lower product, however you define that, in the first slot of your query), or use a disjunctive query, otherwise you might miss a legitimate combination. (The first of those options is obviously the better solution, since the second reintroduces unnecessary processing effort.)
Upvotes: 0
Reputation: 379
Space is cheap ... there is no need to delete data in today's world. However, that doesn't mean we can't be efficient. If I was coming at this as a database design problem then I would create two tables.
One for products
and
One for compatibility, which you have done.
But in the example above you do not give a reason why you are tracking non-compatibility. If the products are in the compatibility table then that means the are compatible ... if they are not then well that means they are not compatible.
How are you populating these rows ... you never give a reason why you add a row for A to C, but then you also add a row for C to A. Why add the second row at all?
In your table what is the exact data in the table columns for product A and B? product id? product name?
Upvotes: 1