What is the remedy for lossy-join decomposition?

I know what lossless decomposition and lossy decomposition are. In a lossy decomposition we get back some spurious information. But what is the remedy? I think that if we join on the primary key it will not be lossy.

Am I right? And if I am, will the decomposition also be lossless?

Upvotes: 0

Views: 103

Answers (1)

laurids

Reputation: 941

To answer your question: to achieve a lossless decomposition you have to split on the functional dependencies. It seems that you already know the (formal) theory behind it, so I'll try to give you a glimpse of why that is true.

Say you have a relation R

R(author_id, author_name, post_id, post_date, post_title)

with FDs

author_id -> { author_name }
post_id -> { post_date, post_title }

Now, a good decomposition would be

r1(author_id, author_name, post_id)   r2(post_id, post_date, post_title)

because post_id determines post_date and post_title, thus becoming a key. Since post_id is a key in r2, each post_id value identifies exactly one row of r2, so we can join safely. Note that the primary key of the original relation was (author_id, post_id), which is more than we actually need.
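Here is a minimal sketch in Python (with made-up sample rows, so the data is purely hypothetical) that simulates this decomposition using sets of tuples and checks that the natural join on post_id gives back exactly the original relation:

# Original relation R(author_id, author_name, post_id, post_date, post_title)
# (hypothetical sample data)
R = {
    (1, "alice", 10, "2013-01-01", "First post"),
    (1, "alice", 11, "2013-01-02", "Second post"),
    (2, "bob",   12, "2013-01-02", "Hello"),
}

# Split along the FDs: r1(author_id, author_name, post_id),
# r2(post_id, post_date, post_title)
r1 = {(a_id, a_name, p_id) for (a_id, a_name, p_id, _, _) in R}
r2 = {(p_id, p_date, p_title) for (_, _, p_id, p_date, p_title) in R}

# Natural join on post_id; since post_id is a key in r2,
# every r1 row matches exactly one r2 row
joined = {
    (a_id, a_name, p_id, p_date, p_title)
    for (a_id, a_name, p_id) in r1
    for (p_id2, p_date, p_title) in r2
    if p_id == p_id2
}

assert joined == R  # lossless: the join reproduces R exactly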

On the other hand, a bad decomposition would be

r1(author_id, author_name, post_id, post_date)   r2(post_date, post_title)

because there is no FD post_date -> post_title, so post_date is not a key in r2, and therefore post_date can have duplicate values. It should be obvious from an intuitive perspective that joining on a potentially duplicated value will get us a ton of phantom rows.
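To see the phantom rows concretely, here is the same kind of Python sketch (same hypothetical sample data as above) run through the bad decomposition; two posts share the date 2013-01-02, so on the join each of them picks up the other's title:

R = {
    (1, "alice", 10, "2013-01-01", "First post"),
    (1, "alice", 11, "2013-01-02", "Second post"),
    (2, "bob",   12, "2013-01-02", "Hello"),
}

# Bad split: r1(author_id, author_name, post_id, post_date), r2(post_date, post_title)
r1 = {(a_id, a_name, p_id, p_date) for (a_id, a_name, p_id, p_date, _) in R}
r2 = {(p_date, p_title) for (_, _, _, p_date, p_title) in R}

# Join on post_date, which is NOT a key in r2
joined = {
    (a_id, a_name, p_id, p_date, p_title)
    for (a_id, a_name, p_id, p_date) in r1
    for (p_date2, p_title) in r2
    if p_date == p_date2
}

assert R < joined  # lossy: joined is a strict superset of R
# e.g. (1, "alice", 11, "2013-01-02", "Hello") is a phantom row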

Upvotes: 1
