rdr2
rdr2

Reputation: 1

Find the Cardinality of Natural Join

|X| represents number of tuples in X
bold letters represent keys in the relation

Consider the relations R(A, B) and S(A, C), and that R has a foreign key on A that references S.
|R ✶ S| (where ' * ' represents natural join) is:
The options are:
1. |R|
2. |S|
3. |R|.|S|
4. max(|R|, |S|)
5. min(|R|, |S|)

What I understand about the cardinality of natural join is that if there is no common attribute among the two relations then natural join will act like a cross-product and the cardinality will be r * s. But I don't understand how key constraints play a role in determining the cardinality. Can someone please explain?

Upvotes: -2

Views: 1572

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1270773

There is absolutely not enough information to answer this question. The "natural" join can return any almost any value between 0 and R*S. The following are examples.

This example returns 12:

create table s1 (id int primary key);
create table r1 (s1_id int references s1(id));

insert into s1 (id) values (1), (2), (3);

insert into r1 (s1_id) values (1), (2), (2), (3);

This example returns 0:

create table s2 (id int primary key, x int default 2);
create table r2 (s2_id int references s2(id), x int default 1);

insert into s2 (id) values (1), (2), (3);

insert into r2 (s2_id) values (1), (2), (2), (3);

This examples returns 4:

create table s3 (id int primary key, y int default 2);
create table r3 (id int references s3(id), x int default 1);

insert into s3 (id) values (1), (2), (3);

insert into r3 (id) values (1), (2), (2), (3);

In all of these, r has a foreign key relationship to s. And a "natural" join is being used. Here is a db<>fiddle.

Even if you assume that the "A"s are the primary keys AND that there are no other columns, the number of rows still varies:

-- returns 4
create table s5 (id int primary key);
create table r5 (id int references s4(id));

insert into s5 (id) values (1), (2), (3);    
insert into r5 (id) values (1), (1), (2), (2);

Versus:

-- returns 0
create table s4 (id int primary key);
create table r4 (id int references s4(id));

insert into s4 (id) values (1), (2), (3);

insert into r4 (id) values (NULL), (NULL), (NULL), (NULL);

Upvotes: -1

AntC
AntC

Reputation: 2806

Presuming the bold A in each schema means it is a key; and presuming the Foreign Key constraint holds -- that is, the A value for every row in R does correspond to an A value in S:

  • Every row in R naturally joins to a row in S on A.
  • There might be rows in S that don't join to R (because there's no Foreign Key constraint to enforce that).
  • So the cardinality of the joined relations is the cardinality of R, answer 1.

Is there are real-life use for a schema like this? Consider S is Customer Name in C, keyed by Customer number in A. R holds date of birth in B, also keyed by Customer number in A. Every Customer must have a name; it's true every Customer (person) must have a d.o.b., but we don't need to record that unless/until they purchase age-restricted items.

Upvotes: 2

Related Questions