Reputation: 172
I have been trying the final days to come with a solution to the following question.
Lets suppose that we have the following two tables.
Film(ID',Title,Country,Production_Date)
Actor(ID',Name,Genre,Nationality)
Cast(Actor_ID',Film_ID',Role)
Given information:
Film holds N(film)=50.000 records, r(film)=40bytes, sequential organized, index on PK
Actor holds N(actor)=200.000 records r(actor)=80bytes,heap organized, index on PK
Cast holds N(cast)=100.000 records,r(cast)=25 bytes, heap organized, No INDEXES
The execution tree and relation expression for an execution plan is in the following picture:
For the lower level join between cast & film I'm calculating the followings:
Block Nested Loop Join : Bcast x Bfilm
Index Nested Loop Join : Bcast + Ncast x Cfilm
I'm keeping the smallest value which is given with an INLJ.
Question:
Now how can I calculate the size of the joined table and the new r which is the size of a record on the new joined table in order to proceed and calculate the upper level join between the already joined table with table actor after having calculated the cost B in blocks that join operation will take?
Upvotes: 0
Views: 787
Reputation: 765
I assume you want to do a natural join on FILM.ID = CAST.FILM_ID and CAST.FILM_ID is a foreign key referencing FILM.ID.
1) Size of one row:
A join of Film and Cast results in tuples of the form
[FILM_ID, TITLE, COUNTRY, PRODUCTION_DATE, ACTOR_ID, ROLE].
Hence the row size should be something like
R(FILM JOIN CAST) = R(FILM) + R(CAST) - R(FILM_ID)
since the FILM_ID is the only column which is shared.
2) Number of rows:
N(FILM JOIN CAST) = N(CAST)
As there is exactly one row in FILM for every row in CAST.
Upvotes: 1