Mohammadreza Yektamaram

Reputation: 1514

Performance improvement by prioritizing table joins

I have a performance question about joining tables in Kafka. Currently the topology is defined as follows:

table1
   .leftJoin(table2, Pair::with)
   .leftJoin(table3, Pair::add)
   .join(table4, (left) -> left.getValue(0).getId(), Triplet::add)
   .leftJoin(table5, Quartet::add)
   .leftJoin(table6, Quintet::add)

I just want to know whether moving the .join before the others would improve performance and the speed of consuming the data (as in the code below):

table1
   .join(table4, (left) -> left.getValue(0).getId(), Pair::with)
   .leftJoin(table2, Pair::add)
   .leftJoin(table3, Triplet::add)
   .leftJoin(table5, Quartet::add)
   .leftJoin(table6, Quintet::add)

Upvotes: 0

Views: 45

Answers (1)

Huy Nguyen

Reputation: 2061

Yes, performance will improve. Let's assume the database provider does nothing else, such as automatically optimizing the query.

Way 1: A left join B left join C inner join D
1. A left join B => all records of A
2. A left join C => all records of A
3. A inner join D => part of A

Way 2: A inner join D left join B left join C
1. A inner join D => part of A => A1 (significant improvement here)
2. A1 left join B => all records of A1
3. A1 left join C => all records of A1

At step 1, way 2 has already reduced the number of rows, so fewer records are used for the left joins with B and C.
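To make the counting concrete, here is a rough in-memory sketch in plain Java. It is not Kafka Streams code and says nothing about how Kafka actually executes joins. The class JoinOrderSketch, its helper methods and the table sizes (1000 keys in A, 100 of them matching D) are all made up for illustration. It only counts how many rows are fed into each join step for the two orderings.

```java
import java.util.Set;
import java.util.TreeSet;

// In-memory sketch only: sets of keys stand in for tables A, B, C and D.
class JoinOrderSketch {

    // One join step: the keys on the right side, and whether the join is inner or left.
    static final class Step {
        final Set<Integer> rightKeys;
        final boolean inner;

        Step(Set<Integer> rightKeys, boolean inner) {
            this.rightKeys = rightKeys;
            this.inner = inner;
        }
    }

    static Step inner(Set<Integer> rightKeys) { return new Step(rightKeys, true); }

    static Step left(Set<Integer> rightKeys) { return new Step(rightKeys, false); }

    // Builds the key set fromInclusive .. toExclusive - 1.
    static Set<Integer> range(int fromInclusive, int toExclusive) {
        Set<Integer> keys = new TreeSet<>();
        for (int i = fromInclusive; i < toExclusive; i++) {
            keys.add(i);
        }
        return keys;
    }

    // Returns { rows fed into all join steps combined, rows left at the end }.
    static long[] simulate(Set<Integer> start, Step... steps) {
        Set<Integer> current = new TreeSet<>(start);
        long processed = 0;
        for (Step step : steps) {
            processed += current.size();           // every current row goes through this step
            if (step.inner) {
                current.retainAll(step.rightKeys); // inner join drops rows without a match
            }                                      // left join keeps every row
        }
        return new long[] { processed, current.size() };
    }

    public static void main(String[] args) {
        // Hypothetical table contents, chosen only to make the numbers easy to follow.
        Set<Integer> a = range(0, 1000);
        Set<Integer> b = range(0, 500);
        Set<Integer> c = range(250, 750);
        Set<Integer> d = range(0, 100);

        long[] way1 = simulate(a, left(b), left(c), inner(d));
        long[] way2 = simulate(a, inner(d), left(b), left(c));

        System.out.println("Way 1: processed " + way1[0] + ", result " + way1[1]);
        System.out.println("Way 2: processed " + way2[0] + ", result " + way2[1]);
    }
}
```

With these made-up sizes, way 1 pushes 3000 rows through the three joins and way 2 pushes 1200, while both end with the same 100 rows. The gain depends entirely on how many rows the inner join removes: if almost every row of A has a match in D, the two orderings cost about the same.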

Upvotes: 1
