MaatDeamon
MaatDeamon

Reputation: 9771

non-key join with GlobalTable vs Ktable-Ktable Join?

I am quite curious as to why a non-key join works with GlobalKTtable vs KTable-KTable ? Although I understand why we don't need co-partitioning for a globalKTable (BroadCast Join), I don't not understand what enable the non-key join with it ? Can anyone, give a rough idea of what is happening ?

Upvotes: 0

Views: 1795

Answers (1)

Nishu Tayal
Nishu Tayal

Reputation: 20860

GlobalKTable & KTable, both represent the abstraction of changelog, but the difference is KTable is created locally for each application instance for each partition while GlobalKTable is populated with the entire data from all the partitions on each application instance. It copies whole data on each application instance that means entire dataset is available for querying on each instance. Hence it doesn't require co-partitioning and the lookups are possible in the entire table.

In the below example :

KStream<String, Long> left = ...;  // // KStream has string type key
GlobalKTable<Integer, Double> right = ...;   // GlobalKTable has integer type key

// Java 8+ example, using lambda expressions
KStream<String, String> joined = left.leftJoin(right,
    (leftKey, leftValue) -> leftKey.length(), /* derive a (potentially) new key by which to lookup against the table */
    (leftValue, rightValue) -> "left=" + leftValue + ", right=" + rightValue /* ValueJoiner */
  );

Select a key from the left stream using KeyValueMapper which you can use to lookup in GlobalKTable as given below:

(leftKey, leftValue) -> leftKey.length(), /* select a (potentially) new key by which to lookup against the table */

GlobalKTable are convenient for joins but expensive as it requires more storage as compared to KTables and also increases the network & kafka broker load.

Upvotes: 2

Related Questions