KarlP

Reputation: 5181

Kafka Streams KTable-KTable foreign key join emits null even if right side is empty

What are the semantics of a Kafka Streams (3.7.1) KTable-KTable foreign key join when the extracted foreign key has never matched a primary key in the right-side KTable?

In this example the right side is empty and nothing matches.

        ...
        KTable<String, String> personsWithNiceName = person.join(niceName,
            name -> name,                          // foreign key extractor
            (name, nice) -> nice + " " + name);    // value joiner
        personsWithNiceName.toStream().to("Result");

When writing the first message, nothing matches, and nothing is emitted. This aligns with my expectations.

        inputPersonTopic.pipeInput("1", "Jane");

        outputTopic.readKeyValuesToList().forEach((rec) -> {
            System.out.println("Key: " + rec.key + " Value: " + rec.value);
        });

// nothing is emitted

However, when writing any message for the same key again, a tombstone is emitted. This made me a bit sad.

        inputPersonTopic.pipeInput("1", "Jane");

        outputTopic.readKeyValuesToList().forEach((rec) -> {
            System.out.println("Key: " + rec.key + " Value: " + rec.value);
        });

//Key: 1 Value: null

Is this expected? Since there is no previous value and nothing to delete, I had hoped that subsequent values would be suppressed too.

In a real-world example, the tombstones cascade down the topology and into the consumers, causing much ado about nothing.

Should I try to suppress this? If so, what is the most efficient way? I can only think of a processor with a store for "seen" entries that just ignores tombstones for unseen keys.

Upvotes: 0

Views: 28

Answers (1)

Matthias J. Sax

Reputation: 62350

It's a known bug you are hitting: https://issues.apache.org/jira/browse/KAFKA-16394

Strictly speaking, it's not incorrect, as it's an "idempotent tombstone", so your downstream consumers should be able to handle it correctly. Of course, it's undesired and causes unnecessary downstream load.

If the load is really a problem, your idea to suppress these tombstones might work, but maintaining an additional store sounds expensive. You would just move the overhead somewhere else (not sure if you would gain much overall?).
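
For illustration, here is a minimal sketch of the filtering logic described in the question, written as plain Java without the Kafka Streams API. The class name `UnseenTombstoneFilter` and the method `shouldForward` are hypothetical, and the in-memory `HashSet` only stands in for the state store. In a real topology this logic would presumably sit in a processor backed by a fault-tolerant store, which is where the extra overhead comes from.

```java
import java.util.HashSet;
import java.util.Set;

// Standalone model of the "drop tombstones for unseen keys" idea.
// Assumption: String keys and values, as in the question's example.
class UnseenTombstoneFilter {
    // Hypothetical stand-in for a state store holding keys that have
    // already produced a non-null join result.
    private final Set<String> seen = new HashSet<>();

    /** Returns true if the record should be forwarded downstream. */
    boolean shouldForward(String key, String value) {
        if (value != null) {
            seen.add(key); // remember that a real result was emitted for this key
            return true;
        }
        // Tombstone: forward it only if a result was emitted earlier,
        // and forget the key so repeated tombstones are dropped.
        return seen.remove(key);
    }

    public static void main(String[] args) {
        UnseenTombstoneFilter filter = new UnseenTombstoneFilter();
        System.out.println(filter.shouldForward("1", null));        // false: key never seen
        System.out.println(filter.shouldForward("1", "Nice Jane")); // true: real join result
        System.out.println(filter.shouldForward("1", null));        // true: deletes an emitted result
        System.out.println(filter.shouldForward("1", null));        // false: already deleted
    }
}
```

Note that the set of seen keys grows with the number of live results, so the store is roughly as large as the key space of the join output.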

Upvotes: 1
