user1079877

Reputation: 9358

How to filter messages before passing them on to consumers?

I'm building a lead and event management system on Kafka. The problem is that we receive many fake leads (advertisements), and we have many consumers in our system. Is there any way to filter out the advertisements before they reach the consumers? My idea is to write everything to a first topic, read it with a filter consumer, and then either write each message to a second topic or drop it. But I'm not sure whether that is efficient. Any ideas?

Upvotes: 15

Views: 45899

Answers (4)

mancini0

Reputation: 4703

Take a look at Confluent's KSQL. (It's free and open source: https://www.confluent.io/product/ksql/.) It uses Kafka Streams under the hood. You define your KSQL queries and tables on the server side, and their results are written to Kafka topics, so you can simply consume those topics instead of writing code for an intermediary filtering consumer. You'd only need to write the KSQL table "DDL" or queries.

Upvotes: 0

JongHyok Lee

Reputation: 223

You can use Kafka Streams (http://kafka.apache.org/documentation.html#streamsapi), which is available in Kafka 0.10 and later. I think it fits your use case exactly.
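For illustration, a filtering topology might look roughly like the sketch below. It is only a sketch under several assumptions: the topic names `leads-raw` and `leads-clean`, the broker address, the application id, and the keyword-based `isAdvertisement` check are all hypothetical, and messages are assumed to be plain strings. It also assumes Kafka 1.0 or later, where `StreamsBuilder` is available (the 0.10 releases used `KStreamBuilder` instead).

```java
import java.util.Locale;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class LeadFilterApp {

    // Hypothetical rule: a lead counts as an advertisement if it is missing
    // or contains a known spam phrase. Replace with your real detection logic.
    static boolean isAdvertisement(String lead) {
        if (lead == null) {
            return true; // assumption: empty payloads are dropped too
        }
        String text = lead.toLowerCase(Locale.ROOT);
        return text.contains("buy now") || text.contains("click here");
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "lead-filter");        // hypothetical id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // hypothetical broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Read every incoming lead from the first topic.
        KStream<String, String> rawLeads = builder.stream("leads-raw");

        // Keep only the records that are not advertisements and write them
        // to the second topic, which the regular consumers subscribe to.
        rawLeads.filterNot((key, value) -> isAdvertisement(value))
                .to("leads-clean");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```

With something like this running, the other consumers would subscribe to `leads-clean` only and never see the raw topic.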

Upvotes: 11

Jeff Gong

Reputation: 1863

Yes. In fact, I'm fairly convinced this is how you're supposed to handle a problem like yours. Kafka itself is only concerned with transmitting data efficiently, so there is nothing it can do on its own to clean your data. Have an intermediary consumer read everything, run its own tests to decide what passes its filter, and push the messages that pass to a different topic or partition (depending on your needs), so that downstream consumers only see the good data.
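The "own tests" and the choice of destination can be kept as plain Java, separate from the Kafka client calls, which makes them easy to unit test. Below is a minimal sketch of that decision step only. The topic names, the spam phrases, and the idea of sending suspicious leads to a separate topic rather than discarding them are all assumptions for illustration. The intermediary consumer would call `route()` for each record it polls and produce the value to the returned topic, if any.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class LeadRouter {

    // Hypothetical topic names; use whatever fits your system.
    static final String CLEAN_TOPIC = "leads-clean";
    static final String SUSPECT_TOPIC = "leads-suspect";

    // Hypothetical spam markers; real detection would likely be more involved.
    static final List<String> SPAM_PHRASES = List.of("buy now", "click here");

    // Decides where a lead should go:
    //   empty Optional -> drop the message entirely
    //   SUSPECT_TOPIC  -> looks like an advertisement, keep it for review
    //   CLEAN_TOPIC    -> passes the filter, forward to the real consumers
    static Optional<String> route(String lead) {
        if (lead == null || lead.trim().isEmpty()) {
            return Optional.empty();
        }
        String text = lead.toLowerCase(Locale.ROOT);
        boolean looksLikeAd = SPAM_PHRASES.stream().anyMatch(text::contains);
        return Optional.of(looksLikeAd ? SUSPECT_TOPIC : CLEAN_TOPIC);
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
                "Jane Doe requested a demo",
                "CLICK HERE for a free prize",
                "   ");
        for (String lead : samples) {
            System.out.println(route(lead).orElse("(dropped)"));
        }
    }
}
```

If you would rather discard advertisements outright, return an empty `Optional` for them and skip the second topic.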

Upvotes: 6
