Reputation: 96
I have a consumer application deployed in several ENVs (dev, test, stage & preprod). They all consume the same Kafka topic (i.e., they work as multiple consumers of the same topic).
I have a separate producer application for each ENV (dev, test, stage & preprod). When producing a message, the payload includes a field identifying the producer's ENV.
Our requirement is that the dev ENV's consumer should only consume messages from the dev ENV's producer application. The same goes for the other ENVs.
My question is: should I go with consumer-side filtering? Will this ensure our requirement, and how?
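To make the question concrete, consumer-side filtering would mean something like the sketch below: every consumer receives every message, deserializes it, and drops anything whose `env` field doesn't match its own deployment. (This is a minimal illustration, assuming JSON payloads with an `"env"` field; the constant and function names are mine, not from any Kafka client API. In a real application this check would sit inside the consumer's poll loop.)

```python
import json

# Hypothetical: each deployment sets this to its own environment,
# e.g. via configuration ("dev", "test", "stage" or "preprod").
CONSUMER_ENV = "dev"

def should_process(raw_payload: bytes, consumer_env: str = CONSUMER_ENV) -> bool:
    """Return True only if the message was produced by this consumer's ENV."""
    payload = json.loads(raw_payload)
    return payload.get("env") == consumer_env

# Example: of these two messages, only the dev one passes the filter.
messages = [
    b'{"env": "dev", "data": "a"}',
    b'{"env": "test", "data": "b"}',
]
accepted = [m for m in messages if should_process(m)]
```

Note that with this approach every consumer still reads (and discards) every other environment's messages, which is part of why the answers below question the single-topic design.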
Thanks in advance.
Upvotes: 0
Views: 1103
Reputation: 32050
I agree with mike that using a single topic across environments is not a good idea.
However, if you are going to do this, then I would suggest using a stream processor to create separate topics for your consumers. You can do this with Kafka Streams, ksqlDB, etc.
In ksqlDB it would look like this:
-- Declare stream over existing topic
CREATE STREAM FOO_ALL_ENVS WITH (KAFKA_TOPIC='my_source_topic', VALUE_FORMAT='AVRO');
-- Create derived stream & new topic populated with messages just for DEV
-- You can explicitly provide the target Kafka topic name.
CREATE STREAM FOO_DEV WITH (KAFKA_TOPIC='foo_dev') AS SELECT * FROM FOO_ALL_ENVS WHERE ENV='DEV';
-- Create derived stream & new topic populated with messages just for PROD
-- If you don't specify a Kafka topic name it will inherit from the
-- stream name (i.e. `FOO_PROD`)
CREATE STREAM FOO_PROD AS SELECT * FROM FOO_ALL_ENVS WHERE ENV='PROD';
-- etc
Now you have your producers writing to a single topic (if you must), but each consumer can consume from a topic that is specific to its environment. The ksqlDB statements are continuous queries, so they will process all existing messages in the source topic as well as every new message that arrives.
Upvotes: 0
Reputation: 18475
You have multiple options for dealing with this requirement. However, I don't think it is in general a good idea to have one topic for different environments. Considering data protection and access permissions, this doesn't sound like a good design.
Anyway, I see the following options.
Option 1: Use the environment (dev, test, ...) as the message key and have each consumer filter by key.
Option 2: Write producers that send data from each environment to individual partitions and tell the consumers for each environment to only read from a particular partition.
But before implementing Option 2, I would rather do Option 3: have a topic for each environment and let the producers/consumers write to/read from the different topics.
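Option 1 can be sketched as follows: the producer sets the record key to its environment name, and each consumer keeps only the records whose key matches its own ENV. (This is an illustrative sketch; `filter_by_env_key` is my name, not part of any Kafka client API, and the records are modeled as plain `(key, value)` tuples rather than real consumer records.)

```python
def filter_by_env_key(records, consumer_env):
    """Keep only (key, value) records whose key matches this consumer's ENV."""
    return [(key, value) for key, value in records if key == consumer_env]

# Example: a mixed batch, as a dev consumer would see it from the shared topic.
records = [
    ("dev", "payload-1"),
    ("prod", "payload-2"),
    ("dev", "payload-3"),
]
dev_records = filter_by_env_key(records, "dev")
```

A side effect worth knowing: with the environment as the record key, Kafka's default partitioner will hash all of one environment's messages to the same partition, which is essentially how you would arrive at Option 2.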
Upvotes: 1