Alpcan Yıldız

Reputation: 751

How to process and aggregate Kafka Streams with custom Objects?

I have an Account class and some data, and I send those objects to my topic with a producer. That part works. Later I want to aggregate them with Kafka Streams, but I can't, because I think some Serde property in my configuration is wrong, and I don't know where the error is. Could anyone take a look at my Kafka Streams code? My Account class:

public class Account {

    private long fromId;
    private long amount;
    private long toId;
    private ZonedDateTime time;
}

There are two classes, a Serializer and a Deserializer, for my Account class. Serializer:

public class AccountSerializer implements Serializer {

    private static final Charset CHARSET = Charset.forName("UTF-8");
    static private Gson gson = new Gson();


    @Override
    public void configure(Map map, boolean b) {

    }

    @Override
    public byte[] serialize(String s, Object o) {
        String line = gson.toJson(o);
        // Return the bytes from the String 'line'
        return line.getBytes(CHARSET);
    }



    @Override
    public void close() {

    }
}

Deserializer:

public class AccountDeserializer implements Deserializer {
    private static final Charset CHARSET = Charset.forName("UTF-8");
    static private Gson gson;

    static {
        gson = new Gson();
    }

    @Override
    public void configure(Map map, boolean b) {

    }

    @Override
    public Object deserialize(String s, byte[] bytes) {
        try {
            // Decode the bytes into a JSON string
            String json = new String(bytes, CHARSET);
            // Build the Account object from the JSON string
            return gson.fromJson(json, Account.class);
        } catch (Exception e) {
            throw new IllegalArgumentException("Error reading bytes! Wrong", e);
        }
    }

    @Override
    public void close() {

    }
}

My AccountSerde class for Kafka Streams:

public class AccountSerde implements Serde<Object> {

    private AccountSerializer accountSerializer;
    private AccountDeserializer accountDeserializer;

    @Override
    public void configure(Map<String, ?> map, boolean b) {


    }

    @Override
    public void close() {
        accountSerializer.close();
        accountDeserializer.close();

    }

    @Override
    public Serializer<Object> serializer() {
        return accountSerializer;
    }

    @Override
    public Deserializer<Object> deserializer() {
        return accountDeserializer;
    }
}

And my Kafka Producer:

    public static void main(String[] args) {

        DataAccess dataAccess = new DataAccess();
        List<Account> accountList = dataAccess.read();

        final Logger logger = LoggerFactory.getLogger(Producer.class);
        Properties properties = new Properties();

        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, AccountSerializer.class.getName());

        KafkaProducer<Long, Account> producer = new KafkaProducer<>(properties);

        for (Account account : accountList) {

            ProducerRecord<Long, Account> record = new ProducerRecord<Long, Account>("bank_account", account.getFromId(), account);

            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    if (e == null) {
                        logger.info("Record sent successfully. \n " + "Topic : " + recordMetadata.topic() + "\n" +
                                "Partition : " + recordMetadata.partition() + "\n" +
                                "Offset : " + recordMetadata.offset() + "\n" +
                                "Timestamp: " + recordMetadata.timestamp() + "\n");
                        try {
                            Thread.sleep(1000);
                        } catch (InterruptedException e1) {
                            e1.printStackTrace();
                        }
                    } else {
                        logger.info("Error sending producer");
                    }
                }
            });
        }

        producer.flush();
        producer.close();
    }

And here is the class where I want to try the aggregation, my Kafka Streams class:

  public static void main(String[] args) {
        System.out.println();

        Properties properties = new Properties();
        properties.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG,"127.0.01:9092");
        properties.setProperty(StreamsConfig.APPLICATION_ID_CONFIG,"demo-kafka-streams");
        properties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,AccountDeserializer.class.getName());
        properties.setProperty(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.LongSerde.class.getName());
        properties.setProperty(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, AccountSerde.class.getName());
        //create a topology

        StreamsBuilder streamsBuilder = new StreamsBuilder();

        KStream<Long, Account> inputTopic = streamsBuilder.stream("bank_account");

        KTable<Long, Long> aggregate = inputTopic.groupByKey().aggregate(
                () -> 0L,
                (key, current, oldBalance) -> current.getAmount() + oldBalance);

        aggregate.toStream().to("son");

        KafkaStreams streams = new KafkaStreams(streamsBuilder.build(),properties);

        streams.start();

        System.out.println(streams.toString());

        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));


    }

I have checked that my producer works and sends the objects. However, because of the error below I can't tell whether my aggregation code works. It gives me:

[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] ERROR org.apache.kafka.streams.errors.LogAndFailExceptionHandler - Exception caught during Deserialization, taskId: 0_0, topic: bank_account, partition: 0, offset: 0
java.lang.NullPointerException
    at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:63)
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
    at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:97)
    at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:117)
    at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:638)
    at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:936)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:831)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] State transition from RUNNING to PENDING_SHUTDOWN
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] Shutting down
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] INFO org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1-producer] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] INFO org.apache.kafka.streams.KafkaStreams - stream-client [demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89] State transition from RUNNING to ERROR
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] WARN org.apache.kafka.streams.KafkaStreams - stream-client [demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89] All stream threads have died. The instance will be in error state and should be closed.
[demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] INFO org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1] Shutdown complete
Exception in thread "demo-kafka-streams-9e3b0ab8-c021-4707-bf85-174e1356ea89-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:80)
    at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:97)
    at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:117)
    at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:638)
    at org.apache.kafka.streams.processor.internals.StreamThread.addRecordsToTasks(StreamThread.java:936)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:831)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
Caused by: java.lang.NullPointerException
    at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:63)
    at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
    ... 7 more

Upvotes: 1

Views: 5830

Answers (1)

OneCricketeer

Reputation: 191963

You're never initializing the serializer and deserializer fields in AccountSerde, so serializer() and deserializer() return null and you get the NPE.

You should also change the Serde type to your actual class:

public class AccountSerde implements Serde<Account> {

    // These are both null unless you initialize them 
    private AccountSerializer accountSerializer;
    private AccountDeserializer accountDeserializer;
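
As a rough sketch of the fix, the fields can be created eagerly so that serializer() and deserializer() never return null. To keep it compilable without Kafka or Gson on the classpath, the three interfaces below are simplified stand-ins for the ones in org.apache.kafka.common.serialization, Account is trimmed to its three long fields, and the fixed-width ByteBuffer encoding is a hypothetical replacement for the Gson JSON in the question. In the real application only the two initialized fields in AccountSerde should matter.

```java
import java.nio.ByteBuffer;

// Simplified stand-ins for Kafka's Serializer, Deserializer and Serde,
// declared here only so the sketch compiles without kafka-clients.
interface Serializer<T> { byte[] serialize(String topic, T data); }
interface Deserializer<T> { T deserialize(String topic, byte[] data); }
interface Serde<T> {
    Serializer<T> serializer();
    Deserializer<T> deserializer();
}

// Trimmed-down Account (the time field is left out for brevity).
final class Account {
    final long fromId;
    final long amount;
    final long toId;

    Account(long fromId, long amount, long toId) {
        this.fromId = fromId;
        this.amount = amount;
        this.toId = toId;
    }
}

// Hypothetical fixed-width binary encoding; the question uses Gson instead.
final class AccountSerializer implements Serializer<Account> {
    @Override
    public byte[] serialize(String topic, Account a) {
        return ByteBuffer.allocate(3 * Long.BYTES)
                .putLong(a.fromId).putLong(a.amount).putLong(a.toId)
                .array();
    }
}

final class AccountDeserializer implements Deserializer<Account> {
    @Override
    public Account deserialize(String topic, byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        // Fields are read back in the same order they were written.
        return new Account(buf.getLong(), buf.getLong(), buf.getLong());
    }
}

// The fix: both fields are initialized at construction time, so the
// accessors below always hand back a usable object instead of null.
final class AccountSerde implements Serde<Account> {
    private final AccountSerializer accountSerializer = new AccountSerializer();
    private final AccountDeserializer accountDeserializer = new AccountDeserializer();

    @Override
    public Serializer<Account> serializer() { return accountSerializer; }

    @Override
    public Deserializer<Account> deserializer() { return accountDeserializer; }
}
```

With the fields initialized like this, calling new AccountSerde().deserializer() yields a working deserializer, which is the object that was null in the stack trace from the question.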

Also, you'll need to fix the bootstrap server address in your Streams configuration. This value is not a well-formed IPv4 address (your producer uses 127.0.0.1:9092):

"127.0.01:9092"

Upvotes: 4
