Structured Spark Streaming multiple writes

Question

I am using a data stream to be written to a kafka topic as well as hbase. For Kafka, I use a format as this:

dataset.selectExpr("id as key", "to_json(struct(*)) as value")
        .writeStream.format("kafka")
        .option("kafka.bootstrap.servers", Settings.KAFKA_URL)
        .option("topic", Settings.KAFKA_TOPIC2)
        .option("checkpointLocation", "/usr/local/Cellar/zookeepertmp")
        .outputMode(OutputMode.Complete())
        .start()

and then for Hbase, I do something like this:

  dataset.writeStream.outputMode(OutputMode.Complete())
    .foreach(new ForeachWriter[Row] {
      override def process(r: Row): Unit = {
        //my logic
      }

      override def close(errorOrNull: Throwable): Unit = {}

      override def open(partitionId: Long, version: Long): Boolean = {
        true
      }
    }).start().awaitTermination()

This writes to Hbase as expected but doesn't always write to the kafka topic. I am not sure why that is happening.

Structured Spark Streaming multiple writes

Answers (1)

Related Questions