JDev
JDev

Reputation: 1832

How to write result of streaming query to multiple database tables?

I am using spark structured streaming and reading from Kafka topic. The goal is to write the message to PostgreSQL database multiple tables.

The message schema is:

 root
  |-- id: string (nullable = true)
  |-- name: timestamp (nullable = true)
  |-- comment: string (nullable = true)
  |-- map_key_value: map (nullable = true)
    |-- key: string
    |-- value: string (valueContainsNull = true)

While writing to one table after dropping map_key_value works with below code:

My write code is:

message.writeStream.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    batchDF.write.format("jdbc").option("url", "url")
      .option("user", "username")
      .option("password", "password")
      .option(JDBCOptions.JDBC_TABLE_NAME, "table_1')
      .mode(SaveMode.Append).save();
  }.outputMode(OutputMode.Append()).start().awaitTermination()

I want to write the message to two DB tables table 1(id, name, comment) and tables 2 need have the map_key_value.

Upvotes: 2

Views: 1207

Answers (1)

Ged
Ged

Reputation: 18098

You will need N streaming queries for N sinks; t1 and t2 both count as a separate sink.

writeStream does not currently write to jdbc so you should use foreachBatch operator.

Upvotes: 1

Related Questions