reith

Reputation: 2088

Consume a Spark SQL Dataset as an RDD-based job

Spark DataFrames have a toRDD() method, but I don't understand how it's useful. Can we start a SQL streaming job by converting the source Dataset to an RDD and processing that, instead of creating and starting a DataStreamWriter?

Upvotes: 1

Views: 109

Answers (1)

Alper t. Turker

Reputation: 35229

Dataset provides a uniform API for both batch and streaming processing, and not every method is applicable to streaming Datasets. If you search carefully, you'll find other methods that cannot be used with streaming Datasets (for example, describe).

Can we start a SQL streaming job by processing converted source dataset to RDD instead of making and starting DataStreamWriter?

We cannot. What starts in Structured Streaming stays in Structured Streaming. Conversions to RDD are not allowed.
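To see this for yourself, here is a minimal sketch, assuming a local Spark 2.3+ session and the built-in "rate" test source. The object name StreamingRddSketch and the attempt helper are made up for illustration. Both describe and rdd force Spark to plan the query as a batch job, so each call is expected to fail with an AnalysisException telling you to use writeStream.start().

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}
import scala.util.{Failure, Success, Try}

// Hypothetical demo object; not part of Spark.
object StreamingRddSketch {

  // Illustrative helper: run `body` and report whether Spark rejected it.
  def attempt(label: String)(body: => Any): Unit =
    Try(body) match {
      case Failure(e: AnalysisException) =>
        println(s"$label rejected: ${e.getMessage}")
      case Failure(e) =>
        throw e // anything else is a real problem, so let it surface
      case Success(_) =>
        println(s"$label unexpectedly succeeded")
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("streaming-rdd-sketch")
      .getOrCreate()

    // The "rate" source generates (timestamp, value) rows for testing.
    val stream = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "1")
      .load()

    // True for any Dataset that has a streaming source.
    println(s"isStreaming = ${stream.isStreaming}")

    // Batch-only operations on a streaming Dataset.
    attempt("describe") { stream.describe().collect() }
    attempt("rdd") { stream.rdd.count() }

    spark.stop()
  }
}
```

Checking isStreaming first is a cheap way to guard code that has to accept both batch and streaming Datasets.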

Upvotes: 1
