awisha
awisha

Reputation: 93

Get stream of data from mqtt using python(pyspark) in spark version 2.2.0

I added the dependency in spark's POM.xml, given in the following link:

http://bahir.apache.org/docs/spark/current/spark-sql-streaming-mqtt/

Built spark using maven again. But as we can see it shows only Java and Scala supports to get data from mqtt.

I want to get stream data from mqtt in python. In earlier versions we had a pyspark.streaming.mqtt for the same. What is the similar to this in spark 2.2.0 pyspark. I am using mosquitto for mqtt broker.

Upvotes: 5

Views: 1307

Answers (1)

Alper t. Turker
Alper t. Turker

Reputation: 35249

For PySpark you can use Structured Streaming bindings (you have to include Bahir jar):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # type: SparkSession
(spark
    .readStream
    .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
    .load("tcp://{}".format(broker_uri)))

Upvotes: 2

Related Questions