Trincão
Trincão

Reputation: 33

Spark reading from Azure Eventhub => StreamingQueryException: Input byte array has wrong 4-byte ending unit

I'm trying to collect Azure Eventhub messages using Spark/Python. Every time, I get the exception "StreamingQueryException: Input byte array has wrong 4-byte ending unit"

Any ideas please?

conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://XXXXXXXXX.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXX=;EntityPath=XXXXXX"
                                      
read_df  = spark.readStream.format("eventhubs").options(**conf).load()
stream = read_df.writeStream.format("console").start()
stream.awaitTermination()

Upvotes: 3

Views: 2767

Answers (1)

tintaglia
tintaglia

Reputation: 76

Note that for Version 2.3.15 and above, you need to encrypt the connection string in the configuration dictionary:

ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)

https://github.com/Azure/azure-event-hubs-spark/blob/master/docs/PySpark/structured-streaming-pyspark.md#event-hubs-configuration

Upvotes: 6

Related Questions