Reputation: 33
I'm trying to collect Azure Eventhub messages using Spark/Python. Every time, I get the exception "StreamingQueryException: Input byte array has wrong 4-byte ending unit"
Any ideas please?
conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://XXXXXXXXX.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXX=;EntityPath=XXXXXX"
read_df = spark.readStream.format("eventhubs").options(**conf).load()
stream = read_df.writeStream.format("console").start()
stream.awaitTermination()
Upvotes: 3
Views: 2767
Reputation: 76
Note that for Version 2.3.15 and above, you need to encrypt the connection string in the configuration dictionary:
ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)
Upvotes: 6