nihil0

Reputation: 369

CSV reference data in Azure Stream Analytics

I have a Stream Analytics application whose events are JSON-encoded and look like this:

{customerID: 45, state:"S2" , timestamp:"2017-06-06 14:19:21.77"}
{customerID: 74, state:"S2" , timestamp:"2017-06-06 14:19:26.61"}
{customerID: 79, state:"S2" , timestamp:"2017-06-06 14:19:28.50"}
{customerID: 10, state:"D" , timestamp:"2017-06-06 14:19:31.79"}
{customerID: 70, state:"S2" , timestamp:"2017-06-06 14:19:31.93"}
{customerID: 37, state:"S2" , timestamp:"2017-06-06 14:19:32.17"}
{customerID: 41, state:"D" , timestamp:"2017-06-06 14:19:33.48"}

I have reference data for customers in a CSV file that looks like this:

"CUST_ID", "Age", "Rich"
1, "50", "Y"
2, "22", "N"

I load data files in the formats above and test the following query:

SELECT A.[CUSTOMERID], A.[state], B.[AGE], B.[GENDER_CODE]
FROM clickstream A TIMESTAMP BY A.[TIMESTAMP]
LEFT JOIN refdata B
    ON A.[CUSTOMERID] = B.[CUST_ID]

I get the following error message with no details: error

The same query works perfectly if the reference data is represented as JSON. Is there a working example with CSV reference data that I can look at?

Upvotes: 3

Views: 1880

Answers (3)

natpa

Reputation: 31

I had the same problem with stream analytics.

Just FYI, converting the .csv file to a .json file (using an online tool, for example) and changing the reference data serialization format to JSON solved the problem. Now I can test the query by submitting the JSON file through the Azure portal query tool. It seems the CSV handling in Stream Analytics is buggy, as I'm sure that my CSV was correctly formatted.
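
If you would rather not use an online tool, the conversion can be scripted. Below is a minimal Python sketch, not an official Azure utility: the function name `csv_to_json` and the `int_columns` parameter are hypothetical, and casting `CUST_ID` to an integer is my assumption so that it has the same type as the numeric `customerID` in the events. It uses `skipinitialspace=True` because the sample CSV in the question has a space after each comma, in front of the quoted fields.

```python
import csv
import io
import json


def csv_to_json(csv_text, int_columns=("CUST_ID",)):
    """Convert CSV reference data (as text) to a JSON array string.

    int_columns is a hypothetical parameter: columns listed there are
    cast to int; everything else stays a string.
    """
    # skipinitialspace=True lets the parser accept `1, "50", "Y"`,
    # where a space follows the delimiter before the opening quote.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    rows = []
    for row in reader:
        record = {}
        for key, value in row.items():
            value = value.strip()
            record[key.strip()] = int(value) if key in int_columns else value
        rows.append(record)
    return json.dumps(rows, indent=2)


sample = '"CUST_ID", "Age", "Rich"\n1, "50", "Y"\n2, "22", "N"\n'
print(csv_to_json(sample))
```

The sketch assumes every row has the same number of fields as the header. Check which JSON layout your reference input expects before uploading the result.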

Upvotes: 1

Ivan Fateev

Reputation: 1061

The "Test" button with reference data didn't work for me at all. I had to use the Azure Stream Analytics tools for Visual Studio to test it in a local environment.

I also had an error in the path pattern: my blobs used YYYY-MM-DD, while Stream Analytics used YYYY/MM/DD by default. When the job was running, I saw an exclamation mark next to the input in the Stream Analytics dashboard.

To debug the error, click on the reference data input in the Stream Analytics overview; the warnings are shown on its blade.

My warning looked like this: Initializing input without a valid reference data blob for UTC time 11/14/2017 12:22:32 PM, example path: 'https://blabla.blob.core.windows.net/cohorts/2017/11/14/12-22/result.csv'
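
To make the mismatch concrete, here is a rough Python sketch of how a path pattern with `{date}` and `{time}` tokens expands for a given UTC time. It only imitates the substitution and does not call any Azure API. The function `resolve_reference_path`, the pattern `cohorts/{date}/{time}/result.csv`, and the `HH-mm` time format are assumptions inferred from the example path in the warning above.

```python
from datetime import datetime


def resolve_reference_path(pattern, when,
                           date_format="YYYY/MM/DD", time_format="HH-mm"):
    """Expand {date} and {time} in a path pattern (illustrative only)."""
    # Substitute the format placeholders with zero-padded UTC components.
    date_part = (date_format
                 .replace("YYYY", f"{when:%Y}")
                 .replace("MM", f"{when:%m}")
                 .replace("DD", f"{when:%d}"))
    time_part = (time_format
                 .replace("HH", f"{when:%H}")
                 .replace("mm", f"{when:%M}"))
    return pattern.replace("{date}", date_part).replace("{time}", time_part)


# Hypothetical pattern, inferred from the warning's example path.
pattern = "cohorts/{date}/{time}/result.csv"
when = datetime(2017, 11, 14, 12, 22, 32)

# Default date format: the path the job was looking for.
print(resolve_reference_path(pattern, when))
# Date format matching blobs stored under YYYY-MM-DD folders.
print(resolve_reference_path(pattern, when, date_format="YYYY-MM-DD"))
```

With the default format the first call yields `cohorts/2017/11/14/12-22/result.csv`, which matches the example path in the warning, while blobs stored under `cohorts/2017-11-14/...` are only found once the date format on the input is changed to match.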

Upvotes: 0

Fei Han

Reputation: 27793

I have reference data for customers in a CSV file

When we create an input, we need to specify the Event serialization format, which tells Stream Analytics which serialization format (JSON, CSV, or Avro) the incoming data uses. Please check the Event serialization format of your refdata input and make sure it is set to CSV.

Upvotes: 1
