quervernetzt
quervernetzt

Reputation: 11621

How to use JSON file formats in the context of Azure Data Lake Analytics respectively usql

I have a JSON input that looks like

{
    "sessionId": 1234,
    "deviceId": "MAC:1234",
    "IoTHub": {
        "MessageId": "1234-1234-1234-1234"
    }
}

How can I extract the values of sessionId, deviceId and MessageId in a Azure Datalake Analytics usql script?

Upvotes: 1

Views: 638

Answers (1)

quervernetzt
quervernetzt

Reputation: 11621

How to use JSON file formats in the context of Azure Data Lake Analytics respectively usql

Set Up process

  • Download the repo from [1]
    • Open the solution in .\Examples\DataFormats\Microsoft.Analytics.Samples.sln
    • Build the solution
    • Get .\Examples\DataFormats\Microsoft.Analytics.Samples.Formats\bin\Debug\Microsoft.Analytics.Samples.Formats.dll
    • Get .\Examples\DataFormats\Microsoft.Analytics.Samples.Formats\bin\Debug\Newtonsoft.Json.dll
  • In ADLS create a folder where to store the assemblies (e.g. .\assemblies)
    • Create a sample file in the folder if it disappears (happened to me)
  • With Visual Studio:
    • Add both files to the folder of the ADLA solution (e.g. .\lib\...)
    • Open Cloud Explorer, navigate to the ADLA Database -> Assemblies -> right click and register the assemblies
      • Store the assemblies in the folder created before in ADLS
  • Reference the assemblies as can be seen below

Usage

JSON Input Data

  • Use the Extractor as e.g. shown below
    • For more information see [2] and [3]

Resources

[1] GitHub Azure USQL

[2] GitHub Azure USQL DataFormats

[3] U-SQL - Extract data from json-array


U-SQL Script

DECLARE @localDevelopment bool = true;

IF @localDevelopment == true THEN
    DROP ASSEMBLY IF EXISTS [Newtonsoft.Json];
    DROP ASSEMBLY IF EXISTS [Microsoft.Analytics.Samples.Formats];
    CREATE ASSEMBLY [Newtonsoft.Json] FROM @"/lib/Newtonsoft.Json.dll";
    CREATE ASSEMBLY [Microsoft.Analytics.Samples.Formats] FROM @"/lib/Microsoft.Analytics.Samples.Formats.dll";
    DECLARE @input string = @"/data/input.json";
    DECLARE @output string = @"/data/output.csv";
ELSE
    DECLARE @input string = @"/data/input.json";
    DECLARE @output string = @"/data/output.csv";
END;

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

//Define schema of file, must map all columns
//Names must match keys
@extractDataFirstLevel = 
    EXTRACT sessionId int,
            deviceId string,
            IoTHub string
            //Date DateTime
    FROM @input
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@selectData =
    SELECT sessionId,
           deviceId,
           Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(IoTHub)["MessageId"] AS messageId
    FROM @extractDataFirstLevel;

OUTPUT @selectData
TO @output
USING Outputters.Csv();

Upvotes: 1

Related Questions