coryseaman
coryseaman

Reputation: 387

Using JSON path in tabular input data for JSON output schema in an Azure Data Factory flow

I've seen how to use the Aggregate transformation in an ADF Data Flow along with a static hierarchy specified in a Derived Columns transformation and the collect() function to output custom JSON.

What I want to do is a little more dynamic. If my tabular data contains one column with a single attribute value, and another column with a string representing the JSON path to which I want the first column's attribute value to be output in the JSON schema, how would this sort of transformation be achieved?

Table Input:

CREATE TABLE EAV (EntityID int, AttributePath nvarchar(500), AttributeValue nvarchar(max))

INSERT EAV VALUES
(1, 'user.firstName','John'),
(1, 'user.lastName','Doe'),
(1, 'user.address.city','Pittsburgh'),
(1, 'user.address.state','Pennsylvania'),
(2, 'user.firstName','Jane'),
(2, 'user.lastName','Doe'),
(2, 'user.address.city','Pittsburgh'),
(2, 'user.address.state','Pennsylvania');

JSON Output:

{
    "user": {
        "id" : 1
        "firstName": "John",
        "lastName": "Doe",
        "address": {
            "city": "Pittsburgh",
            "state": "Pennsylvania"
        }
    },
    "user": {
        "id" : 2
        "firstName": "Jane",
        "lastName": "Doe",
        "address": {
            "city": "Pittsburgh",
            "state": "Pennsylvania"
        }
    },
}

Upvotes: 0

Views: 117

Answers (1)

Charlieface
Charlieface

Reputation: 71144

SQL Server doesn't deal well with dynamic JSON keys. You can use dynamic SQL for this

Basically, we take the distinct path expression, pivot them out, and select them using FOR JSON PATH

DECLARE @cols nvarchar(max) = (
    SELECT STRING_AGG(QUOTENAME(AttributePath),',')
    FROM (SELECT DISTINCT AttributePath FROM EAV) EAV
);

DECLARE @sql nvarchar(max) = N'
SELECT
    id = EntityID,
    ' + @cols + '
FROM EAV
PIVOT (
    MAX(AttributeValue) FOR AttributePath IN (
        ' + @cols + '
    )
) pvt
FOR JSON PATH;
';

EXEC sp_executesql @sql;

Result

[
   {
      "id":1,
      "user":{
         "address":{
            "city":"Pittsburgh",
            "state":"Pennsylvania"
         },
         "firstName":"John",
         "lastName":"Doe"
      }
   },
   {
      "id":2,
      "user":{
         "address":{
            "city":"Pittsburgh",
            "state":"Pennsylvania"
         },
         "firstName":"Jane",
         "lastName":"Doe"
      }
   }
]

db<>fiddle

Upvotes: 1

Related Questions