reddy0505
reddy0505

Reputation: 23

Split one column into multiple column

I’m working on a power bi audit log report file. The file contains a column 'AuditDate' and it has multiple columns in it. I need to split that column into multiple columns using sql.

The column has values like this

AuditDate
------------
"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""[email protected]"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"

Basically I need to split this column into

id     RecordType      CreationTime    Operaration     OrganizationID  UserType
------------------------------------------------------------------------------
44de2468    20     2018-08-03T12:30:34   ViewReport     779558               0

Can anyone help with the sql query for this?

Upvotes: 1

Views: 341

Answers (3)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67291

With SQL-Server 2016 this is pretty simple. There is quite some JSON support. Your only problem is, that your string is not correct. Quite obviously there was an engine doubling all internal quotes (an escaping technique).

If this is under your control, you should try to change your column's format to proper JSON. Even better let the writing application supply these audits in a correct JSON format. At least you might add a second column and use a trigger to keep this synchronised. As a last resort you can use REPLACE to repair your string:

REPLACE(REPLACE(REPLACE(@YourString,'"{','{'),'}"','}'),'""','"');

With a lot of rows this might take a while... That's why it's better to keep the format in proper JSON.

Just to show the principles:

DECLARE @YourString NVARCHAR(MAX)=N'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""[email protected]"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"';

SET @YourString = REPLACE(REPLACE(REPLACE(@YourString,'"{','{'),'}"','}'),'""','"');

Your string will look like this now:

{"Id":"44de2468","RecordType":20,"CreationTime":"2018-08-03T12:30:34","Operation":"ViewReport","OrganizationId":"779558","UserType":0,"UserKey":"FFFA3DA","Workload":"PowerBI","UserId":"[email protected]","ClientIP":"9.5.3.26","UserAgent":"Mozilla\/5.0 (Windows NT 10.0;","Activity":"ViewReport","ItemName":"Sales","WorkSpaceName":"TeamITO","DatasetName":"Sales1","ReportName":"Sales1","WorkspaceId":"e8eaa0ca","ObjectId":"Sales1","DatasetId":"4c5d-ad45-eb6546","ReportId":"4cb0-99ad-de41b5160c47","IsSuccess":true,"DatapoolRefreshScheduleType":"None","DatapoolType":"Undefined"}

This query will return all your columns as a drived list:

SELECT * 
FROM OPENJSON(@YourString);

The result brings back a list with a type hint (while the actual type of "value" is nvarchar):

+-----------------------------+-------------------------------+------+
| key                         | value                         | type |
+-----------------------------+-------------------------------+------+
| Id                          | 44de2468                      | 1    |
+-----------------------------+-------------------------------+------+
| RecordType                  | 20                            | 2    |
+-----------------------------+-------------------------------+------+
| CreationTime                | 2018-08-03T12:30:34           | 1    |
+-----------------------------+-------------------------------+------+
| Operation                   | ViewReport                    | 1    |
+-----------------------------+-------------------------------+------+
| OrganizationId              | 779558                        | 1    |
+-----------------------------+-------------------------------+------+
| UserType                    | 0                             | 2    |
+-----------------------------+-------------------------------+------+
| UserKey                     | FFFA3DA                       | 1    |
+-----------------------------+-------------------------------+------+
| Workload                    | PowerBI                       | 1    |
+-----------------------------+-------------------------------+------+
| UserId                      | [email protected]                  | 1    |
+-----------------------------+-------------------------------+------+
| ClientIP                    | 9.5.3.26                      | 1    |
+-----------------------------+-------------------------------+------+
| UserAgent                   | Mozilla/5.0 (Windows NT 10.0; | 1    |
+-----------------------------+-------------------------------+------+
| Activity                    | ViewReport                    | 1    |
+-----------------------------+-------------------------------+------+
| ItemName                    | Sales                         | 1    |
+-----------------------------+-------------------------------+------+
| WorkSpaceName               | TeamITO                       | 1    |
+-----------------------------+-------------------------------+------+
| DatasetName                 | Sales1                        | 1    |
+-----------------------------+-------------------------------+------+
| ReportName                  | Sales1                        | 1    |
+-----------------------------+-------------------------------+------+
| WorkspaceId                 | e8eaa0ca                      | 1    |
+-----------------------------+-------------------------------+------+
| ObjectId                    | Sales1                        | 1    |
+-----------------------------+-------------------------------+------+
| DatasetId                   | 4c5d-ad45-eb6546              | 1    |
+-----------------------------+-------------------------------+------+
| ReportId                    | 4cb0-99ad-de41b5160c47        | 1    |
+-----------------------------+-------------------------------+------+
| IsSuccess                   | true                          | 3    |
+-----------------------------+-------------------------------+------+
| DatapoolRefreshScheduleType | None                          | 1    |
+-----------------------------+-------------------------------+------+
| DatapoolType                | Undefined                     | 1    |
+-----------------------------+-------------------------------+------+

And even better, you can add a WITH clause like here:

SELECT * 
FROM OPENJSON(@YourString)
WITH 
(   
    Id             varchar(200)  '$.Id',  
    RecordType     int           '$.RecordType',  
    CreationTime   datetime      '$.CreationTime'
    --Add all your known columns here...
)

Doing so you get your values typed and side-by-side

+----------+------------+-------------------------+
| Id       | RecordType | CreationTime            |
+----------+------------+-------------------------+
| 44de2468 | 20         | 2018-08-03 12:30:34.000 |
+----------+------------+-------------------------+

Upvotes: 1

Eric Brandt
Eric Brandt

Reputation: 8101

It looks like you're dealing with a malformed JSON column here. Those double-double quotes are a bother.

If you can clean up the formatting, though, you can just use JSON functions in your query.

First, set up the data (using the data you provided in your other copy of this question,(Split column values into multiple columns):

DECLARE @t TABLE 
(
  RecordType NVARCHAR(20)
  ,AuditDate NVARCHAR(MAX)
);
INSERT @t
  (
    RecordType
    ,AuditDate
  )
VALUES
  ('View', '{""Id"":""44de2468"",""Type"":20,""CreationDate"":""2018-08-23""}')
 ,('Edit', '{""Id"":""44de2467"",""Type"":40,""CreationDate"":""2018-08-24""}')
 ,('Print', '{""Id"":""44de2768"",""Type"":60,""CreationDate"":""2018-05-06""}')
 ,('Delete', '{""Id"":""44de2488"",""Type"":30,""CreationDate"":""2018-07-20""}');

Now, clean up the malformed JSONs by replacing the double-double quotes with single double quotes.

UPDATE @t
SET AuditDate = REPLACE(AuditDate,'""','"');

Validate the JSONs look good.

SELECT * FROM @t

--Results:
+------------+---------------------------------------------------------+
| RecordType |                        AuditDate                        |
+------------+---------------------------------------------------------+
| View       | {"Id":"44de2468","Type":20,"CreationDate":"2018-08-23"} |
| Edit       | {"Id":"44de2467","Type":40,"CreationDate":"2018-08-24"} |
| Print      | {"Id":"44de2768","Type":60,"CreationDate":"2018-05-06"} |
| Delete     | {"Id":"44de2488","Type":30,"CreationDate":"2018-07-20"} |
+------------+---------------------------------------------------------+

Then use JSON_VALUE() to extract the parts you're interested in.

SELECT 
    RecordType
  , JSON_VALUE(AuditDate, '$.Id') AS [Id]
  , JSON_VALUE(AuditDate, '$.Type') AS [Type]
  , JSON_VALUE(AuditDate, '$.CreationDate') AS CreationDate
FROM @t

--Results
+------------+----------+------+--------------+
| RecordType |    Id    | Type | CreationDate |
+------------+----------+------+--------------+
| View       | 44de2468 |   20 | 2018-08-23   |
| Edit       | 44de2467 |   40 | 2018-08-24   |
| Print      | 44de2768 |   60 | 2018-05-06   |
| Delete     | 44de2488 |   30 | 2018-07-20   |
+------------+----------+------+--------------+

Upvotes: 2

Alan Burstein
Alan Burstein

Reputation: 7918

This is pretty simple, all you need is a string "splitter" (AKA tokenizer). If you are on SQL 2016+ you can use STRING_SPLIT; if you're on a pre-2016 system you can use DelimitedSplit8K on 2005+ or DelimitedSplit8K_LEAD on 2012+. The solution would look like this:

DECLARE @AuditDate VARCHAR(8000) = 
'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""[email protected]"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"'

SELECT 
  Id             = MAX(CASE split.attrib WHEN 'ID'             THEN split.val END),
  RecordType     = MAX(CASE split.attrib WHEN 'RecordType'     THEN split.val END),
  CreationTime   = MAX(CASE split.attrib WHEN 'CreationTime'   THEN split.val END),
  Operation      = MAX(CASE split.attrib WHEN 'Operation'      THEN split.val END),
  OrganizationId = MAX(CASE split.attrib WHEN 'OrganizationId' THEN split.val END),
  UserType       = MAX(CASE split.attrib WHEN 'UserType'       THEN split.val END)
FROM
(
  SELECT      attrib = REPLACE(REPLACE(SUBSTRING(split.value, 1, mid.point-1),'{',''),'"',''),
              val    = REPLACE(REPLACE(SUBSTRING(split.value, mid.point+1, 8000),'{',''),'"','')
  FROM        STRING_SPLIT(@AuditDate,',') AS split
  CROSS APPLY (VALUES(CHARINDEX(':', split.value))) AS mid(point)
  WHERE       REPLACE(REPLACE(SUBSTRING(split.value, 1, mid.point-1),'{',''),'"','') IN
                ('id','RecordType','CreationTime','Operation','OrganizationID','UserType')
) AS split;

Results:

Id         RecordType  CreationTime          Operation   OrganizationId  UserType
---------- ----------- --------------------- ----------- --------------- ---------
44de2468   20          2018-08-03T12:30:34   ViewReport  779558          0

Upvotes: 2

Related Questions