Reputation: 13
Ok, here we go. My first ever post and question.
I'm using SQL server 2019. I have a table, im interested in two columns from this table.
Both columns contain array-like data (they are both a set of strings, comma delimited. I want to take each substring from the columns and show them alone in their own 'cell' and instead have many rows.
Now, I can get it to work with one of the columns using STRING_SPLIT. However, when I use string split on both columns, I dont get the order I need. i.e for example see below
column1: 'my_fav_drink','my_fav_colour','my_fav_avenger'
column2: 'coffee','blue','dr_strange'
The position of the strings will always match, so the 3rd entry in column 1 will always relate to the 3rd entry in column 2.
But when I run the string split, the orders get messed up
I have tried self joins, sub selects and messed around with order by's but I just can't seem to get it working in order.
I have searched online for while, so Im not just coming to the community without first trying, and I will appreciate any help or guidance, and I looked here for the answer of course but I have not had any luck.
Update: Please consider the following:
This was not built by me, I'm diving into this without a tech specification having to go through the DB, i'm sure some of you will know my pain. If I get approval, I will re-design this but I can't for now (I could make another table though). The data comes in automatically from a machine like this as a list of strings
There are about 50 values in each column
Desired output is:
The whole point of this is so that I can do analysis in another software (again, was a request) but the array can't be sorted out in the software.
Thanks guys,
Much appreciated
Upvotes: 0
Views: 1140
Reputation: 11
SELECT d.column1,b.value as Label
FROM Data d
CROSS APPLY string_split(cast(d.column1 as varchar(max)),',')b
SELECT d.column2,c.value as Value
FROM Data d
CROSS APPLY string_split(cast(d.column2 as varchar(max)),',')c
Try this approach
Upvotes: 1
Reputation: 29983
Probably the best solution is to change the design, but if you need to parse the input data, you may a try a JSON-based approach. The idea is to transform the values from column1
and column2
columns into valid JSON arrays (my_fav_drink, my_fav_colour, my_fav_avenger
into ["my_fav_drink", "my_fav_colour", "my_fav_avenger"]
) and parse the arrays using OPENJSON()
and default schema. The result from the OPENJSON()
execution is a table with columns key
, value
and type
and the values in the key
column are the indexes of the elements in the input array.
Table:
CREATE TABLE Data (
column1 varchar(100),
column2 varchar(100)
)
INSERT INTO Data (column1, column2)
VALUES ('my_fav_drink,my_fav_colour,my_fav_avenger', 'coffee,blue,dr_strange')
Statement:
SELECT CONVERT(int, j1.[key]) + 1 AS [Number], j1.[value] AS [Label], j2.[value] AS [Value]
FROM Data d
CROSS APPLY OPENJSON(CONCAT('["', REPLACE(STRING_ESCAPE(d.column1, 'json'), ',', '","'), '"]')) j1
CROSS APPLY OPENJSON(CONCAT('["', REPLACE(STRING_ESCAPE(d.column2, 'json'), ',', '","'), '"]')) j2
WHERE j1.[key] = j2.[key]
Result:
Number Label Value
1 my_fav_drink coffee
2 my_fav_colour blue
3 my_fav_avenger dr_strange
Upvotes: 1
Reputation: 1270361
This is a bit long for a comment.
Your method of storing values is really, really bad. For instance, guaranteeing that the two columns have the same number of values is troublesome. And then imagine what happens if one of the values could have a comma!
There are basically four alternatives.
NULL
values. That's fine. If you want to get fancy, you might have different sets of columns stored in different tables.NULL
.Note that your method of storing multiple values in a string with headers in another string is not on this list.
Upvotes: 0