Splitting multiple fields by delimiter

Question

I have to write an SP that performs partial updates on our databases. The changes are stored as records in the PU table: a Values field contains all the values, separated by a fixed delimiter, and a Table field refers to a Schemes table, which holds the column names for each table in the same delimited fashion in a Columns field.

Now for my SP I need to split the Values field and the Columns field into a temp table of column/value pairs. This happens for each record in the PU table.

An example:

Our PU table looks something like this:

CREATE TABLE [dbo].[PU](
    [Table] [nvarchar](50) NOT NULL,
    [Values] [nvarchar](max) NOT NULL
)

Insert SQL for this example:

INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','John Doe;26');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Jane Doe;22');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Mike Johnson;20');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Person','Mary Jane;24');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','Mathematics');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','English');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Course','Geography');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus A;Schools Road 1;Educationville');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus B;Schools Road 31;Educationville');
INSERT INTO [dbo].[PU]([Table],[Values]) VALUES ('Campus','Campus C;Schools Road 22;Educationville');

And we have a Schemes table similar to this:

CREATE TABLE [dbo].[Schemes](
    [Table] [nvarchar](50) NOT NULL,
    [Columns] [nvarchar](max) NOT NULL
)

Insert SQL for this example:

INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Person','[Name];[Age]');
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Course','[Name]');
INSERT INTO [dbo].[Schemes]([Table],[Columns]) VALUES ('Campus','[Name];[Address];[City]');

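As a result the first record of the PU table should result in a temp table like:

Column     Value
[Name]     John Doe
[Age]      26

The 5th will have:

Column     Value
[Name]     Mathematics

Finally, the 8th PU record should result in:

Column     Value
[Name]     Campus A
[Address]  Schools Road 1
[City]     Educationville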

You get the idea. I tried to use the following query to create the temp tables, but alas it fails when there's more than one value in the PU record:

DECLARE @Fields TABLE
(
    [Column] VARCHAR(MAX),
    [Value] VARCHAR(MAX)
)

INSERT INTO @Fields
    SELECT TOP 1
        (SELECT Value FROM STRING_SPLIT([dbo].[Schemes].[Columns], ';')), 
        (SELECT Value FROM STRING_SPLIT([dbo].[PU].[Values], ';'))
    FROM [dbo].[PU] INNER JOIN [dbo].[Schemes] ON [dbo].[PU].[Table] = [dbo].[Schemes].[Table]

TOP 1 correctly gets the first PU record as each PU record is removed once processed.

The error is:

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

In the case of a Person record, the splits do indeed return 2 columns/values at a time. I just want to store them as 2 records instead of getting an error.
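
The error can be reproduced in isolation, which may make the cause easier to see. The sketch below uses literal strings instead of the PU and Schemes tables; it only assumes that STRING_SPLIT is available, as in the query above.

```sql
-- Works: one element, so the scalar subquery returns a single row.
SELECT (SELECT value FROM STRING_SPLIT('Mathematics', ';')) AS OnlyValue;

-- Fails with "Subquery returned more than 1 value": two elements, two rows.
-- SELECT (SELECT value FROM STRING_SPLIT('John Doe;26', ';')) AS OnlyValue;

-- Used as a table source in FROM, each element becomes its own row instead.
SELECT s.value
FROM STRING_SPLIT('John Doe;26', ';') s;
```

A subquery in the SELECT list has to produce a single value, while a splitter is a table source that produces one row per element, so it belongs in the FROM clause.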

Any help on rewriting the above query?

Also note that the data is just generic nonsense. The point of the question is being able to take 2 fields that both hold delimited values, always equal in number (e.g. a 'Person' in the PU table will always have 2 delimited values in the field), and break them up into several column/value rows.

UPDATE: Working implementation

Based on the (accepted) answer of Sean Lange, I was able to work out the following implementation to overcome the issue:

As I need to reuse it, combining the columns and values is done by a new function, declared as such:

CREATE FUNCTION [dbo].[JoinDelimitedColumnValue]
        (@splitValues VARCHAR(8000), @splitColumns VARCHAR(8000),@pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
  WITH MyValues AS
(
    SELECT ColumnPosition = x.ItemNumber,
        ColumnValue = x.Item
    FROM  dbo.DelimitedSplit8K(@splitValues, @pDelimiter) x
)

, ColumnData AS
(
    SELECT ColumnPosition = x.ItemNumber,
        ColumnName = x.Item
    FROM  dbo.DelimitedSplit8K(@splitColumns, @pDelimiter) x
)

SELECT cd.ColumnName,
    v.ColumnValue
FROM MyValues v
JOIN ColumnData cd ON cd.ColumnPosition = v.ColumnPosition
;

In case of the above sample data, I'd call this function with the following SQL:

-- Assumption: the delimiter is ';', as in the sample data.
DECLARE @FieldValues VARCHAR(8000), @FieldColumns VARCHAR(8000), @Delimiter CHAR(1) = ';'
SELECT TOP 1 @FieldValues = [dbo].[PU].[Values], @FieldColumns = [dbo].[Schemes].[Columns]
FROM [dbo].[PU] INNER JOIN [dbo].[Schemes] ON [dbo].[PU].[Table] = [dbo].[Schemes].[Table]

INSERT INTO @Fields
SELECT [Column] = x.[ColumnName],[Value] = x.[ColumnValue] FROM [dbo].[JoinDelimitedColumnValue](@FieldValues, @FieldColumns, @Delimiter) x

Sean Lange · Accepted Answer

This data structure makes this way more complicated than it should be. You can leverage the splitter from Jeff Moden (http://www.sqlservercentral.com/articles/Tally+Table/72993/). The main difference between that splitter and all the others is that his returns the ordinal position of each element. Why all the other splitters don't do this is beyond me. For things like this it is needed: you have two sets of delimited data and you must ensure that they are both reassembled in the correct order.
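
To illustrate what "ordinal position" means here, the following sketch splits one sample value. The first query assumes dbo.DelimitedSplit8K from the linked article has been installed. The second is an alternative that only applies on SQL Server 2022 or Azure SQL Database, where STRING_SPLIT accepts an optional third argument that adds an ordinal column.

```sql
-- Assumes dbo.DelimitedSplit8K from the linked article is installed.
-- Returns one row per element together with its 1-based position:
-- ItemNumber 1 is 'John Doe', ItemNumber 2 is '26'.
SELECT x.ItemNumber, x.Item
FROM dbo.DelimitedSplit8K('John Doe;26', ';') x;

-- Alternative sketch, assuming SQL Server 2022 or Azure SQL Database:
-- passing 1 as the third argument makes STRING_SPLIT return an ordinal column.
SELECT s.ordinal, s.value
FROM STRING_SPLIT('John Doe;26', ';', 1) s;
```

Either way, the position column is what allows the nth value to be matched with the nth column name.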

The biggest issue I see is that you don't have anything in your main table to function as an anchor for ordering the results correctly. You need something, even an identity, to ensure the output rows stay "together". To accomplish this I just added an identity to the PU table.

alter table PU add RowOrder int identity not null

Now that we have an anchor, this is still a little cumbersome for what should be a simple query, but it is achievable.

Something like this will now work.

with MyValues as
(
    select p.[Table]
        , ColumnPosition = x.ItemNumber
        , ColumnValue = x.Item
        , RowOrder
    from PU p
    cross apply dbo.DelimitedSplit8K(p.[Values], ';') x
)

, ColumnData as
(
    select ColumnName = replace(replace(x.Item, ']', ''), '[', '') 
        , ColumnPosition = x.ItemNumber
        , s.[Table]
    from Schemes s
    cross apply dbo.DelimitedSplit8K(s.Columns, ';') x
)

select cd.[Table]
    , v.ColumnValue
    , cd.ColumnName
from MyValues v
join ColumnData cd on cd.[Table] = v.[Table] 
    and cd.ColumnPosition = v.ColumnPosition
order by v.RowOrder
    , v.ColumnPosition
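
If only one PU record should be processed at a time, as the question describes, the same idea can be narrowed to a single row. This is a minimal sketch rather than a tested implementation. It assumes the RowOrder identity column added above, the dbo.DelimitedSplit8K function, and a @Fields table variable shaped like the one in the question, with both columns holding text.

```sql
-- Hypothetical table variable mirroring the one in the question.
DECLARE @Fields TABLE
(
    [Column] VARCHAR(MAX),
    [Value] VARCHAR(MAX)
);

-- Pick the oldest unprocessed PU record using the RowOrder anchor.
WITH FirstRow AS
(
    SELECT TOP (1) p.[Table], p.[Values]
    FROM PU p
    ORDER BY p.RowOrder
)
INSERT INTO @Fields ([Column], [Value])
SELECT c.Item, v.Item
FROM FirstRow f
JOIN Schemes s ON s.[Table] = f.[Table]
-- Note: DelimitedSplit8K takes VARCHAR(8000), so longer input is truncated.
CROSS APPLY dbo.DelimitedSplit8K(f.[Values], ';') v
CROSS APPLY dbo.DelimitedSplit8K(s.[Columns], ';') c
WHERE c.ItemNumber = v.ItemNumber;  -- pair the nth column name with the nth value

SELECT [Column], [Value] FROM @Fields;
```

With the sample data, and assuming the identity values follow the insert order shown in the question, the first record is the 'John Doe;26' Person row, which should produce one row for [Name] and one for [Age].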
