pju
pju

Reputation: 11

Using STRING_SPLIT in SQL for delimited values

@InStr = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0'

I have the string above in a variable @InStr and I want to use STRING_SPLIT to inserts values into a table.

As you can see its a double split.

SELECT Value FROM STRING_SPLIT(@InStr,'^') 

Produces:

0|ABC|3033.9|3032.4444|0|0|0
1|DEF|3033.2577|3033.053|3032.0808|0|0
2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0
3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0

Which is good, now I need to take each row and insert into a table.

I'm not sure how to combine the 2 splits to do the insert. The table has 7 columns which it would populate.

Any help appreciated.

Upvotes: 1

Views: 5053

Answers (5)

Aaron Bertrand
Aaron Bertrand

Reputation: 280644

Instead of passing a string from .NET like 'a|b|c^d|e|f' and then having to parse it, leave it in its original structure (DataTable?) and create a table type in SQL Server. Then you can pass in your structure instead of this cobbled-together string.

In SQL Server:

CREATE TYPE dbo.MyTableType AS TABLE
(
  ColumnA int,
  ColumnB nvarchar(32), 
  ...
);
GO

CREATE PROCEDURE dbo.ShowArray
  @DataTable dbo.MyTableType
AS
BEGIN
  SET NOCOUNT ON;

  SELECT ColumnA, ColumnB, ...
    FROM @DataTable;
END

In C# (untested and incomplete):

DataTable dt = new DataTable();
dt.Columns.Add("ColumnA", typeof(Int32));
dt.Columns.Add("ColumnB", typeof(String));
...

DataRow dr = dt.NewRow();  
dr[0] = 1;  
dr[1] = "foo";
...

dt.Rows.Add(dr);
...

  SqlCommand cmd = new SqlCommand("dbo.ShowArray", connectionObject);
  cmd.CommandType = CommandType.StoredProcedure;
  SqlParameter tvp1 = c2.Parameters.AddWithValue("@DataTable", dt);
  tvp1.SqlDbType = SqlDbType.Structured;
  ...

More on this shift away from splitting strings here and, actually, in this answer as well:

Upvotes: 2

Venkataraman R
Venkataraman R

Reputation: 13009

I am generating INSERT statement and then executing it. First I am splitting the string and then I am generating INSERT statement.

Note:

  • I am assuming that second column will be three letter code.
  • I am assuming that sort order of rows doesn't matter
declare @instr varchar(max) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0'
;
declare @insertStmt VARCHAR(max) ='INSERT INTO TABLEName VALUES '+ CHAR(13) + CHAR(10);
SELECT @insertStmt +=  CONCAT('(',replace(stuff(stuff(value,3,0,''''),7,0,''''),'|',','),'),') 
from STRING_SPLIT(@instr,'^') 
SELECT @insertStmt = STUFF(@insertStmt,len(@insertStmt),1,'')
select @insertStmt
EXEC(@insertStmt)
INSERT INTO TABLEName VALUES 
(0,'ABC',3033.9,3032.4444,0,0,0),(1,'DEF',3033.2577,3033.053,3032.0808,0,0),(2,'JHI',3032.8376,3033.2596,3033.2259,3033.322,0),(3,'XYZ',3032.8376,3032.8376,3032.8376,3032.8376,0)

Upvotes: 0

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67341

First of all: You should avoid STRING_SPLIT() in almost any case. It does not guarantee to return the items in the expected sort order. This might work in all your tests and break in production with silly hardly to find errors.

There are various answers already, the best one should be the table type parameter. But (if you cannot follow this route), I'd like to suggest two type-safe approaches:

DECLARE @InStr NVARCHAR(MAX) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0';

--xml approach (working for almost any version)
--We do the double split in one single action and return a nested XML with <x> and <y> elements
--We can fetch the values type-safe from their 1-based position:

SELECT x.value('y[1]','int')           AS [First]
      ,x.value('y[2]','varchar(100)')  AS [Second]
      ,x.value('y[3]','decimal(28,8)') AS Third
      ,x.value('y[4]','decimal(28,8)') AS Fourth
      ,x.value('y[5]','decimal(28,8)') AS Fifth
      ,x.value('y[6]','decimal(28,8)') AS Sixth
      ,x.value('y[7]','decimal(28,8)') AS Seventh
FROM (VALUES(CAST('<x><y>' + REPLACE(REPLACE(@Instr,'|','</y><y>'),'^','</y></x><x><y>') + '</y></x>' AS XML)))v(Casted)
CROSS APPLY Casted.nodes('/x') b(x);

--json approach (needs v2016+)
--faster than XML
--We transform your string to a JSON-array with one item per row and use another OPENJSON to retrieve the array's items.
--The WITH-clause brings in implicit pivoting to retrieve the items type-safe as columns:

SELECT b.*
FROM OPENJSON(CONCAT('[["',REPLACE(@Instr,'^','"],["'),'"]]')) a
CROSS APPLY OPENJSON(CONCAT('[',REPLACE(a.[value],'|','","'),']'))
WITH([First]   INT           '$[0]'
    ,[Second]  VARCHAR(100)  '$[1]'
    ,[Third]   DECIMAL(28,8) '$[2]'
    ,[Fourth]  DECIMAL(28,8) '$[3]'
    ,[Fifth]   DECIMAL(28,8) '$[4]'
    ,[Sixth]   DECIMAL(28,8) '$[5]'
    ,[Seventh] DECIMAL(28,8) '$[6]') b;

Both approaches return the same result:

+-------+--------+---------------+---------------+---------------+---------------+------------+
| First | Second | Third         | Fourth        | Fifth         | Sixth         | Seventh    |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 0     | ABC    | 3033.90000000 | 3032.44440000 | 0.00000000    | 0.00000000    | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 1     | DEF    | 3033.25770000 | 3033.05300000 | 3032.08080000 | 0.00000000    | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 2     | JHI    | 3032.83760000 | 3033.25960000 | 3033.22590000 | 3033.32200000 | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+
| 3     | XYZ    | 3032.83760000 | 3032.83760000 | 3032.83760000 | 3032.83760000 | 0.00000000 |
+-------+--------+---------------+---------------+---------------+---------------+------------+

Upvotes: 2

Roger Wolf
Roger Wolf

Reputation: 7722

For a subsequent splitting of pipe-separated substrings you can utilise openjson(), as demonstrated in the example below:

declare @InStr varchar(max) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0';

select p.*
from (
    select ss.value as [RowId], oj.[key] as [ColumnId], oj.value as [ColumnValue]
    from string_split(@InStr,'^') ss
        cross apply openjson('["' + replace(ss.value, '|', '","') + '"]', '$') oj
) q
pivot (
    min(q.ColumnValue)
    for q.[ColumnId] in ([0], [1], [2], [3], [4], [5], [6])
) p;

There are many caveats with this approach, however. The most prominent are:

  1. You need SQL Server 2016 or later, and the database compatibility level needs to be 130 or above;
  2. If your data is of any size worth mentioning (1Mb+), this code might work unacceptably slow. String manipulation is not the strong point of SQL Server.

Personally, I would recommend parsing this string outside of SQL. If it's a flat file you are importing, SSIS dataflow will be much easier to develop and faster to work. If it's an application, then redesign it to pass either a suitable table type, or XML / JSON blob at the very least.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271151

You can use a recursive CTE:

declare @instr varchar(max) = '0|ABC|3033.9|3032.4444|0|0|0^1|DEF|3033.2577|3033.053|3032.0808|0|0^2|JHI|3032.8376|3033.2596|3033.2259|3033.322|0^3|XYZ|3032.8376|3032.8376|3032.8376|3032.8376|0'
;
with cte as (
      select row_number() over (order by (select null)) as id, convert(varchar(max), null) as el, Value + '|' as rest, 0 as lev
      from string_split(@InStr, '^')
      union all
      select id, left(rest, charindex('|', rest) - 1),
             stuff(rest, 1, charindex('|', rest), ''),
             lev + 1
      from cte
      where rest <> ''
     )
select max(case when lev = 1 then el end),
       max(case when lev = 2 then el end),
       max(case when lev = 3 then el end),
       max(case when lev = 4 then el end),
       max(case when lev = 5 then el end),
       max(case when lev = 6 then el end),
       max(case when lev = 7 then el end)
from cte
group by id;

Here is a db<>fiddle.

Unfortunately, you can't safely use string_split() because it does not provide the offset for the values returned.

Upvotes: 1

Related Questions