Dave
Dave

Reputation: 35

How to Split A Value by Delimiter Into Different Rows?

I'm a newbie and have had a problem that I searched these forums and found that I needed to have UDF defined function to split a value because SQL Server does not have a built-in function to split a value based on a delimiter. Based on an article (http://www.codeproject.com/Articles/7938/SQL-User-Defined-Function-to-Parse-a-Delimited-Str), I was able to implement the following but it's not working well from a performance perspective. I have about 75000 records and it's taking a long time to run.

The end result is that I need to take all the purchase values (delimitted by |) and put each value with the name into a table. I may at times have 4 values separated by the | or 3,2 none.

Can someone provide some pointers or a different solution? Maybe do this split using a solution like AutoIT off-line? Or in SSIS?

I'm using SQL Server 2012 and SSIS to load the data.

Help!

Thanks

Dave

BEGIN SET NOCOUNT ON;

DECLARE @Staging_Table TABLE
(
 ACCTID INT IDENTITY(1,1),
 NAME VARCHAR(50),
 PURCHASES VARCHAR(255)
 )

INSERT INTO @Staging_Table (Name, Purchases)
VALUES ('John','Vanilla|Chocolate|Peach')

INSERT INTO @Staging_Table (Name, Purchases)
VALUES ('Jack','Chocolate|Vanilla')

INSERT INTO @Staging_Table (Name, Purchases)
VALUES ('Mary','Peach|Vanilla|Bean')

INSERT INTO @Staging_Table (Name, Purchases)
VALUES ('Peter','Vanilla|Peach')

INSERT INTO @Staging_Table (Name, Purchases)
VALUES ('Jane','Bean|Vanilla|Chocolate|Peach')

-- Get the number of rows in the looping table
 DECLARE @RowCount INT
 SET @RowCount = (SELECT COUNT(ACCTID) FROM @Staging_Table)



-- Declare an iterator
DECLARE @I INT
-- Initialize the iterator
 SET @I = 1


-- Loop through the rows of a table @myTable
WHILE (@I <= @RowCount)
BEGIN
    -- Declare variables to hold the data which we get after looping each record

     DECLARE @NAME VARCHAR(255), @PURCHASES VARCHAR(255)  

     -- Get the data from table and set to variables
     SELECT 
        @NAME = NAME, 
        @PURCHASES = PURCHASES
    FROM 
        @Staging_Table
    WHERE 
        ACCTID = @I

    -- Display the looped data
        SELECT 
            @I,
            @NAME,  
            t.txt_value
            FROM dbo.fn_ParseText2Table(@PURCHASES, '|') as t 

            SET @I = @I  + 1
     END



END

Upvotes: 1

Views: 1766

Answers (3)

Vasily
Vasily

Reputation: 5782

        DECLARE @Tally TABLE (N INT)
        DECLARE @i AS INT = 1
        WHILE @i != 1000
        BEGIN
            INSERT INTO @Tally (N)  VALUES (@i)
            SET @i = @i + 1
        END
        --------------------------------------------------------
        DECLARE @Staging_Table TABLE 
                    (ACCTID INT IDENTITY(1, 1)
                    ,NAME VARCHAR(50)
                    ,PURCHASES VARCHAR(255))
        INSERT INTO @Staging_Table (NAME,Purchases)
        VALUES ('John','Vanilla|Chocolate|Peach')
              ,('Jack','Chocolate|Vanilla')
              ,('Mary','Peach|Vanilla|Bean')
              ,('Peter','Vanilla|Peach')
              ,('Jane','Bean|Vanilla|Chocolate|Peach')
              ,('Jane','Bean Vanilla Chocolate Peach')

    ----------------------------------------------------------
    --1 variant:

        SELECT E.NAME,f3.Purch
        FROM @Staging_Table AS E
        INNER JOIN @Tally AS T ON SUBSTRING('|' + E.Purchases, T.N, 1) = '|' AND T.N < LEN(E.Purchases) + 1
        CROSS APPLY (SELECT string = SUBSTRING(' ' + E.Purchases + '|', T.N + 1, LEN(E.Purchases) + 1)) f1
        CROSS APPLY (SELECT p1 = CHARINDEX('|', string)) f2
        CROSS APPLY (SELECT Purch = SUBSTRING(E.Purchases, T.N, p1-1)) f3
        ORDER BY E.NAME
    ----------------------------------------------------------
    --2 variant:

        SELECT  E.NAME ,
        ( CASE WHEN CHARINDEX('|', S.string) > 0
               THEN LEFT(S.string, CHARINDEX('|', S.string) - 1)
               ELSE string
          END ) AS Purch
        FROM    @Staging_Table AS E
        INNER JOIN @Tally AS T ON SUBSTRING('|' + PURCHASES, T.N, 1) = '|'
                                  AND T.N <= LEN(PURCHASES)
        CROSS APPLY ( SELECT    String = ( CASE WHEN T.N = 1
                                                THEN ( CASE WHEN CHARINDEX('|',
                                                              E.PURCHASES) > 0
                                                            THEN LEFT(E.PURCHASES,
                                                              CHARINDEX('|',
                                                              E.PURCHASES) - 1)
                                                            ELSE E.PURCHASES
                                                       END )
                                                ELSE SUBSTRING(E.PURCHASES,
                                                              T.N, 1000)
                                           END )
                    ) S
        ORDER BY E.NAME

Upvotes: 1

Roger Wolf
Roger Wolf

Reputation: 7722

SQL Server is slow with string manipulation, so any SQL implementation will be horrible, performance-wise. If you are truly into the speed, you will have to create a CLR function - being properly written, it will beat any SQL on this particular task, hands down.

Another option, which might or might not be a viable solution in your case, is SSIS package.

Meanwhile, you can try to use XML for row splitting - it might be even worse than loops, though (not to mention many other deficiencies):

select t.ACCTID, t.NAME, p.c.value('./@v', 'varchar(50)') as [PurchaseItem]
from @Staging_Table t
    cross apply (
        select cast('<r><i v="'
            + replace(t.PURCHASES, '|', '"/><i v="')
            + '" /></r>' as xml
        ) as [XData]
    ) ca
    cross apply ca.XData.nodes('/r[1]/i') p(c);

Upvotes: 0

yxre
yxre

Reputation: 3704

It is bad practice to store data like this, but I will tell you a solution I have found that works. Ideally, you would do this outside of the SQL server. This works fast enough for me.

CREATE TABLE #tempValues (
    name VARCHAR(100), 
    val VARCHAR(100)
);

Create a temporary table to hold the data. The solution is a recursive function takes the left hand side, and then recalls the functions on the right hand side.

CREATE FUNCTION breakStringUp
    @nameTemp VARCHAR(100),
    @stringToSplit VARCHAR(100)
BEGIN
     IF FIND(@stringToSplit, '|') > 0
     BEGIN
        INSERT INTO #tempValues
        ( name, val)
        VALUES
        (@nameTemp, LEFT(@stringToSplit,
                             FIND(@stringToSplit, '|'));

         breakStringUp(@nameTemp,
                 RIGHT( /* Recurse into the function, The index is 
                                                 taken from the right side so invert it
                                                 LEN(@stringToSpit)-FIND(...) */)
                )
      END
      ELSE
      BEGIN
             INSERT INTO #tempValues ..............
      END

The above code is a little complicated. I just wrote the pseudocode, but it should work depending on the schema.

  DECLARE CURSOR FOR ........

Declare a cursor over the row and iterate. Call the function on each name and corresponding VARCHAR.

  SELECT * FROM #tempValues;

And clean up the temporary table

DROP #tempValues;

Upvotes: 0

Related Questions