Jay Nani
Jay Nani

Reputation: 105

How Parse data between unknown strings

I have a text Column with data as below

RawDataColumn
THANK 1000 1500 1740 1  YOU 1000 1740 1820 1  ABC 1000 1820 1960 1  XYZABC 1000 1960 2240 1  DFGS 1000 2240 2380 1  THINK 1000 2380 2480 1

I want to parse the Text column to multple columns as below

Word   A     B    C   D
THANK 1000 1500 1740 1 
YOU   1000 1740 1820 1
ABC   1000 1820 1960 1
XYZA  1000 1960 2240 1
DFGS 1000 2240 2380 1
THINK 1000 2380 2480 1

SQL Server Version : SQL Server 2016

Upvotes: 0

Views: 75

Answers (3)

Jay Nani
Jay Nani

Reputation: 105

create table test (RawDataColumn varchar(2000))
insert into test values('THANK 1000 1500 1740 1  YOU 1000 1740 1820 1  ABC 1000 1820 1960 1  XYZABC 1000 1960 2240 1  DFGS 1000 2240 2380 1  THINK 1000 2380 2480 1')
;with mycte as (

Select value as val1 from test
Cross apply String_split( replace(RawDataColumn,'  ','|'),'|')
)




Select    Max(Case when rn=1 then value end) word
, Max(Case when rn=2 then value end) A
, Max(Case when rn=3 then value end) B
, Max(Case when rn=4 then value end) C
, Max(Case when rn=5 then value end) D
from mycte  s
Cross apply (
SELECT ss.[value], ROW_NUMBER() OVER (PARTITION BY s.val1 ORDER BY s.val1 ) AS rn
FROM string_Split(val1,' ') AS ss
) as d
Group by s.val1


drop table test

Upvotes: 0

jigga
jigga

Reputation: 614

In regards to my comment, this is one way to do this (not my best work :D )

CREATE FUNCTION dbo.Split
(
    @string nvarchar(max)
)
RETURNS @result TABLE (Word nvarchar(max), A int, B int, C int, D int)
AS
 BEGIN

    DECLARE @sub nvarchar(max)
    DECLARE @Word nvarchar(max)
    DECLARE @A int
    DECLARE @B int
    DECLARE @C int
    DECLARE @D int

    IF @string IS NULL 
     BEGIN
        INSERT INTO @result VALUES(NULL, NULL, NULL, NULL, NULL)
     END

    ELSE
     BEGIN
        WHILE LEN(@string) > 0
         BEGIN
            IF @string LIKE '% [A-Z]%'
             BEGIN
                SET @sub = SUBSTRING(@string, 0, PATINDEX('% [A-Z]%',  @string))
             END
            ELSE
             BEGIN
                SET @sub = @string
             END

            SET @string = LTRIM(RTRIM(RIGHT(@string, LEN(@string) - LEN(@sub))))
            SET @Word = LEFT(@sub, CHARINDEX(' ', @sub) - 1)

            SET @sub = SUBSTRING(@sub, CHARINDEX(' ', @sub) + 1, LEN(@sub))
            SET @A = LEFT(@sub, CHARINDEX(' ', @sub))

            SET @sub = SUBSTRING(@sub, CHARINDEX(' ', @sub) + 1, LEN(@sub))
            SET @B = LEFT(@sub, CHARINDEX(' ', @sub))

            SET @sub = SUBSTRING(@sub, CHARINDEX(' ', @sub) + 1, LEN(@sub))
            SET @C = LEFT(@sub, CHARINDEX(' ', @sub))

            SET @D = SUBSTRING(@sub, CHARINDEX(' ', @sub) + 1, LEN(@sub))

            INSERT INTO @result VALUES(@Word, @A, @B, @C, @D)
         END
     END
    RETURN  
 END

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522471

SQL Server is not the best place to handle such text scrubbing requirements. I will give a Python script which can generate a text file with clearly defined lines:

inp = "THANK 1000 1500 1740 1  YOU 1000 1740 1820 1  ABC 1000 1820 1960 1  XYZABC 1000 1960 2240 1  DFGS 1000 2240 2380 1  THINK 1000 2380 2480 1"
lines = re.findall(r'\S+ \d+ \d+ \d+ \d+', inp)
f = open('output.txt', 'w')
for line in lines:
    f.write(line + '\n')
f.close()

Now the output file output.txt should have proper lines of data, separated by space for each column. You may try a similar approach with really any other language, and then import into SQL Server.

Upvotes: 1

Related Questions