user2749605
user2749605

Reputation: 65

SQL Character Replace function extend

I'am currently developing a program and i want to write a function which is accept a value in following format

"AAAA BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH"

I want to replace the spaces in above with "_" and need a output as showed following format (please note that this should happen only for string which is inside double quotes)

"AAAA_BBBB"  CCCC DDDD EEEE "FFFF_GGGG_HHHH"

So I got the correct answer to this in my previous question

DECLARE @In VARCHAR(50) = '"AAAA BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH"'

DECLARE @Quote SMALLINT = -1, @Index INT = 1, @Char CHAR(1)
WHILE @Index <= LEN(@In) BEGIN
    SET @Char = SUBSTRING(@In, @Index, 1)
    IF @Char = '"'
        SET @Quote = @Quote * -1
    IF @Char = ' ' AND @Quote > 0
        SET @In = STUFF(@In, @Index, 1, '_')
    SET @Index = @Index + 1
END

PRINT @In

and now I want to modify this with the following requirement

with the required format you can note that, underscore should be replace with the spaces which is only inside the enclosed double quotes, but now i get a wrong output if the user input was as following

"AAAA BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc

it gives output as

"AAAA_BBBB"  CCCC DDDD EEEE "FFFF_GGGG_HHHH_cccc

but it must give output only for the first enclosed double quote string,because other one is not closed, expected output should be as follows

"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc

Upvotes: 2

Views: 281

Answers (3)

Jason W
Jason W

Reputation: 13209

Here are the modifications I would make

  • Once all replacements have been made, exit existing the loop (added WHILE loop condition and how logic for the quote recognition behaves to facilitate the exit)
  • Remove dependencies on any length (CHAR(100), SUBSTRING(...,213))

Code

DECLARE @In VARCHAR(50) = '"AAAA BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH IIIII"'

DECLARE @Quote SMALLINT = 0, @Index INT = 1, @Char CHAR(1)
WHILE @Quote < 2 AND @Index <= LEN(@In) BEGIN
    SET @Char = SUBSTRING(@In, @Index, 1)
    IF @Char = '"'
        SET @Quote = @Quote + 1
    IF @Char = ' ' AND @Quote = 1
        SET @In = STUFF(@In, @Index, 1, '_')
    SET @Index = @Index + 1
END

PRINT @In

Output

"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH IIIII"

Upvotes: 2

GarethD
GarethD

Reputation: 69819

I'd approach this differently to your last answer, and avoid a loop. The first step is to generate a list of 50 sequential numbers (this number should match the length of your varchar). This will allow you to split the varchar into rows which are much easier to deal with:

DECLARE @s VARCHAR(50) = '"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc';

WITH Numbers (Number) AS -- 50 ROWS
(   SELECT  TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
    FROM    (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
            CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
)
SELECT  Number,
        Letter = SUBSTRING(@s, Number, 1)
FROM    Numbers;

This gives something like:

Number | Letter
-------+--------
   1   |   "
   2   |   A
   3   |   A

In a similar fashion you can use this to get 'Groups' of quotation marks:

DECLARE @s VARCHAR(50) = '"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc';

WITH Numbers (Number) AS -- 50 ROWS
(   SELECT  TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
    FROM    (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
            CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
)
SELECT  Number,
        GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
FROM    Numbers 
WHERE   SUBSTRING(@s, Number, 1) = '"'

Which gives:

Number | GroupID
-------+--------
   1   |    1
  11   |    1
  29   |    2

You can the group by GroupID to get the start and end position of your groups of quotation marks:

DECLARE @s VARCHAR(50) = '"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc';

WITH Numbers (Number) AS -- 50 ROWS
(   SELECT  TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
    FROM    (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
            CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Grouping AS
(   SELECT  Number,
            GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
    FROM    Numbers 
    WHERE   SUBSTRING(@s, Number, 1) = '"'
)
SELECT  StartPosition = MIN(Number),
        EndPosition = MAX(Number)
FROM    Grouping
GROUP BY GroupID
HAVING COUNT(*) = 2 -- ONLY INCLUDE WHERE THERE IS A START AND AN END

Now you can take this back to your original split, and replace any spaces in this range with an underscore:

DECLARE @s VARCHAR(50) = '"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc';

WITH Numbers (Number) AS -- 50 ROWS
(   SELECT  TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
    FROM    (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
            CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Letters AS
(   SELECT  Number,
            Letter = SUBSTRING(@s, Number, 1)
    FROM    Numbers
), Grouping AS
(   SELECT  Number,
            GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
    FROM    Letters 
    WHERE   Letter = '"'
), Groups AS
(   SELECT  StartPosition = MIN(Number),
            EndPosition = MAX(Number)
    FROM    Grouping
    GROUP BY GroupID
    HAVING COUNT(*) = 2
)
SELECT  l.Number,
        Letter = CASE WHEN g.StartPosition IS NOT NULL AND l.Letter = ' ' THEN '_' ELSE l.Letter END
FROM    Letters AS l
        LEFT JOIN Groups AS g
            ON g.StartPosition <= l.Number
            AND g.EndPosition >= l.Number;

Which gives:

Number | Letter
-------+--------
   1   |   "
   2   |   A
   3   |   A
   4   |   A
   5   |   A
   6   |   _
   7   |   B
   8   |   B
   9   |   B
  10   |   B
  11   |   "
  12   |   

Then you can rebuild your original string using FOR XML concatenation:

DECLARE @s VARCHAR(50) = '"AAAA_BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH cccc';

WITH Numbers (Number) AS -- 50 ROWS
(   SELECT  TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
    FROM    (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
            CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Letters AS
(   SELECT  Number,
            Letter = SUBSTRING(@s, Number, 1)
    FROM    Numbers
), Grouping AS
(   SELECT  Number,
            GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
    FROM    Letters 
    WHERE   Letter = '"'
), Groups AS
(   SELECT  StartPosition = MIN(Number),
            EndPosition = MAX(Number)
    FROM    Grouping
    GROUP BY GroupID
    HAVING COUNT(*) = 2
)
SELECT  String = (  SELECT  CASE WHEN g.StartPosition IS NOT NULL AND l.Letter = ' ' THEN '_' ELSE l.Letter END
                    FROM    Letters AS l
                            LEFT JOIN Groups AS g
                                ON g.StartPosition <= l.Number
                                AND g.EndPosition >= l.Number
                    ORDER BY l.Number
                    FOR XML PATH(''), TYPE
                ).value('.', 'VARCHAR(50)');

Finally, I would enclose the logic in an inline table valued function:

CREATE FUNCTION dbo.YourFunctionName (@s VARCHAR(50))
RETURNS TABLE
AS
RETURN
(   WITH Numbers (Number) AS -- 50 ROWS
    (   SELECT  TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
        FROM    (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
                CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
    ), Letters AS
    (   SELECT  Number,
                Letter = SUBSTRING(@s, Number, 1)
        FROM    Numbers
    ), Grouping AS
    (   SELECT  Number,
                GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
        FROM    Letters 
        WHERE   Letter = '"'
    ), Groups AS
    (   SELECT  StartPosition = MIN(Number),
                EndPosition = MAX(Number),
                Letter = ' '
        FROM    Grouping
        GROUP BY GroupID
        HAVING COUNT(*) = 2
    )
    SELECT  String = (  SELECT  CASE WHEN l.Letter = g.Letter THEN '_' ELSE l.Letter END
                        FROM    Letters AS l
                                LEFT JOIN Groups AS g
                                    ON g.StartPosition <= l.Number
                                    AND g.EndPosition >= l.Number
                        ORDER BY l.Number
                        FOR XML PATH(''), TYPE
                    ).value('.', 'VARCHAR(50)')
);

Which can the be called as:

SELECT  t.s,
        Replaced = (SELECT String FROM dbo.YourFunctionName (t.s))
FROM    (VALUES 
            ('"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc'),
            ('"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc"'),
            ('"AAAA_BBBB" CCCC "DDDD EEEE" "FFFF GGGG HHHH cccc'),
            ('"AAAA_BBBB" CCCC "DDDD EEEE" "FFFF GGGG HHHH cccc"')
        ) AS t (s);

This may seem significantly more complicated, but hopefully I have explained the logic properly, and the big advantage with using an inline table valued function is that the definition is expanded out into the main query, unlike scalar udfs which have a cursor like execution.

Upvotes: 1

Hardik Parmar
Hardik Parmar

Reputation: 1051

Please refer this code. I think this code will give some hint. This code is not the tested code. I am posting answer Because at least you get some hint to find your solution.

DECLARE @In VARCHAR(50) = '"AAAA BBBB"  CCCC DDDD EEEE "FFFF GGGG HHHH IIIII'

DECLARE @Quote SMALLINT = -1, @Index INT = 1, @Char CHAR(1) ,@New_Char CHAR(100)
WHILE @Index <= LEN(@In) BEGIN
    SET @Char = SUBSTRING(@In, @Index, 1)
    IF @Char = '"'
        SET @Quote = @Quote * -1
        SET @New_Char = SUBSTRING(@In, @Index + 1,213)
    IF @Char = ' ' AND @Quote > 0 AND @New_Char like '%"%'
        SET @In = STUFF(@In, @Index, 1, '_')
    SET @Index = @Index + 1
END

PRINT @In

Upvotes: 2

Related Questions