Reputation: 65
I'am currently developing a program and i want to write a function which is accept a value in following format
"AAAA BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH"
I want to replace the spaces in above with "_" and need a output as showed following format (please note that this should happen only for string which is inside double quotes)
"AAAA_BBBB" CCCC DDDD EEEE "FFFF_GGGG_HHHH"
So I got the correct answer to this in my previous question
DECLARE @In VARCHAR(50) = '"AAAA BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH"'
DECLARE @Quote SMALLINT = -1, @Index INT = 1, @Char CHAR(1)
WHILE @Index <= LEN(@In) BEGIN
SET @Char = SUBSTRING(@In, @Index, 1)
IF @Char = '"'
SET @Quote = @Quote * -1
IF @Char = ' ' AND @Quote > 0
SET @In = STUFF(@In, @Index, 1, '_')
SET @Index = @Index + 1
END
PRINT @In
and now I want to modify this with the following requirement
with the required format you can note that, underscore should be replace with the spaces which is only inside the enclosed double quotes, but now i get a wrong output if the user input was as following
"AAAA BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc
it gives output as
"AAAA_BBBB" CCCC DDDD EEEE "FFFF_GGGG_HHHH_cccc
but it must give output only for the first enclosed double quote string,because other one is not closed, expected output should be as follows
"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc
Upvotes: 2
Views: 281
Reputation: 13209
Here are the modifications I would make
Code
DECLARE @In VARCHAR(50) = '"AAAA BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH IIIII"'
DECLARE @Quote SMALLINT = 0, @Index INT = 1, @Char CHAR(1)
WHILE @Quote < 2 AND @Index <= LEN(@In) BEGIN
SET @Char = SUBSTRING(@In, @Index, 1)
IF @Char = '"'
SET @Quote = @Quote + 1
IF @Char = ' ' AND @Quote = 1
SET @In = STUFF(@In, @Index, 1, '_')
SET @Index = @Index + 1
END
PRINT @In
Output
"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH IIIII"
Upvotes: 2
Reputation: 69819
I'd approach this differently to your last answer, and avoid a loop. The first step is to generate a list of 50 sequential numbers (this number should match the length of your varchar). This will allow you to split the varchar into rows which are much easier to deal with:
DECLARE @s VARCHAR(50) = '"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc';
WITH Numbers (Number) AS -- 50 ROWS
( SELECT TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
)
SELECT Number,
Letter = SUBSTRING(@s, Number, 1)
FROM Numbers;
This gives something like:
Number | Letter
-------+--------
1 | "
2 | A
3 | A
In a similar fashion you can use this to get 'Groups' of quotation marks:
DECLARE @s VARCHAR(50) = '"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc';
WITH Numbers (Number) AS -- 50 ROWS
( SELECT TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
)
SELECT Number,
GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
FROM Numbers
WHERE SUBSTRING(@s, Number, 1) = '"'
Which gives:
Number | GroupID
-------+--------
1 | 1
11 | 1
29 | 2
You can the group by GroupID
to get the start and end position of your groups of quotation marks:
DECLARE @s VARCHAR(50) = '"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc';
WITH Numbers (Number) AS -- 50 ROWS
( SELECT TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Grouping AS
( SELECT Number,
GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
FROM Numbers
WHERE SUBSTRING(@s, Number, 1) = '"'
)
SELECT StartPosition = MIN(Number),
EndPosition = MAX(Number)
FROM Grouping
GROUP BY GroupID
HAVING COUNT(*) = 2 -- ONLY INCLUDE WHERE THERE IS A START AND AN END
Now you can take this back to your original split, and replace any spaces in this range with an underscore:
DECLARE @s VARCHAR(50) = '"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc';
WITH Numbers (Number) AS -- 50 ROWS
( SELECT TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Letters AS
( SELECT Number,
Letter = SUBSTRING(@s, Number, 1)
FROM Numbers
), Grouping AS
( SELECT Number,
GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
FROM Letters
WHERE Letter = '"'
), Groups AS
( SELECT StartPosition = MIN(Number),
EndPosition = MAX(Number)
FROM Grouping
GROUP BY GroupID
HAVING COUNT(*) = 2
)
SELECT l.Number,
Letter = CASE WHEN g.StartPosition IS NOT NULL AND l.Letter = ' ' THEN '_' ELSE l.Letter END
FROM Letters AS l
LEFT JOIN Groups AS g
ON g.StartPosition <= l.Number
AND g.EndPosition >= l.Number;
Which gives:
Number | Letter
-------+--------
1 | "
2 | A
3 | A
4 | A
5 | A
6 | _
7 | B
8 | B
9 | B
10 | B
11 | "
12 |
Then you can rebuild your original string using FOR XML concatenation:
DECLARE @s VARCHAR(50) = '"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc';
WITH Numbers (Number) AS -- 50 ROWS
( SELECT TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Letters AS
( SELECT Number,
Letter = SUBSTRING(@s, Number, 1)
FROM Numbers
), Grouping AS
( SELECT Number,
GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
FROM Letters
WHERE Letter = '"'
), Groups AS
( SELECT StartPosition = MIN(Number),
EndPosition = MAX(Number)
FROM Grouping
GROUP BY GroupID
HAVING COUNT(*) = 2
)
SELECT String = ( SELECT CASE WHEN g.StartPosition IS NOT NULL AND l.Letter = ' ' THEN '_' ELSE l.Letter END
FROM Letters AS l
LEFT JOIN Groups AS g
ON g.StartPosition <= l.Number
AND g.EndPosition >= l.Number
ORDER BY l.Number
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(50)');
Finally, I would enclose the logic in an inline table valued function:
CREATE FUNCTION dbo.YourFunctionName (@s VARCHAR(50))
RETURNS TABLE
AS
RETURN
( WITH Numbers (Number) AS -- 50 ROWS
( SELECT TOP (LEN(@s)) ROW_NUMBER() OVER(ORDER BY N1.N)
FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) N1 (N)
CROSS JOIN (VALUES (1), (1), (1), (1), (1)) N2 (N)
), Letters AS
( SELECT Number,
Letter = SUBSTRING(@s, Number, 1)
FROM Numbers
), Grouping AS
( SELECT Number,
GroupID = CEILING(ROW_NUMBER() OVER(ORDER BY Number) / 2.0)
FROM Letters
WHERE Letter = '"'
), Groups AS
( SELECT StartPosition = MIN(Number),
EndPosition = MAX(Number),
Letter = ' '
FROM Grouping
GROUP BY GroupID
HAVING COUNT(*) = 2
)
SELECT String = ( SELECT CASE WHEN l.Letter = g.Letter THEN '_' ELSE l.Letter END
FROM Letters AS l
LEFT JOIN Groups AS g
ON g.StartPosition <= l.Number
AND g.EndPosition >= l.Number
ORDER BY l.Number
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(50)')
);
Which can the be called as:
SELECT t.s,
Replaced = (SELECT String FROM dbo.YourFunctionName (t.s))
FROM (VALUES
('"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc'),
('"AAAA_BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH cccc"'),
('"AAAA_BBBB" CCCC "DDDD EEEE" "FFFF GGGG HHHH cccc'),
('"AAAA_BBBB" CCCC "DDDD EEEE" "FFFF GGGG HHHH cccc"')
) AS t (s);
This may seem significantly more complicated, but hopefully I have explained the logic properly, and the big advantage with using an inline table valued function is that the definition is expanded out into the main query, unlike scalar udfs which have a cursor like execution.
Upvotes: 1
Reputation: 1051
Please refer this code. I think this code will give some hint. This code is not the tested code. I am posting answer Because at least you get some hint to find your solution.
DECLARE @In VARCHAR(50) = '"AAAA BBBB" CCCC DDDD EEEE "FFFF GGGG HHHH IIIII'
DECLARE @Quote SMALLINT = -1, @Index INT = 1, @Char CHAR(1) ,@New_Char CHAR(100)
WHILE @Index <= LEN(@In) BEGIN
SET @Char = SUBSTRING(@In, @Index, 1)
IF @Char = '"'
SET @Quote = @Quote * -1
SET @New_Char = SUBSTRING(@In, @Index + 1,213)
IF @Char = ' ' AND @Quote > 0 AND @New_Char like '%"%'
SET @In = STUFF(@In, @Index, 1, '_')
SET @Index = @Index + 1
END
PRINT @In
Upvotes: 2