Christoph
Christoph

Reputation: 4401

Replace duplicate spaces with a single space in T-SQL

I need to ensure that a given field does not have more than one space (I am not concerned about all white space, just space) between characters.

So

'single    spaces   only'

needs to be turned into

'single spaces only'

The below will not work

select replace('single    spaces   only','  ',' ')

as it would result in

'single  spaces  only'

I would really prefer to stick with native T-SQL rather than a CLR based solution.

Thoughts?

Upvotes: 147

Views: 189540

Answers (17)

Mudit Gulgulia
Mudit Gulgulia

Reputation: 1266

The provided solutions in this Question are helpful, but if someone is looking for a solution with Amazon Redshift, they can use it.

CREATE OR REPLACE FUNCTION public.remove_extra_spaces(input character varying)
RETURNS character varying LANGUAGE plpythonu
IMMUTABLE
AS $$
    import re

    try:
        cleaned = re.sub(r'\s+', ' ', input_string).strip()
        return cleaned
    except:
        return None
$$

Upvotes: -1

cmxl
cmxl

Reputation: 775

With the "latest" SQL Server versions (Compatibility level 130) you could also use string_split and string_agg.

string_split can return an ordinal column when provided with a third argument. (https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver16#enable_ordinal). So we can preserve the order of the string_split.

Using a common table expression:

with cte(value, ordinal) as (select value, ordinal from string_split('  a b   c d     e     ', ' ', 1)  where value <> '')
select string_agg(value, ' ') within group(order by ordinal) from cte

a b c d e results in a b c d e

Edit:

Thanks @T N for the hint about garanteed ordering. Statement has been corrected.

Upvotes: 3

T N
T N

Reputation: 10101

(I am adding this answer because the other split/agg answers do not guarantee the word order in the result. This one does. That said, I still suggest looking at the accepted answer that uses a sequence of replace() functions.)

For SQL Server 2022 and later, a combination of STRING_SPLIT() and STRING_AGG() can be used.

It is critical that the ordinal value from STRING_SPLIT() be fed into the WITHIN GROUP clause of the STRING_AGG(), otherwise the word order in the result is not guaranteed. (Prior to SQL Server 2022, the STRING_SPLIT() function did not support the enable_ordinal option, so this solution cannot be used.

SELECT STRING_AGG(s.value, ' ') WITHIN GROUP(ORDER BY S.ordinal) AS Result
FROM STRING_SPLIT('single    spaces   only', ' ', 1) S
WHERE S.value <> ''

Note that the above will also remove any leading or trailing spaces from the original string. Empty strings and strings containing only spaces map to null.

See this db<>fiddle for a demo.

Upvotes: 1

Sabatino Ognibene
Sabatino Ognibene

Reputation: 171

This is the function I use:

create function [dbo].[String_Trim](@S nvarchar(max))
Returns nvarchar(max)
Begin
    /*This function removes all double spaces as well as leading/trailing spaces from a string*/
    Return ltrim(rtrim(replace(replace(replace(@S,' ','†‡'),'‡†',''),'†‡',' ')))
End

A similar answer above used '<>', rather than '†‡' but '<>' is a common expression used in coding to mean "not equal to". I'd say it's bad practice to use something so common as a placeholder to be overwritten, potentially removing something meant to be preserved.

I've never seen '†‡' used anywhere for anything. It is certainly not a coding expression. For those reasons, I would say it's safer. I may be one of the few people that uses code to edit, store, retrieve, or process code blocks but replacing '<>' would be problematic in some real-world scenarios for me. I also include the ltrim and rtrim functions which the above similar answers did not have.

Other answers provided have multiple lines, first removing 16 spaces for example. The answer I'm providing is simple, safe, and executes extremely fast.

The string_agg function looks promising but the system I'm using is still running SQL Server 2016 and string_agg was introduced publicly in SQL server 2017.

Upvotes: 0

Lekhnath Pandey
Lekhnath Pandey

Reputation: 9

Please Find below code

select trim(string_agg(value,' ')) from STRING_SPLIT('  single    spaces   only  ',' ')
where value<>' '

This worked for me.. Hope this helps...

Upvotes: 0

Arulmouzhi
Arulmouzhi

Reputation: 2254

Just Adding Another Method-

Replacing Multiple Spaces with Single Space WITHOUT Using REPLACE in SQL Server-

DECLARE @TestTable AS TABLE(input VARCHAR(MAX));

INSERT INTO @TestTable VALUES
('HAPPY         NEWYEAR     2020'),
('WELCOME       ALL     !');

SELECT
    CAST('<r><![CDATA[' + input + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)')
    AS Expected_Result
FROM @TestTable;

--OUTPUT
/*
Expected_Result
HAPPY NEWYEAR 2020
WELCOME ALL !
*/

Upvotes: 5

Eralper
Eralper

Reputation: 6622

I use FOR XML PATH solution to replace multiple spaces into single space

The idea is to replace spaces with XML tags Then split XML string into string fragments without XML tags Finally concatenating those string values by adding single space characters between two

Here is how final UDF function can be called

select dbo.ReplaceMultipleSpaces('   Sample   text  with  multiple  space     ')

Upvotes: -2

Adam Silenko
Adam Silenko

Reputation: 3108

It can be done recursively via the function:

CREATE FUNCTION dbo.RemSpaceFromStr(@str VARCHAR(MAX)) RETURNS VARCHAR(MAX) AS
BEGIN
  RETURN (CASE WHEN CHARINDEX('  ', @str) > 0 THEN
    dbo.RemSpaceFromStr(REPLACE(@str, '  ', ' ')) ELSE @str END);
END

then, for example:

SELECT dbo.RemSpaceFromStr('some   string    with         many     spaces') AS NewStr

returns:

NewStr
some string with many spaces

Or the solution based on method described by @agdk26 or @Neil Knight (but safer)
both examples return output above:

SELECT REPLACE(REPLACE(REPLACE('some   string    with         many     spaces'
  , '  ', ' ' + CHAR(7)), CHAR(7) + ' ', ''), ' ' + CHAR(7), ' ') AS NewStr 
--but it remove CHAR(7) (Bell) from string if exists...

or

SELECT REPLACE(REPLACE(REPLACE('some   string    with         many     spaces'
  , '  ', ' ' + CHAR(7) + CHAR(7)), CHAR(7) + CHAR(7) + ' ', ''), ' ' + CHAR(7) + CHAR(7), ' ') AS NewStr
--but it remove CHAR(7) + CHAR(7) from string

How it works: enter image description here

Caution:
Char/string used to replace spaces shouldn't exist on begin or end of string and stand alone.

Upvotes: 7

Neil Knight
Neil Knight

Reputation: 48567

Even tidier:

select string = replace(replace(replace(' select   single       spaces',' ','<>'),'><',''),'<>',' ')

Output:

select single spaces

Upvotes: 449

karthika harisankar
karthika harisankar

Reputation: 57

You can try this:

select Regexp_Replace('single    spaces   only','( ){2,}', ' ') from dual;

Upvotes: 0

agdk26
agdk26

Reputation: 137

This is the solution via multiple replace, which works for any strings (does not need special characters, which are not part of the string).

declare @value varchar(max)
declare @result varchar(max)
set @value = 'alpha   beta gamma  delta       xyz'

set @result = replace(replace(replace(replace(replace(replace(replace(
  @value,'a','ac'),'x','ab'),'  ',' x'),'x ',''),'x',''),'ab','x'),'ac','a')

select @result -- 'alpha beta gamma delta xyz'

Upvotes: 0

Robert Petolillo
Robert Petolillo

Reputation: 51

Here is a simple function I created for cleaning any spaces before or after, and multiple spaces within a string. It gracefully handles up to about 108 spaces in a single stretch and as many blocks as there are in the string. You can increase that by factors of 8 by adding additional lines with larger chunks of spaces if you need to. It seems to perform quickly and has not caused any problems in spite of it's generalized use in a large application.

CREATE FUNCTION [dbo].[fnReplaceMultipleSpaces] (@StrVal AS VARCHAR(4000)) 
RETURNS VARCHAR(4000) 
AS 
BEGIN

    SET @StrVal = Ltrim(@StrVal)
    SET @StrVal = Rtrim(@StrVal)

    SET @StrVal = REPLACE(@StrVal, '                ', ' ')  -- 16 spaces
    SET @StrVal = REPLACE(@StrVal, '        ', ' ')  -- 8 spaces
    SET @StrVal = REPLACE(@StrVal, '    ', ' ')  -- 4 spaces
    SET @StrVal = REPLACE(@StrVal, '  ', ' ')  -- 2 spaces
    SET @StrVal = REPLACE(@StrVal, '  ', ' ')  -- 2 spaces (for odd leftovers)

RETURN @StrVal

END

Upvotes: 5

Limey
Limey

Reputation: 2770

Found this while digging for an answer:

SELECT REPLACE(
        REPLACE(
             REPLACE(
                LTRIM(RTRIM('1 2  3   4    5     6'))
            ,'  ',' '+CHAR(7))
        ,CHAR(7)+' ','')
    ,CHAR(7),'') AS CleanString
where charindex('  ', '1 2  3   4    5     6') > 0

The full answer (with explanation) was pulled from: http://techtipsbysatish.blogspot.com/2010/08/sql-server-replace-multiple-spaces-with.html

On second look, seems to be just a slightly different version of the selected answer.

Upvotes: 0

HLGEM
HLGEM

Reputation: 96600

update mytable
set myfield = replace (myfield, '  ',  ' ')
where charindex('  ', myfield) > 0 

Replace will work on all the double spaces, no need to put in multiple replaces. This is the set-based solution.

Upvotes: 7

BradC
BradC

Reputation: 39986

If you know there won't be more than a certain number of spaces in a row, you could just nest the replace:

replace(replace(replace(replace(myText,'  ',' '),'  ',' '),'  ',' '),'  ',' ')

4 replaces should fix up to 16 consecutive spaces (16, then 8, then 4, then 2, then 1)

If it could be significantly longer, then you'd have to do something like an in-line function:

CREATE FUNCTION strip_spaces(@str varchar(8000))
RETURNS varchar(8000) AS
BEGIN 
    WHILE CHARINDEX('  ', @str) > 0 
        SET @str = REPLACE(@str, '  ', ' ')

    RETURN @str
END

Then just do

SELECT dbo.strip_spaces(myText) FROM myTable

Upvotes: 25

JohnFx
JohnFx

Reputation: 34909

This is somewhat brute force, but will work

CREATE FUNCTION stripDoubleSpaces(@prmSource varchar(max)) Returns varchar(max)
AS 
BEGIN
    WHILE (PATINDEX('%  %', @prmSource)>0)
     BEGIN
        SET @prmSource = replace(@prmSource  ,'  ',' ')
     END

    RETURN @prmSource
END

GO

-- Unit test -- 
PRINT dbo.stripDoubleSpaces('single    spaces   only')

single spaces only

Upvotes: 7

Mutation Person
Mutation Person

Reputation: 30520

This would work:

declare @test varchar(100)
set @test = 'this   is  a    test'

while charindex('  ',@test  ) > 0
begin
   set @test = replace(@test, '  ', ' ')
end

select @test

Upvotes: 36

Related Questions