SchmitzIT
SchmitzIT

Reputation: 9572

TSQL: seemingly empty string

As part of an Azure data warehousing project (Running on Azure SQL Server), I'm implementing some business logic to append a new field to a dimension. I found that one of my fields for a particular row returns some data that I'm having trouble understanding.

The column seems to contain a blank string (I.e. ''), but it actually doesn't get caught when using a REPLACE function.

Here's my code:

SELECT
    <MyColumn> as MyColumn
    , LEN (<MyColumn>)
    , REPLACE(<MyColumn> , '', 'foo')
    , ISNULL(<MyColumn> , 'bar')
    , UNICODE(<MyColumn>)
    , ASCII(<MyColumn>)
FROM 
    <MyTable>

And the output in SSMS:

SSMS Output

As can be seen, the field does not contain a NULL value, nor an empty string, and neither ASCII or UNICODE functions return a valid value. I also tried checking for a space character (' '), but seeing the LEN() of the field is 0, that didn't work.

I ended up sidestepping the issue by using a SELECT CASE WHEN LEN(<MyColumn>) = 0 THEN 'N/A' END, but am curious what value is actually in here. Any ideas?

Thanks!

Upvotes: 0

Views: 223

Answers (3)

Thom A
Thom A

Reputation: 95827

REPLACE(<MyColumn> , '', 'foo') will do nothing. REPLACE replacesthe characters within the second parameter with those in the third. Your 2nd parameter is a 0 length string, so contains no characters to replace.

LEN doesn't include trailing spaces, so a string with the value ' ' will return 0 for LEN. If you want to include trailing spaces, use DATALENGTH. IF DATALENGTH(<MyColumn>) returns a value greater than 0 it contains spaces, if not, then it has the value ''. Using NULLIF(<MyColumn>,'') will return NULL too for any string that is zero lengtt, or only contains spaces.

The fact, however, that both UNICODE and ASCII return NULL suggests that column does have the value '', as a zero length string cannot value an ASCII/UNICODE value. This can be evidenced with the below sample data:

SELECT V.YourString,
       V.StringDescription,
       LEN(V.YourString) AS StringLEN, --Will return 0 for all rows
       DATALENGTH(V.YourString) AS StringDATALENGTH, --Will return varying results,
       REPLACE(V.YourString, '', 'foo') AS StringREPLACEBlank, --Will do nothing, there are no characters to replace
       REPLACE(V.YourString, ' ', 'foo') AS StringREPLACEWhiteSpace, --Varying lengths of foo
       ASCII(V.YourString) AS StringASCII, --Will return NULL for Empty String
       NULLIF(V.YourString,'') AS StringNULLIF --Will return NULL for Empty String
FROM (VALUES('','Empty String'),
            (' ','One White Space'),
            ('   ','Multiple White Space'))V(YourString, StringDescription)

which returns the below dataset:

YourString StringDescription    StringLEN   StringDATALENGTH StringREPLACEBlank StringREPLACEWhiteSpace StringASCII StringNULLIF
---------- -------------------- ----------- ---------------- ------------------ ----------------------- ----------- ------------
           Empty String         0           0                                                           NULL        NULL
           One White Space      0           1                                   foo                     32          NULL
           Multiple White Space 0           3                                   foofoofoo               32          NULL

The reason NULLIF(Column,'') returns NULL for a both zero length strings and strings that only contain spaces is because it is a short hand CASE expression. NULLIF(Column,'') is parsed as CASE WHEN Column = '' THEN NULL ELSE Column END. When comparing strings with the = operator, a string with trailing spaces compared to one without will be seen as the same value. For example ' ' = '' would be true. This means that if Column had a value of ' ', then Column = '' would also be True, and NULL would be returned by NULLIF.

Upvotes: 2

Jakob Busk S&#248;rensen
Jakob Busk S&#248;rensen

Reputation: 6091

As mentioned, I don't think you can replace nothing with something.

However, there is a lot of different 'invisible' characters, other than space. You can try to copy-paste it to a website which can identify unicode characters (like this one).

Upvotes: 0

Neophear
Neophear

Reputation: 542

REPLACE searches for a substring in the original string. As you are searching for nothing, nothing is replaced.

More info can be found here: https://www.w3schools.com/sql/func_sqlserver_replace.asp

Upvotes: 2

Related Questions