cl0rkster
cl0rkster

Reputation: 369

Why does the EXCEPT clause trim whitespace at the end of text?

I read through the documentation for the SqlServer EXCEPT operator and I see no mention of explicit trimming of white space at the end of a string. However, when running:

SELECT 'Test'
EXCEPT
SELECT 'Test '

no results are returned. Can anyone explain this behavior or how to avoid it when using EXCEPT?

Upvotes: 0

Views: 241

Answers (1)

Ghost
Ghost

Reputation: 2226

ANSI SQL-92 requires strings to be the same length before comparing and the pad character is a space.

See https://support.microsoft.com/en-us/help/316626/inf-how-sql-server-compares-strings-with-trailing-spaces for more information

In the ANSI standard (accessed here section 8.2 )

3) The comparison of two character strings is determined as follows:

        a) If the length in characters of X is not equal to the length
          in characters of Y, then the shorter string is effectively
          replaced, for the purposes of comparison, with a copy of
          itself that has been extended to the length of the longer
          string by concatenation on the right of one or more pad char-
          acters, where the pad character is chosen based on CS. If
          CS has the NO PAD attribute, then the pad character is an
          implementation-dependent character different from any char-
          acter in the character set of X and Y that collates less
          than any string under CS. Otherwise, the pad character is a
          <space>.

        b) The result of the comparison of X and Y is given by the col-
          lating sequence CS.

        c) Depending on the collating sequence, two strings may com-
          pare as equal even if they are of different lengths or con-
          tain different sequences of characters. When the operations
          MAX, MIN, DISTINCT, references to a grouping column, and the
          UNION, EXCEPT, and INTERSECT operators refer to character
          strings, the specific value selected by these operations from
          a set of such equal values is implementation-dependent.

If this behaviour must be avoided, you can reverse the columns as part of your EXCEPT:

SELECT 'TEST', REVERSE('TEST')
EXCEPT
SELECT 'TEST ', REVERSE('TEST ')

which gives the expected result, though is quite annoying especially if you're dealing with multiple columns.

The alternative would be to find a collating sequence with an alternate pad character or a no pad option set, though this seems to not exist in t-sql after a quick google.

Alternatively, you could terminate each column with a character and then substring it out in the end:

SELECT SUBSTRING(col,1,LEN(col) -1) FROM 
(
    SELECT 'TEST' + '^' as col
    EXCEPT 
    SELECT 'TEST ' + '^'
) results

Upvotes: 2

Related Questions