Reputation: 5008
Itzik Ben-Gan explains in his book 'Inside Microsoft® SQL Server® 2008: T-SQL' how Appoach A works, because of the undocumented behavior of SQL Server in which it performs assignment for each result record from the SELECT
.
A well respected collegue and DB guru has suggested that Approach B is guaranteed to work. His argument is based on the recursive nature of COALESCE
versus the 'values expansion' method of CAST
.
Actually, I have no idea what 'values expansion' refers to (except for the fact that it is casting one value to another) or how it applies to this problem? Perhaps he misunderstood? Yes COALESCE
is recursive in a sense, but as far as I can see it is irrelevant and the desired result is produced because of the undocumented behavior of multiple assignment.
Is he correct? Please, no "Use FOR XML PATH
instead" answers!
Approach A
DECLARE @output VARCHAR(100);
SET @output = '';
SELECT @output = @output + CAST(COL_VCHAR AS VARCHAR(10)) + ';'
FROM someTable;
Approach B
DECLARE @output VARCHAR(100);
SELECT @output = COALESCE(@output + ', ', '') + COL_VCHAR
from someTable;
Upvotes: 4
Views: 269
Reputation: 280644
I believe they both work because of the same underlying (and undocumented) behavior. I am fairly confident that if you take a case where Itzik has demonstrated that Approach A fails, that Approach B will also fail in the same way and for the same reasons, in spite of your colleague's claims. I don't see how
DECLARE @output VARCHAR(100);
SELECT @output = COALESCE(@output + ', ', '') + COL_VCHAR
Is any different from:
DECLARE @output VARCHAR(100) = '';
SELECT @output = @output + ', ' + COL_VCHAR
-- or SELECT @output += ',' + COL_VCHAR
So what exactly does COALESCE
magically introduce? SQL Server isn't going to change its plan because of something you're doing to the output, AFAIK.
I use them all the time for dynamic SQL generation and don't recall ever seeing them fail, and I realize that's not your question. Unfortunately there isn't really a way to prove it unless you know of a case that fails for either approach.
I wrote a blog post about string concatenation a few months ago. It's not completely relevant to your problem but Rob Farley made a comment that may be considered another downside to the COALESCE
approach:
https://sqlblog.org/2011/03/08/t-sql-tuesday-16-this-is-not-the-aggregate-youre-looking-for
Upvotes: 4
Reputation: 239824
Since a conforming SQL implementation could evaluate all of the output rows simultaneously, in parallel, neither is guaranteed to work. That they happen to work, today, is an artifact of the current SQL Server implementation.
Your colleague is incorrect to assert that COALESCE
somehow changes the processing model.
I.e. a conforming implementation could effectively hand each potential row in the result set (having evaluated FROM
and WHERE
) to a separate thread, that then performs whatever processing is required in the SELECT
clause (before presumably recombining results in order to assess GROUP BY
, HAVING
and ORDER BY
).
There are no standard requirements governing access to variables during such processing, so each thread could "see" the same initial value of @output
(NULL
or ''
, depending on which form you're using), perform it's own update calculation, and assign that result value to @output
- the final value of @output
might then any of the individual row results - or anything else for that matter.
Upvotes: 2
Reputation: 44346
They work differently, the first example adds an additional character to the end of the output variable, and yes they work because of the same behaviour.
The first example will work if your column had been a numeric or date also. It would however only take the first 10 characters of any value in the column. The second example would not cut off the column.
There is absolutely no recursive nature of COALESCE. COALESCE is a way of replacing null values with replacement values.
EDIT:
I should add that i also prefer the second version, because you don't get that extra semicolon, in the end of output implying that more values will follow.
Upvotes: 0
Reputation: 1401
In your example, there is a way to get different results, and that is if NULL values are introduced into the data set. e.g.:
set nocount on
declare @test table (value int)
insert into @test values (10)
insert into @test values (20)
insert into @test values (null)
insert into @test values (40)
insert into @test values (50)
DECLARE @output VARCHAR(max);
-- Approach A
SET @output = '';
SELECT @output = @output + CAST(value AS VARCHAR(10)) + ';'
FROM @test
print 'Result from A:'
print isnull(@output, '{null}')
-- Approach B
set @output = ''
SELECT @output = COALESCE(@output + ', ', '') + cast(value as varchar(10))
from @test
print 'Result from B:'
print isnull(@output, '{null}')
set nocount off
Approach A will return null, whereas Approach B will return 40, 50
Upvotes: 0