Rahul Misra
Rahul Misra

Reputation: 669

SQL Server - Table variable vs temporary table with a select statement with union

I need to create some concatenated columns from a complex select query which is a union of three selects

SELECT C1,C2, ... FROM Source1
UNION
SELECT C1,C2,... FROM Source2
UNION
SELECT C1,C2 from Source3

Now, instead of repeating the generation logic of my new column in all the three select statements, I thought of using a table variable to temporarily store the union results and add my column in a select from the table variable. So something like

DECLARE @tv TABLE
(C1 varchar(max),
C2 varchar(max)
.
.
.)

INSERT INTO @tv
SELECT C1,C2, ... FROM Source1
UNION
SELECT C1,C2,... FROM Source2
UNION
SELECT C1,C2 from Source3

SELECT 
C1,
C2,
CASE WHEN ...
ELSE ''
END CN
FROM @tv

I have read about performance considerations to be aware of when using table variables. The query above may sometimes generate a few thousand rows but on most occasions will generate a few hundred. Is it better to switch to temporary variable in this case? Will the UNION SELECT statements still be able to run in parallel?

Upvotes: 2

Views: 1766

Answers (2)

Dan Field
Dan Field

Reputation: 21641

Another option would be using a CTE, and the way you're describing this, this is pretty much what CTEs are for:

;WITH cte as
(
    SELECT C1,C2, ... FROM Source1
    UNION ALL
    SELECT C1,C2,... FROM Source2
    UNION ALL
    SELECT C1,C2 from Source3
) 
SELECT cte.*,
       (CASE WHEN ...
        ELSE ''
        END) as CN
FROM cte;

But a few things to consider around CTE vs table var vs temp table:

(tl;dr: CTEs are reusable within a single query, table variables and temp tables are reusable within many queries and have some different memory/indexing capabilities.)

  • CTEs are reusable within your query and can end up being a little cleaner than subqueries, but some people are just used to subqueries. In some cases, I've seen the optimizer do a better job on CTEs than with subqueries, but it usually pans out.
  • Table variables may be stored in memory or on disk; CTEs basically work this way too, but the big difference is that a table variable can have a primary key and unique keys added to it. You're not using that here, so no big deal. Another big difference is that table variables can be declared as reusable types - if this is a format that you use in a lot of places and want to maintain the structure of it, that's helpful. It's also helpful if you're using SQL 2014 or 2016, where table variables can be declared as OLTP in Memory Tables. Temp tables can't do that, and neither can CTEs - this is basically a very optimized version of the table for high contention scenarios that are better handled in memory with latchless structures rather than the normal SQL latching process. Table variables can also be passed as parameters to other stored procedures.
  • Temp tables can have any number of indexes added to them. They also can be declared as global temp tables. They're going to reside on disk, and they'll add work to your tempdb. They can't be declared as in memory OLTP tables. They can't be passed as parameters, but they can be shared between procedures (whether as global temp tables or as a temp table accessible within the same batch)
  • Both temp and table variables can have other DML statements run on them after the initial INSERT - if you need to update/delete/insert more data in them that can work. CTEs can't do that (if you run DML on a CTE, it will insert/update/delete the underlying table(s)).

Upvotes: 3

Gordon Linoff
Gordon Linoff

Reputation: 1269873

Putting the column generation logic in one place is a good idea. I would just do this with a subquery:

SELECT s.*,
       (CASE WHEN ...
        ELSE ''
        END) as CN
FROM (SELECT C1,C2, ... FROM Source1
      UNION ALL
      SELECT C1,C2,... FROM Source2
      UNION ALL
      SELECT C1,C2 from Source3
     ) s;

Note: Use UNION ALL instead of UNION, unless you specifically want to incur the overhead of removing duplicates.

Upvotes: 5

Related Questions