JohnLBevan
JohnLBevan

Reputation: 24470

Updating a subset of data through a CTE

Question

I've just come across the concept of using update statements on CTEs. This seems a great approach, but I've not seen it used before, and the context in which I was introduced to it (i.e. uncovered it in some badly written code) suggests the author didn't know what they were doing. Is anyone aware of any reason not to perform updates on CTEs / any considerations which should be made when doing so (assuming the CTE gives some benefit; such as to allow you to update an arbitrary subset of data).

Full Info

I recently found some horrendous code in our production environment where someone had clearly been experimenting on ways to update a single row or data. I've tidied up the layout to make it readable, but have left the original logic as is.

CREATE PROCEDURE [dbo].[getandupdateWorkOrder] 
    -- Add the parameters for the stored procedure here
    -- @p1 xml Output

AS
BEGIN
     WITH T AS
     (
        SELECT XMLDoc
        , Retrieved 
        from [Transcode].[dbo].[WorkOrder] 
        where WorkOrderId IN 
        (
            SELECT TOP 1 WorkOrderId 
            FROM 
            (
                SELECT DISTINCT(WorkOrderId)
                , Retrieved
                FROM [Transcode].[dbo].[WorkOrder] 
                WHERE Retrieved = 0 
            ) as c
        )
        AND Retrieved = 0
    )

    UPDATE T
    SET Retrieved = 1
    Output inserted.XMLDoc 

END

I can easily update this to the below without affecting the logic:

CREATE PROCEDURE [dbo].[GetAndUpdateWorkOrder] 
AS
BEGIN
     WITH T AS
     (
        SELECT top 1 XMLDoc
        , Retrieved 
        from [WorkOrder] 
        where Retrieved = 0
    )

    UPDATE T
    SET Retrieved = 1
    Output inserted.XMLDoc 

END

However the code also introduced me to a new concept; that you could update CTEs / see those updates in the underlying tables (I'd previously assumed that CTEs were read only in-memory copies of the data selected from the original table, and thus not possible to amend). Had I not seen the original code, but needed something which behaved like this I'd have written it as follows:

CREATE PROCEDURE [dbo].[GetAndUpdateWorkOrder] 
AS
BEGIN

    UPDATE [WorkOrder] 
    SET Retrieved = 1
    Output inserted.XMLDoc 
    where Id in 
    (
        select top 1 Id
        from [WorkOrder]
        where Retrieved = 0
        --order by Id --I'd have included this too; but not including here to ensure my rewrite is *exactly* the same as the original in terms of functionality, including the unpredictable part (the bonus of not including this is a performance benefit; though that's negligible given the data in this table)
    )

END

The code which performs the update via the CTE looks much cleaner (i.e. you don't even need to rely on a unique id for this to work). However because the rest of the original code is badly written I'm apprehensive about this new technique, so want to see what the experts say about this approach before adding it to my arsenal.

Upvotes: 0

Views: 266

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1270401

Updating CTEs is fine. There are limitations on the subqueries that you can use (such as no aggregations).

However, you have a misconception about CTEs in SQL Server. They do not create in-memory tables. Instead, they operate more like views, where the code is included in the query. The overall query is then optimized. Note: this behavior differs from other databases and, there is no way to override this, even with a hint.

This is an important distinction. If you have a complex CTE and use it more than once, then it will typically execute for each reference in the overall query.

Upvotes: 2

Code Different
Code Different

Reputation: 93181

Updating through CTEs are fine. It's especially handy when you have to deal with window functions. For example, you can use this query to give the top 10 performing employees in each department a 10% raise:

WITH TopPerformers AS
(
    SELECT  DepartmentID, EmployeeID, Salary,
            RANK() OVER (PARTITION BY DepartmentID ORDER BY PerformanceScore DESC) AS EmployeeRank
    FROM    Employees       
)

UPDATE TopPerformers
    SET     Salary = Salary * 1.1
    WHERE   EmployeeRank <= 10

(I'm ignoring the fact that there can be more than 10 employees per department in case many have the same score, but that's beyond the point here.)

Nice clean and easy to understand. I see CTE as a temporary view so I tend to follow what Microsoft says about updating views. See the Updatable Views section on this page.

Upvotes: 2

Related Questions