Anish Patel
Anish Patel

Reputation: 767

Updating values to a table based on another table slow

I wanted to know why my select statement is orders of magnitude faster than my update statement when updating values from one table to another as opposed to selecting values from one table to another. The example below illustrates my problem, I can run my select statement very quickly

select customers.name, countries.country
from customers
inner join countries on customers.countrycode = countries.countrycode

However the update statement that I've written (which I'm fairly sure is the same thing) takes orders of magnitude longer to complete.

update customers
set customers.country = countries.country
from customers
inner join countries on customers.countrycode = countries.countrycode

Any suggestions?

UPDATE:

Below is the plan for the SELECT statement

  |--Hash Match(Inner Join, HASH:([countries].[countrycode])=([Testing].[dbo].[customers].[countrycode]), RESIDUAL:(@countries.[countrycode] as [countries].[countrycode]=[Testing].[dbo].[customers].[countrycode]))
       |--Table Scan(OBJECT:(@countries AS [countries]))
       |--Table Scan(OBJECT:([Testing].[dbo].[customers]))  

Below is the plan for the UPDATE statement

  |--Table Update(OBJECT:([Testing].[dbo].[Customers]), SET:([Testing].[dbo].[Customers].[Country] = @countries.[country] as [countries].[country]))
       |--Top(ROWCOUNT est 0)
            |--Stream Aggregate(GROUP BY:([Bmk1000]) DEFINE:([countries].[country]=ANY(@countries.[country] as [countries].[country])))
                 |--Nested Loops(Inner Join, WHERE:(@countries.[countrycode] as [countries].[countrycode]=[Testing].[dbo].[Customers].[countrycode]))
                      |--Table Scan(OBJECT:([Testing].[dbo].[Customers]))
                      |--Table Scan(OBJECT:(@countries AS [countries]))

Upvotes: 1

Views: 693

Answers (3)

Quassnoi
Quassnoi

Reputation: 425491

First, UPDATE alters the table. This means heavy I/O on transaction log and on the datafiles themselves if a checkpoint occurs during the update.

A DML operation is no way faster than a query.

Second, how do you measure the speed? SQL Server can return first records much sooner (and even unproportionally sooner) than all records, depending on the query plan used and how is the cache populated.

Update:

From your plan I can see that no UNIQUE key is defined for countries.countrycode.

SQL Server has to find a single value for each countrycode to assign to customers (even there is a single value in fact) and does a Stream Aggregate for this.

Create a PRIMARY KEY on countries.countrycode.

Upvotes: 3

Gabriel Guimarães
Gabriel Guimarães

Reputation: 2744

Select has (if data is not already on memory) I/O out of Disk.

Update has Read I/O, Write to Log I/O and Write to memory I/O. So I guess it should take longer anyway.

But you should try to create a clustered index if you can choose a slim clustered index key, that would probably speed things up.

Upvotes: 0

JNK
JNK

Reputation: 65177

Indexes

If you have multiple indexes on customers, every updated row updates every index. This is a resource-intensive process. Indexes speed up selects but slow down updates/inserts (as a rule).

Upvotes: 1

Related Questions