Reputation: 18304
On a table with over 10 million rows, which one of these UPDATE
statements likely to faster, or are they much of a muchness?
UPDATE table SET col1 = col2 WHERE col1 IS NULL;
UPDATE table SET col1 = col2 WHERE col1 IS NULL AND col2 IS NOT NULL;
Neither col1
nor col2
are keys or indexed. They are actually datetime
fields, if that changes anything.
Note: I'm fairly sure that the queries achieve the same result:
Query 1. will update col1
every time, including setting it to NULL
if col2 is NULL
.
Query 2. will update col1
only if col2
has a value, thereby leaving col1
as NULL
if col2 is NULL
.
They both change the data in the same way. One sets col1
to NULL
when col2 is NULL
, the other leaves col1
as NULL
when col2 is NULL
, but in all cases col1
was NULL
to begin with.
Upvotes: 0
Views: 1555
Reputation: 18304
In response to the various comments & answers about measuring rather than guessing, and also to the comments regarding the count of rows where col2 is NULL
vs col2
having a value of some kind, I ran 4 tests on a test table:
| Query 1 | Query 2
col2 almost all NULL | 5 min 27.96 sec | 4 min 20.38 sec
col2 almost all has values | 34 min 31.64 sec | 32 min 31.54 sec
Conclusion: Query 2 is marginally faster in both cases.
Upvotes: 0
Reputation: 5825
In my experience the second format is usually preferable, but of course it will depend on your data. It is important to note that indexes on the columns in question may not include null values, depending on the database provider and the way that the indexes are defined. Even if indexes are defined, they may not improve performance, and depending on your statistics they may not be used at all.
Upvotes: 0
Reputation: 5006
Performance is better measured than guessed. Yet here is my guess: It heavily depends on your usage of indeces, and their cardinality.
That means: if col2 has an index with thousands of different values and null beeing one of them, it´ll be faster than query 1. If there is no query and col2 almost always is null, it will be slower.
Best performance will be achieved if you have a two-column-index spanning both cols, and use both cols in the where-part.
Upvotes: 1
Reputation: 3956
It actually depends on different factors.
UPDATE table SET col1 = col2;
Will do a full table scan and modification in any case.
These queries
UPDATE table SET col1 = col2 WHERE col1 IS NULL;
UPDATE table SET col1 = col2 WHERE col1 IS NULL AND col2 IS NOT NULL;
will result in faster operation, if col1 and col2 are indexed.
You can get information about what is done in which case if you execute
EXPLAIN UPDATE table SET col1 = col2;
EXPLAIN UPDATE table SET col1 = col2 WHERE col1 IS NULL;
EXPLAIN UPDATE table SET col1 = col2 WHERE col1 IS NULL AND col2 IS NOT NULL;
Upvotes: 0