bgusach
bgusach

Reputation: 15153

SQL efficiency: update few columns vs all of them

I have to indentify if a row has changed and update it if so. In my programme I have a dictionary with the new values, and a second dictionary with the previous info of the row in the DB.

I have on mind two approaches:

Option A Loop over the dictionary comparing to the row values, and if any differs, break and update ALL the fields. I.e. the UPDATE statement would contain all the fields and their values.

The code (python in this case, but very easy to understand) would look like this:

for key, value in incoming_dictionary.iteritems():
    if old_dictionary[key] != value:
         update_row(incoming_dictionary)
         break

Which would produce an SQL statement like this:

UPDATE mytable SET field1=... field2=... fieldN=... WHERE = ...

Option B Loop over the dictionary comparing the values, and those that differ have to be stored in another dictionary, and at the end of the loop, update the row, restricting the UPDATE statement to those values that actually are different.

changed_fields = {}
for key, value in incoming_dictionary.iteritems():
    if old_dictionary[key] != value:
         changed_fields[key] = value

update_fields(changed_fields)

Which would produce a SQL like:

UPDATE mytable SET field1=... WHERE = ...

I would say the second one should be more efficient on the SQL level, but I am not quite sure. And it has a little bit of overhead on the programming level.

I am working with MS SQL Server, but my programme addresses different RBDMS. Thanks.

Upvotes: 1

Views: 505

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269693

There is no general answer to your question. Different databases perform updates differently. For instance, MySQL does not update records where there are no changes (the comparison is part of the update operation).

As a rule, the expense of an update is based on fetching the record, logging the update, and modifying indexes -- these are operations that go beyond changes to data in memory. As a guideline, it is more important to have an index that will fetch the appropriate record(s) quickly, rather than optimize the filling of the database. The latter is generally an in-memory operation.

These guidelines may not work in all cases. The presence of large fields that span multiple pages will affect performance, for instance. Updates to fields with indexes have their own performance implication.

Upvotes: 2

Related Questions