Firealem Erko

Reputation: 357

Online Schema Modification

I have a table called BigData on SQL Server. BigData is modified continuously by an app and can't be taken offline even for a second. Another requirement is that the table can't be locked exclusively by any process other than the app accessing it. The question is: how can I change one of the columns of BigData from bigint to int? Any suggestions?

Upvotes: 3

Views: 505

Answers (2)

Tim Biegeleisen

Reputation: 522007

I was recently asked a very similar question in a PayPal interview. This situation is common at banks and e-commerce companies, which have live databases that are in constant use.

The answer I gave was to:

  1. Create a new table based on the BigData table that looks identical except that the column has the new type.
  2. Create a trigger that fires on every update or insert made to the old table and applies the same update/insert to the new table.
  3. Start copying all records older than the moment the trigger went live from the old table to the new table. This step overlaps with the trigger, which covers the newer records.
  4. Once all the old records have been copied successfully, the new table holds every legacy record plus any records inserted or updated since the migration started.

After this, the new and old tables should have identical content, and both should be in sync with the running application. Now the old table can be dropped and the new table renamed to the old name. A brief outage (say, a few minutes) may be needed for this step, but that should be acceptable.
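A minimal T-SQL sketch of the shadow table, the sync trigger, and the final rename might look like the following. The names (dbo.BigData, dbo.BigData_New, Id, Payload) are assumptions for illustration, the trigger only handles INSERT/UPDATE as described above, and the cast assumes the existing values fit in an int:

```sql
-- Shadow table: same shape as dbo.BigData, but the target column is INT
-- instead of BIGINT (all names here are assumed for illustration).
CREATE TABLE dbo.BigData_New
(
    Id      BIGINT NOT NULL PRIMARY KEY,  -- assumed primary key column
    Payload INT    NOT NULL               -- was BIGINT on dbo.BigData
);
GO

-- Keep the shadow table in sync with every insert/update on the live table.
CREATE TRIGGER dbo.trg_BigData_Sync
ON dbo.BigData
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    MERGE dbo.BigData_New AS tgt
    USING (SELECT Id, CAST(Payload AS INT) AS Payload FROM inserted) AS src
        ON tgt.Id = src.Id
    WHEN MATCHED THEN
        UPDATE SET Payload = src.Payload
    WHEN NOT MATCHED THEN
        INSERT (Id, Payload) VALUES (src.Id, src.Payload);
END;
GO

-- Final cutover (the brief outage): swap the names inside one transaction.
BEGIN TRANSACTION;
    EXEC sp_rename 'dbo.BigData', 'BigData_Old';
    EXEC sp_rename 'dbo.BigData_New', 'BigData';
COMMIT;
```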

Upvotes: 5

Kaveh Hadjari

Reputation: 237

One way to avoid locking issues during the possibly long-running copy of the old data is to do it in batches, with a delay between batches so that normal usage can lock the table in the meantime. As @TimBiegeleisen also mentioned, this is simple if there is an identity column that is also the primary key, since we can then calculate which rows to copy from the old table and split the batches efficiently based on the identity order. The approach is still doable without an identity column, at least if there is a primary key.
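A minimal sketch of such a batched backfill, assuming the shadow table from the previous answer and an identity primary key Id on dbo.BigData (batch size and delay are placeholder values to tune):

```sql
-- Batched copy keyed on the (assumed) identity primary key Id.
DECLARE @BatchSize BIGINT = 10000;
DECLARE @LastId    BIGINT = 0;
DECLARE @MaxId     BIGINT = (SELECT MAX(Id) FROM dbo.BigData);

WHILE @LastId < @MaxId
BEGIN
    -- Copy one key range; rows already written by the sync trigger are skipped.
    INSERT INTO dbo.BigData_New (Id, Payload)
    SELECT s.Id, CAST(s.Payload AS INT)
    FROM dbo.BigData AS s
    WHERE s.Id > @LastId
      AND s.Id <= @LastId + @BatchSize
      AND NOT EXISTS (SELECT 1 FROM dbo.BigData_New AS t WHERE t.Id = s.Id);

    SET @LastId = @LastId + @BatchSize;

    -- Pause between batches so normal traffic can take its locks.
    WAITFOR DELAY '00:00:01';
END;
```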

Upvotes: 1
