Reputation: 8988
I have a routine in our .NET web application that allows a user on our platform to clear their account (i.e. delete all their data). This routine runs in a stored procedure and essentially loops through the relevant data tables and clears down all the various items they have created.
The stored procedure looks something like this.
ALTER procedure [dbo].[spDeleteAccountData](
@accountNumber varchar(30) )
AS
BEGIN
SET ANSI_NULLS ON ;
SET NOCOUNT ON;
BEGIN TRAN
BEGIN TRY
DELETE FROM myDataTable1 WHERE accountNumber = @accountNumber
DELETE FROM myDataTable2 WHERE accountNumber = @accountNumber
DELETE FROM myDataTable3 WHERE accountNumber = @accountNumber
//Etc.........
END TRY
BEGIN CATCH
//CATCH ERROR
END CATCH
IF @@TRANCOUNT > 0
COMMIT TRANSACTION;
SET ANSI_NULLS OFF;
SET NOCOUNT OFF;
END
The problem is that in some cases we can have over 10,000 rows on a table and the procedure can take up to 3-5 minutes. During this period all the other connections on the database get throttled causing time-out errors like the one below:
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Are there any general changes I can make to improve performance? I appreciate there are many unknowns related to the design of our database schema, but general best practice advice would be welcomed! I thought about scheduling this task to run during the early hours to minimise impact, but this is far from Ideal as the user wouldn't be able to regain access to their account until this task had been completed.
Additional Information:
Edit: 16:52 GMT
The delete proc affects around 20 tables. The largest one has approx 5 million records. The others have no more the 200,000, with some containing only 1000-2000 records.
Upvotes: 6
Views: 5902
Reputation: 4640
One way you might want to try is this:
Example:
DECLARE @DeletedRowsCount INT = 1, @BatchSize INT = 300;
WHILE (@DeletedRowsCount> 0) BEGIN
BEGIN TRANSACTION
DELETE TOP (@BatchSize) dbo.Table
FROM dbo.Table
WHERE Id = @PortalId;
SET @DeletedRowsCount = @@ROWCOUNT;
COMMIT;
WAITFOR DELAY '00:00:05';
END
I guess you can do the same without a SP as well. In fact, it might be better like that.
Upvotes: 1
Reputation: 5999
It could be worth switching the database into Read Committed Snapshot mode. This will have a performance impact, how much depends on your application.
In Read Committed Snapshot mode, writers and readers no longer block each other, although writers still block writers. You don't say what sort of activity on the table is getting prevented by the delete, so it's a little hard to say if this will help?
http://msdn.microsoft.com/en-us/library/ms188277(v=sql.105).aspx
Having said that, 3-5 minutes for a deletion on tables with ~10k rows seems absurdly slow. You mention foreign keys, are the foreign keys indexed? If not, deletion can cause table scans on the other end to make sure you're not breaking RI, so maybe check that first? What does SQL Server Profiler say for reads/writes for these deletion queries?
Upvotes: 0
Reputation: 27862
SqlCommand.CommandTimeout is the short answer. Increase its value.
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.commandtimeout.aspx
Note, the Connection Timeout is not the same thing as the CommandTimeout.
...
Do you have an index on "accountNumber" on each table?
You could have a clustered key on the surrogate-key of the table, but not the "accountNumber".
...
Basically, you're gonna have to look at the execution plan (or post the execution plan) here.
But here is some "starter code" for trying an index on that column(s).
if exists (select * from dbo.sysindexes where name = N'IX_myDataTable1_accountNumber' and id = object_id(N'[dbo].[myDataTable1]'))
DROP INDEX [dbo].[myDataTable1].[IX_myDataTable1_accountNumber]
GO
CREATE INDEX [IX_myDataTable1_accountNumber] ON [dbo].[myDataTable1]([accountNumber])
GO
Upvotes: 0
Reputation: 7267
If you have an index on the accountNumber field then I guess the long time for deletion is due to locks (generated by other processes) or to foreign keys affected by the respective tables.
Off course purists will blame me for the latter but I had been using this a lot of times when need arises.
Upvotes: 1
Reputation: 754518
Do you have an index on accountNumber
in all tables ?
Seeing that you delete using a WHERE
clause by that column, this might help.
Another option (and probably even better solution) would be to schedule deletion operations at night, e.g. when user selects to delete his account, you're only setting a flag, and a delete job runs at night actually deleting those accounts flagged for deletion.
Upvotes: 4