DK_

Reputation: 45

SQL Server INSERT Performance (SQL, Azure SQL Database)

I have two tables: table 'A' contains approximately 400k rows and table 'B' contains 12k rows. I need to select ~350k rows from table A and INSERT them into table B.

I do this operation in a stored procedure (because I need to perform a number of other tasks as well):

INSERT INTO B ("fields")
    SELECT "fields"
    FROM A
    INNER JOIN @TempTable AS T
        ON "join condition" -- this join is needed to filter the records in table A
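
For illustration, a fuller version of what that statement could look like using the columns from the table definitions below. The column mapping and the shape of @TempTable are assumptions, not the asker's actual code (the types happen to line up this way):

    DECLARE @TempTable TABLE ([Field1] uniqueidentifier PRIMARY KEY);  -- assumed shape

    INSERT INTO [dbo].[B]
        ([Field1], [Field2], [Field3], [Field4], [Field5], [Field6], [Field7])
    SELECT A.[Field1], A.[Field2], A.[Field3],
           A.[Field7],   -- tinyint       -> B.Field4
           A.[Field4],   -- nvarchar(max) -> B.Field5
           A.[Field5],   -- bit           -> B.Field6
           A.[Field6]    -- int           -> B.Field7
    FROM [dbo].[A] AS A
    INNER JOIN @TempTable AS T
        ON T.[Field1] = A.[Field1];  -- filters A down to the wanted rows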

Table structure

Table A:

CREATE TABLE [dbo].[A]
(
    [Field1] [uniqueidentifier] NOT NULL,
    [Field2] [int] NOT NULL,
    [Field3] [uniqueidentifier] NOT NULL,
    [Field4] [nvarchar](max) NULL,
    [Field5] [bit] NOT NULL,
    [Field6] [int] NULL,
    [Field7] [tinyint] NULL,

    CONSTRAINT [PK_A] 
        PRIMARY KEY CLUSTERED ([Field1] ASC, [Field2] ASC, [Field3] ASC)
                WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

ALTER TABLE [dbo].[A] WITH CHECK 
    ADD CONSTRAINT [FK_...] 
        FOREIGN KEY([Field2]) REFERENCES [dbo].[...] ([Id])
GO

ALTER TABLE [dbo].[A] WITH CHECK 
    ADD CONSTRAINT [...] 
        FOREIGN KEY([Field3]) REFERENCES [dbo].[...] ([Id])
GO

ALTER TABLE [dbo].[A] WITH CHECK 
    ADD CONSTRAINT [...] 
        FOREIGN KEY([Field1]) REFERENCES [dbo].[A] ([Id])
GO

Table B:

CREATE TABLE [dbo].[B]
(
    [Field1] [uniqueidentifier] NOT NULL,
    [Field2] [int] NOT NULL,
    [Field3] [uniqueidentifier] NOT NULL,
    [Field4] [tinyint] NULL,
    [Field5] [nvarchar](max) NULL,
    [Field6] [bit] NOT NULL,
    [Field7] [int] NULL,

    CONSTRAINT [PK_B] 
        PRIMARY KEY CLUSTERED ([Field1] ASC, [Field2] ASC, [Field3] ASC)
                WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

ALTER TABLE [dbo].[B] WITH CHECK 
    ADD CONSTRAINT [...] 
        FOREIGN KEY([Field2]) REFERENCES [dbo].[...] ([Id])
GO

ALTER TABLE [dbo].[B] WITH CHECK 
    ADD CONSTRAINT [...] 
        FOREIGN KEY([Field3]) REFERENCES [dbo].[...] ([Id])
GO

ALTER TABLE [dbo].[B] WITH CHECK 
    ADD CONSTRAINT [...] 
        FOREIGN KEY([Field1]) REFERENCES [dbo].[...] ([Id])
GO

Infrastructure:

With this configuration (Azure SQL Database, S1 tier, 20 DTU), the insert operation takes 1:55 min. Running just the SELECT without the INSERT takes only 3-5 sec, so the problem is clearly with the inserting.

Solutions that I have already tried:

1. I removed all indexes before inserting. That improves things: the insert itself takes only 45 sec. But the indexes still have to be recreated after inserting, and recreation takes ~1 min, so we end up at the same ~1:55 min. And 45 sec is still a long time. (A sketch of this pattern follows below.)
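
For reference, a minimal sketch of the disable/rebuild pattern, assuming a hypothetical nonclustered index name (none are shown in the anonymized DDL below; the clustered primary key itself cannot be disabled without making the table inaccessible):

    -- Hypothetical index name: disable each nonclustered index on B before loading
    ALTER INDEX [IX_B_Field2] ON [dbo].[B] DISABLE;

    -- ... run the INSERT ... SELECT here (the ~45 sec step) ...

    -- REBUILD re-enables the index; this recreation is the ~1 min step
    ALTER INDEX [IX_B_Field2] ON [dbo].[B] REBUILD;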

2. I tried to insert in batches (of 5000 rows), but that only reduced the time to 1:35 min. (A sketch of such a loop follows below.)
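
A minimal sketch of such a batching loop, assuming @TempTable holds the Field1 keys and using NOT EXISTS to skip rows already copied; the column mapping is an assumption based on the matching types:

    DECLARE @BatchSize int = 5000;

    WHILE 1 = 1
    BEGIN
        -- Copy the next batch of rows not yet present in B
        INSERT INTO [dbo].[B] ([Field1], [Field2], [Field3], [Field6])
        SELECT TOP (@BatchSize)
               A.[Field1], A.[Field2], A.[Field3],
               A.[Field5]   -- bit column; the nullable columns are elided here
        FROM [dbo].[A] AS A
        INNER JOIN @TempTable AS T
            ON T.[Field1] = A.[Field1]
        WHERE NOT EXISTS (SELECT 1
                          FROM [dbo].[B] AS B2
                          WHERE B2.[Field1] = A.[Field1]
                            AND B2.[Field2] = A.[Field2]
                            AND B2.[Field3] = A.[Field3]);

        IF @@ROWCOUNT < @BatchSize BREAK;  -- last (partial) batch copied
    END

Each INSERT here commits on its own, which keeps individual transactions and log usage small, but as noted it only shaved ~20 sec off the total.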

Additional info: we can't increase the DTU allocation significantly, because when the application runs in regular mode it never needs more than 80% of the current resources (S1, 20 DTU).

Execution plan (with indexes): [image]

Wait stats (with indexes): [image]

Execution plan (without indexes): [image]

Questions:

I have found some information regarding this problem, and possible solutions:

  1. Partition table 'B' and then use "partition switching" to move the data from the non-partitioned table 'A' into partitioned table 'B'.

  2. In-Memory OLTP. Can this help with this problem? Pros and cons?

  3. Or maybe you have your own suggestion for how I can proceed in this situation.

Thank you.

Upvotes: 2

Views: 2149

Answers (1)

David Browne - Microsoft

Reputation: 89091

The query runtime appears to be dominated by IO waits.

Here are the wait stats

[image: wait stats]

PAGEIOLATCH_EX is a wait to write to disk, PAGEIOLATCH_SH is a wait to read from disk, and LOG_RATE_GOVERNOR is essentially also an IO wait: waiting to write to the log file. The IO and log-write limits on a 20 DTU database are quite small, and the Standard-tier DTU model provisions only 1-4 IOPS per DTU, so that's under 100 IOPS.
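
If you want to verify this yourself, sys.dm_db_wait_stats is the database-scoped wait-stats DMV in Azure SQL Database:

    -- Top waits for this database since its wait stats were last cleared
    SELECT TOP (10)
           wait_type,
           waiting_tasks_count,
           wait_time_ms
    FROM sys.dm_db_wait_stats
    ORDER BY wait_time_ms DESC;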

So you can either:

  1. Write less data

- by eliminating columns, especially the nvarchar(max) column if it's large

- by compressing the data using page compression or a clustered columnstore index, or by using the COMPRESS T-SQL function for the nvarchar(max) column if it is large (see the sketch after this list)

or

  2. Provide more resources

- by scaling up to a higher DTU or vCore configuration, or to a serverless configuration with elastic scale

- by moving to Hyperscale, which provides 100 MB/s of log throughput at every service level

- by moving this database into an elastic pool, where it can share a larger pool of resources with other databases
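
As a minimal sketch of two of the compression options from point 1 (the added column name is hypothetical, and COMPRESS returns GZip-compressed varbinary, so readers have to CAST(DECOMPRESS(...)) back to nvarchar(max)):

    -- Option A: rebuild the clustered index with page compression
    ALTER INDEX [PK_B] ON [dbo].[B]
        REBUILD WITH (DATA_COMPRESSION = PAGE);

    -- Option B: compress the large value at insert time into a
    -- hypothetical varbinary(max) column replacing the nvarchar(max) one
    ALTER TABLE [dbo].[B] ADD [Field5Compressed] varbinary(max) NULL;

    INSERT INTO [dbo].[B] ([Field1], [Field2], [Field3], [Field6], [Field5Compressed])
    SELECT A.[Field1], A.[Field2], A.[Field3], A.[Field5], COMPRESS(A.[Field4])
    FROM [dbo].[A] AS A;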

Table partitioning won't reduce the amount of writes. And In-Memory OLTP is only available in the Premium/Business Critical tier, which already has higher IOPS.

Upvotes: 3
