Reputation: 45
I have two tables: table 'A' contains approximately 400k rows and table 'B' contains 12k rows. I need to select ~350k rows from table A and INSERT them into table B.
I do this operation in a stored procedure (because I need to do a bunch of other tasks):
INSERT INTO B ("fields")
SELECT "fields"
FROM A
INNER JOIN @TempTable AS T ON ... -- this join is needed to filter the records of table A
Table structure
Table A:
CREATE TABLE [dbo].[A]
(
    [Field1] [uniqueidentifier] NOT NULL,
    [Field2] [int] NOT NULL,
    [Field3] [uniqueidentifier] NOT NULL,
    [Field4] [nvarchar](max) NULL,
    [Field5] [bit] NOT NULL,
    [Field6] [int] NULL,
    [Field7] [tinyint] NULL,
    CONSTRAINT [PK_A]
        PRIMARY KEY CLUSTERED ([Field1] ASC, [Field2] ASC, [Field3] ASC)
        WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[A] WITH CHECK
ADD CONSTRAINT [FK_...]
FOREIGN KEY([Field2]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[A] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field3]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[A] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field1]) REFERENCES [dbo].[A] ([Id])
GO
Table B:
CREATE TABLE [dbo].[B]
(
    [Field1] [uniqueidentifier] NOT NULL,
    [Field2] [int] NOT NULL,
    [Field3] [uniqueidentifier] NOT NULL,
    [Field4] [tinyint] NULL,
    [Field5] [nvarchar](max) NULL,
    [Field6] [bit] NOT NULL,
    [Field7] [int] NULL,
    CONSTRAINT [PK_B]
        PRIMARY KEY CLUSTERED ([Field1] ASC, [Field2] ASC, [Field3] ASC)
        WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[B] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field2]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[B] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field3]) REFERENCES [dbo].[...] ([Id])
GO
ALTER TABLE [dbo].[B] WITH CHECK
ADD CONSTRAINT [...]
FOREIGN KEY([Field1]) REFERENCES [dbo].[...] ([Id])
GO
Infrastructure: Azure SQL Database, Standard tier S1 (20 DTU).
With this configuration, the insert operation takes 1:55 min. If I run only the SELECT, without the INSERT, it takes just 3-5 sec, so the problem is with the inserting.
Solutions that I have already tried:
1. I removed all indexes before inserting. That improves performance: the insert takes only 45 sec. But the indexes still have to be recreated after the insert, and recreating them takes ~1 min, so we end up at the same ~1:55 min. And 45 sec is still a long time.
2. I tried inserting in batches (of 5,000 rows); this reduced the time only to 1:35 min (a rough sketch of the batching loop is shown below).
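For clarity, here is a minimal sketch of the batching loop I mean. The column mapping between A and B and the @TempTable join key are assumptions for illustration, since the real names are anonymized here:

-- Batched INSERT ... SELECT: copies up to @BatchSize new rows per pass.
DECLARE @BatchSize int = 5000,
        @Rows int = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO dbo.B (Field1, Field2, Field3, Field4, Field5, Field6, Field7)
    SELECT TOP (@BatchSize)
           a.Field1, a.Field2, a.Field3,
           a.Field7, a.Field4, a.Field5, a.Field6    -- assumed column mapping (matched by type)
    FROM dbo.A AS a
    INNER JOIN @TempTable AS t ON t.Id = a.Field1    -- hypothetical join key
    WHERE NOT EXISTS (SELECT 1
                      FROM dbo.B AS b
                      WHERE b.Field1 = a.Field1
                        AND b.Field2 = a.Field2
                        AND b.Field3 = a.Field3);    -- skip rows copied in earlier passes

    SET @Rows = @@ROWCOUNT;
END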
Additional info: we can't increase the DTU allocation significantly, because in regular operation the application does not need more than 80% of the current resources (S1, 20 DTU).
Execution plan (with indexes): [screenshot]
Wait stats (with indexes): [screenshot]
Execution plan (without indexes): [screenshot]
I have found some information about this problem and some possible solutions:
1. Partitioning table 'B' and then using "partition switching" to move data from the non-partitioned table 'A' into the partitioned table 'B'.
2. In-Memory OLTP. Could this help with this problem? What are the pros and cons?
Or maybe you have your own suggestion for how I can proceed in this situation.
Thank you.
Upvotes: 2
Views: 2149
Reputation: 89091
The query runtime appears to be dominated by IO waits.
Here are the wait stats: [screenshot]
PAGEIOLATCH_EX is a wait to read a page from disk so it can be modified, PAGEIOLATCH_SH is a wait to read a page from disk, and LOG_RATE_GOVERNOR is essentially also an IO wait: waiting to write to the log file. The IO and log-write limits on a 20 DTU database are quite small, and the Standard-tier DTU model provisions only 1-4 IOPS per DTU, so that's under 100 IOPS.
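If you want to confirm which waits dominate, Azure SQL Database exposes a database-scoped wait-stats DMV; a quick check looks something like this:

-- Top cumulative waits for this database (Azure SQL Database uses
-- sys.dm_db_wait_stats rather than the server-scoped sys.dm_os_wait_stats).
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_db_wait_stats
ORDER BY wait_time_ms DESC;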
So you can either reduce the amount of IO the query requires:

- by eliminating columns, especially the nvarchar(max) column if it's large
- by compressing the data using Page Compression or a Clustered Columnstore index, or by using the COMPRESS T-SQL function for the nvarchar(max) column if it is large (see the sketch after this list)

or increase the IO capacity available to the database:

- by scaling to a higher DTU or vCore configuration, or a Serverless configuration with elastic scale
- by moving to Hyperscale, which provides 100MB/s log throughput at every service level
- by moving this database into an Elastic Pool, where it can share a larger pool of resources with other databases
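To make the compression options concrete, here is a rough sketch; B's name and columns come from the question, but treat the statements as illustrations to test, not a verified recipe:

-- Option 1: page-compress the target table (available at all Azure SQL Database tiers).
ALTER TABLE dbo.B REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Option 2: compress the nvarchar(max) payload on the way in. COMPRESS() returns
-- a GZIP varbinary(max), so this assumes B.Field5 is changed to varbinary(max),
-- and readers must CAST(DECOMPRESS(Field5) AS nvarchar(max)) to get the text back.
INSERT INTO dbo.B (Field1, Field2, Field3, Field5)
SELECT a.Field1, a.Field2, a.Field3,
       COMPRESS(a.Field4)    -- assumes A.Field4 is the large text column
FROM dbo.A AS a;

-- Option 3: a Clustered Columnstore index would replace the clustered PK,
-- which is a larger physical-design change; test it separately.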
Table partitioning won't reduce the amount of writes. And In-Memory OLTP is only available in the Premium/Business Critical tier, which already has higher IOPS.
Upvotes: 3