paparazzo

Reputation: 45106

Insert Speed Deteriorates

The basic problem is that the insert rate deteriorates as data is loaded.

But I still have deteriorating insert speed.

During the load there is a limited number of users, so I can't take down the PK.

The data is loaded in the order of the PK but that index still fragments.

Composite PK of Int, TinyInt, String.

Before the load I rebuilt the index with a fill factor of 100%. The table has one other index, which is disabled.
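
Roughly, that prep step was along these lines (a sketch only; the non-clustered index name here is made up, since the disabled index is not named in the definition below):

-- Rebuild the clustered PK at 100% fill factor before the load.
ALTER INDEX [PK_docMVtext] ON [dbo].[docMVtext] REBUILD WITH (FILLFACTOR = 100);

-- The table's other index stays disabled for the load (hypothetical name).
ALTER INDEX [IX_docMVtext_other] ON [dbo].[docMVtext] DISABLE;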

Now, with 20,000 rows loaded into the table, the PK index already has 4% fragmentation. It continues to fragment and load speed deteriorates. I check fragmentation with DBCC SHOWCONTIG ('docMVtext', 'PK_docMVtext').
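
For reference, this is the check I run, plus a roughly equivalent query against sys.dm_db_index_physical_stats (a sketch; column list trimmed):

DBCC SHOWCONTIG ('docMVtext', 'PK_docMVtext');

-- Roughly the same information from the DMV.
SELECT index_id, avg_fragmentation_in_percent, fragment_count, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.docMVtext'), NULL, NULL, 'LIMITED');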

The Extent Scan Fragmentation is also high; maybe I should open a second question for that.

I don't think it is the varchar(600) value being part of the PK, as there is a sister table, docSVtext, whose PK is only on the int and tinyint columns, and it suffers from the same problem.

I created this database from another via backup/restore. I needed the same configuration tables but not the same data tables, so I deleted the data from the data tables and ran a shrink with TRUNCATEONLY.
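
The shrink was along these lines (a sketch, using the database name from the script below):

-- Release unused space at the end of the files without moving pages.
DBCC SHRINKDATABASE (N'Gabe2a_ENCORE', TRUNCATEONLY);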

I am using INSERT ... VALUES (), (), () to load the data.

I thought that possibly the VALUES (), (), () form was changing the order, so I changed the .NET code to one insert per row and still got fragmentation on the PK, even with the data inserted in the order of the PK.
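
For clarity, the two insert shapes look roughly like this (the values here are made up):

-- Multi-row form (original approach); rows listed in PK order.
INSERT INTO [dbo].[docMVtext] ([sID], [fieldID], [value])
VALUES (1, 1, 'alpha'), (1, 1, 'beta'), (1, 2, 'gamma');

-- One insert per row (second attempt); same PK order, one statement per row.
INSERT INTO [dbo].[docMVtext] ([sID], [fieldID], [value]) VALUES (1, 1, 'alpha');
INSERT INTO [dbo].[docMVtext] ([sID], [fieldID], [value]) VALUES (1, 1, 'beta');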

I have triple-checked that the data is inserted in the order of the PK.

In the .NET app I am using LINQ to sort the data prior to the insert. In the debugger I reviewed 40 rows and they were all sorted properly.

I even created a mirror table with an identity column followed by the same three columns and used the same insert on that mirror table to validate the insert order. When I select from the mirror table sorted by the identity column, the data is in sorted order. This is just a secondary test that I inserted the data sorted by the PK.
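
The mirror table was something like this (a sketch; the mirror table name is made up):

-- Mirror table used only to verify arrival order; the identity column records insert order.
CREATE TABLE [dbo].[docMVtext_mirror](
    [iden]    [int] IDENTITY(1,1) NOT NULL,
    [sID]     [int] NOT NULL,
    [fieldID] [tinyint] NOT NULL,
    [value]   [varchar](600) NOT NULL
);

-- If the load really was in PK order, this comes back already sorted by (sID, fieldID, value).
SELECT [sID], [fieldID], [value]
FROM [dbo].[docMVtext_mirror]
ORDER BY [iden];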

Below is the table definition. (Yes, I know the first paragraph states there are no FK constraints and this shows FK constraints. Removing the FK constraints did not help insert speed; the drop statements I used for that test are sketched after the definition.)

USE [Gabe2a_ENCORE]
GO

/****** Object:  Table [dbo].[docMVtext]    Script Date: 08/12/2012 20:13:35 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[docMVtext](
    [sID] [int] NOT NULL,
    [fieldID] [tinyint] NOT NULL,
    [value] [varchar](600) NOT NULL,
 CONSTRAINT [PK_docMVtext] PRIMARY KEY CLUSTERED 
(
    [sID] ASC,
    [fieldID] ASC,
    [value] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 100) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

ALTER TABLE [dbo].[docMVtext]  WITH CHECK ADD  CONSTRAINT [FK_docMVtext_docFieldDef] FOREIGN KEY([fieldID])
REFERENCES [dbo].[docFieldDef] ([ID])
GO

ALTER TABLE [dbo].[docMVtext] CHECK CONSTRAINT [FK_docMVtext_docFieldDef]
GO

ALTER TABLE [dbo].[docMVtext]  WITH NOCHECK ADD  CONSTRAINT [FK_docMVtext_docSVsys] FOREIGN KEY([sID])
REFERENCES [dbo].[docSVsys] ([sID])
GO

ALTER TABLE [dbo].[docMVtext] CHECK CONSTRAINT [FK_docMVtext_docSVsys]
GO
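
For completeness, the FK removal test mentioned above was roughly this (the constraints can be re-added with the ALTER TABLE ... ADD CONSTRAINT statements above):

ALTER TABLE [dbo].[docMVtext] DROP CONSTRAINT [FK_docMVtext_docFieldDef];
ALTER TABLE [dbo].[docMVtext] DROP CONSTRAINT [FK_docMVtext_docSVsys];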

What baffles me is that after this initial load I parse and index the text to create a simple full-text search index. To load those tables I use the same strategy of sorting in memory and inserting in the order of the PK, and I get zero fragmentation of the PK there. I cannot figure out what is different about the initial load that causes this fragmentation of the PK.

I know people are not going to believe this, but the major bottleneck was in the first table.

The first code sample below is 10x faster than the second with 300,000 rows in the table. At 1.6 million rows the first is 30x faster. Serves me right for using a sloppy @@identity in the first place.

// Fast: append SCOPE_IDENTITY() to the insert and read it back in the same round trip.
SQLcmd.CommandText = commandText + ";  SELECT SCOPE_IDENTITY() ";
sID = int.Parse(SQLcmd.ExecuteScalar().ToString());


// Slow: run the insert, then issue a separate query for the identity value.
SQLcmd.CommandText = commandText;
rowsRet = SQLcmd.ExecuteNonQuery();
if (rowsRet == 1)
{
    // The FROM clause makes this return @@identity once per row in docSVsys,
    // so it gets slower as the table grows.
    commandText = "select @@identity from [docSVsys]";
    SQLcmd.CommandText = commandText;
    sID = int.Parse(SQLcmd.ExecuteScalar().ToString());
}
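
On the SQL side the difference comes down to roughly this (a sketch; SCOPE_IDENTITY() only returns a value when it runs in the same scope as the insert, which is why it is appended to the insert's command text above):

-- Fast: a single scalar, returned in the same round trip as the INSERT.
SELECT SCOPE_IDENTITY();

-- Slow as originally written: @@IDENTITY is one value, but the FROM clause repeats it
-- once per row in docSVsys, so the query and result set grow with the table.
SELECT @@IDENTITY FROM [docSVsys];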

Upvotes: 2

Views: 139

Answers (1)

Brian Dishaw

Reputation: 5825

Since this is a bulk insert for initialization, I would recommend dropping all indexes on the table. By doing this you get the data into the server as quickly as possible. Then apply the PK and non-clustered indexes once the data loading is completed. SQL Server will be able to handle arranging the data all at once, instead of each time a page reaches its fill factor.
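
A rough sketch of what I mean, using the table from the question (the exact index options are up to you):

-- Before the bulk load: drop the clustered PK (and any other indexes on the table).
ALTER TABLE [dbo].[docMVtext] DROP CONSTRAINT [PK_docMVtext];

-- ... bulk load the data into the heap ...

-- After the load: create the PK once; the sort and index build happen in a single pass.
ALTER TABLE [dbo].[docMVtext]
    ADD CONSTRAINT [PK_docMVtext] PRIMARY KEY CLUSTERED ([sID] ASC, [fieldID] ASC, [value] ASC)
    WITH (FILLFACTOR = 100);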

Upvotes: 1
