Chris Hayes

Reputation: 4037

SqlBulkCopy slow as molasses

I'm looking for the fastest way to bulk-load data via C#. I have this script that does the job, but it's slow. I've read testimonials that SqlBulkCopy is the fastest.
It takes about 2.5 seconds per 1,000 records, and the files contain anywhere from 5,000 to 250,000 records. What are some of the things that can slow it down?

Table Def:

CREATE TABLE [dbo].[tempDispositions](
    [QuotaGroup] [varchar](100) NULL,
    [Country] [varchar](50) NULL,
    [ServiceGroup] [varchar](50) NULL,
    [Language] [varchar](50) NULL,
    [ContactChannel] [varchar](10) NULL,
    [TrackingID] [varchar](20) NULL,
    [CaseClosedDate] [varchar](25) NULL,
    [MSFTRep] [varchar](50) NULL,
    [CustEmail] [varchar](100) NULL,
    [CustPhone] [varchar](100) NULL,
    [CustomerName] [nvarchar](100) NULL,
    [ProductFamily] [varchar](35) NULL,
    [ProductSubType] [varchar](255) NULL,
    [CandidateReceivedDate] [varchar](25) NULL,
    [SurveyMode] [varchar](1) NULL,
    [SurveyWaveStartDate] [varchar](25) NULL,
    [SurveyInvitationDate] [varchar](25) NULL,
    [SurveyReminderDate] [varchar](25) NULL,
    [SurveyCompleteDate] [varchar](25) NULL,
    [OptOutDate] [varchar](25) NULL,
    [SurveyWaveEndDate] [varchar](25) NULL,
    [DispositionCode] [varchar](5) NULL,
    [SurveyName] [varchar](20) NULL,
    [SurveyVendor] [varchar](20) NULL,
    [BusinessUnitName] [varchar](25) NULL,
    [UploadId] [int] NULL,
    [LineNumber] [int] NULL,
    [BusinessUnitSubgroup] [varchar](25) NULL,
    [FileDate] [datetime] NULL
) ON [PRIMARY]

and here's the code

    private void BulkLoadContent(DataTable dt)
    {
        OnMessage("Bulk loading records to temp table");
        OnSubMessage("Bulk Load Started");
        using (SqlBulkCopy bcp = new SqlBulkCopy(conn))
        {
            bcp.DestinationTableName = "dbo.tempDispositions";
            bcp.BulkCopyTimeout = 0;
            foreach (DataColumn dc in dt.Columns)
            {
                bcp.ColumnMappings.Add(dc.ColumnName, dc.ColumnName);
            }
            bcp.NotifyAfter = 2000;
            bcp.SqlRowsCopied += new SqlRowsCopiedEventHandler(bcp_SqlRowsCopied);
            bcp.WriteToServer(dt);
            bcp.Close();
        }
    }

Upvotes: 7

Views: 15234

Answers (7)

Paul Suart

Reputation: 6713

I just had a similar experience. SqlBulkCopy was fine with one table, but with another it was at least 10x slower to copy the same number of records.

The slow table did have non-clustered indexes and a trigger, but disabling them didn't make any appreciable difference.

It turns out the slow table had six NVARCHAR(MAX) columns. The data I was actually inserting was all fairly short, so I changed the columns to NVARCHAR(500).

Et voilà! The performance of the slow table improved to match the other table - at least a 10x speed improvement.
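For reference, the change was just a column alteration along these lines (the table and column names here are placeholders, not my actual schema):

```sql
-- NVARCHAR(MAX) values are handled as LOB data, which bypasses some of
-- SqlBulkCopy's fast paths; capping the length keeps values in-row.
-- Placeholder names: adjust the table and column to your schema.
ALTER TABLE dbo.SlowTable ALTER COLUMN Notes NVARCHAR(500) NULL;
```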

Upvotes: 1

John Gietzen

Reputation: 49544

Setting BatchSize or NotifyAfter will both cause a slowdown.

If you have a data reader, you can wrap it with a row-counting data reader to get the progress.

using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
{
    bulkCopy.EnableStreaming = true;
    bulkCopy.BulkCopyTimeout = 0;
    using (var reader = new MyRowCountDataReader(dataReader))
    {
        reader.NotifyAfter = 1000;
        reader.RowRead += (sender, args) => { ... };
        bulkCopy.WriteToServer(reader);
    }
}

Upvotes: 1

Shannon Holsinger

Reputation: 2341

I've noticed that flushing large datasets is initially much faster but slows down substantially over time. I've found a modest performance increase with a buffered approach, feeding SqlBulkCopy just a few thousand records at a time on the same connection. It seems to keep the per-batch transaction time down, which improves throughput over a long run. In my solution, the un-buffered method saves about 5,000,000 records in the time it takes this method to save about 7,500,000 records of the same type to the same DB. Hope this helps someone.

public void flush_DataTable(DataTable dt, string tableName)
{
    // My incoming DataTables have a million or so rows each and slow down
    // over time when flushed in one shot; chunking keeps each batch fast.
    const int bufferSize = 10000;

    if (dt.Rows.Count < bufferSize)
    {
        flushDataTable(dt, tableName); // perform a non-buffered flush
        return;
    }

    using (SqlConnection conn = getConn())
    {
        conn.Open();
        // The loop covers the final partial chunk as well, so no rows are
        // dropped when Rows.Count isn't an exact multiple of bufferSize.
        for (int lowBuffer = 0; lowBuffer < dt.Rows.Count; lowBuffer += bufferSize)
        {
            int bufferHigh = Math.Min(lowBuffer + bufferSize, dt.Rows.Count);
            using (SqlBulkCopy s = new SqlBulkCopy(conn))
            {
                s.BulkCopyTimeout = 900;
                s.DestinationTableName = tableName;
                s.BatchSize = bufferSize;
                s.EnableStreaming = true;
                foreach (DataColumn column in dt.Columns)
                    s.ColumnMappings.Add(column.ColumnName, column.ColumnName);

                // Copy this chunk into its own DataTable and flush it.
                DataTable bufferedTable = dt.Clone();
                for (int bu = lowBuffer; bu < bufferHigh; bu++)
                    bufferedTable.ImportRow(dt.Rows[bu]);
                s.WriteToServer(bufferedTable);
            }
        }
    }
}

Upvotes: 1

playful

Reputation: 1774

The IDataReader implementation I suggested here: How to implement IDataReader? may help you. I used it with SqlBulkCopy as follows:

using (MyFileDataReader reader = new MyFileDataReader(@"C:\myfile.txt"))
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
{
    bulkCopy.DestinationTableName = "[my_table]";
    bulkCopy.BatchSize = 10000;

    bulkCopy.WriteToServer(reader);
}

Upvotes: 0

zszep

Reputation: 4458

If you have lots of data, setting the BatchSize to a reasonably large number might help:

bcp.BatchSize = 10000;

Upvotes: 4

Jipy

Reputation: 115

Things that can slow down the bulk copy:

- Full-text indexes on the table
- Triggers on insert
- Foreign-key constraints
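If any of those apply, here's a sketch of temporarily disabling them around the load (the trigger name is a placeholder; adjust to your schema):

```sql
-- Turn off the trigger and constraint checks before the bulk copy.
DISABLE TRIGGER trgDispositionsInsert ON dbo.tempDispositions;
ALTER TABLE dbo.tempDispositions NOCHECK CONSTRAINT ALL;

-- ... run SqlBulkCopy here ...

-- Re-enable afterwards; WITH CHECK re-validates the loaded rows.
ALTER TABLE dbo.tempDispositions WITH CHECK CHECK CONSTRAINT ALL;
ENABLE TRIGGER trgDispositionsInsert ON dbo.tempDispositions;
```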

Upvotes: 1

Frederik

Reputation: 2971

Do you have any indexes, triggers or constraints on that table?

Any of those will cause slowdowns on insert - a clustered index especially. When blasting in the amounts of data you're doing, it's best to drop the indexes first and re-apply them afterwards.

A good post about it is here: What's the fastest way to bulk insert a lot of data in SQL Server (C# client)
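A sketch of that pattern, assuming a single non-clustered index (the index name is a placeholder):

```sql
-- Disable non-clustered indexes before the load; disabled indexes are not
-- maintained during inserts.
ALTER INDEX IX_tempDispositions_TrackingID ON dbo.tempDispositions DISABLE;

-- ... run SqlBulkCopy here ...

-- REBUILD re-creates the index from the loaded data and re-enables it.
ALTER INDEX IX_tempDispositions_TrackingID ON dbo.tempDispositions REBUILD;
```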

Upvotes: 8
