Nick

Reputation: 597

Performance issues inserting into SQL

I wrote a program some time ago that delimits and reads in pretty big text files. The program works, but the problem is that it basically freezes the computer and takes a long time to finish. On average, each text file has around 10K to 15K lines, and each line represents a new row in a SQL table.

The way my program works is that I first read all of the lines (this is where the delimiting happens) and store them in an array; after that, I go through each array element and insert it into the SQL table. This is all done at once, and I suspect it is eating up too much memory, which is causing the program to freeze the computer.

Here is my code for reading the file:

private void readFile()
{
    //String that will hold each line read from the file
    String line;

    try
    {
        //The using block disposes the reader even if an exception is thrown
        using (System.IO.StreamReader file = new System.IO.StreamReader(txtFilePath.Text))
        {
            while (!file.EndOfStream)
            {
                line = file.ReadLine();

                if (!string.IsNullOrWhiteSpace(line))
                {
                    if (this.meetsCondition(line))
                    {
                        //Count lines that fail validation and skip them
                        badLines++;
                        continue;
                    } // end if
                    else
                    {
                        //Parse the line and add it to the collection
                        collection.readIn(line);
                        counter++;
                    } // end else
                } // end if

            } // end while
        } // end using
    } // end try

    catch (Exception exceptionError)
    {
        //Don't leave this empty: a swallowed exception hides failures
        System.Diagnostics.Debug.WriteLine(exceptionError);
        throw;
    } // end catch
} // end readFile

Code for inserting:

for (int i = 0; i < counter; i++)
{
    //Iterate through the collection array from the first index to the end,
    //inserting each element into our SQL table
    //if (!idS.Contains(collection.getIdItems(i)))
    //{
        da.InsertCommand.Parameters["@Id"].Value = collection.getIdItems(i);
        da.InsertCommand.Parameters["@Date"].Value = collection.getDateItems(i);
        da.InsertCommand.Parameters["@Time"].Value = collection.getTimeItems(i);
        da.InsertCommand.Parameters["@Question"].Value = collection.getQuestionItems(i);
        da.InsertCommand.Parameters["@Details"].Value = collection.getDetailsItems(i);
        da.InsertCommand.Parameters["@Answer"].Value = collection.getAnswerItems(i);
        da.InsertCommand.Parameters["@Notes"].Value = collection.getNotesItems(i);
        da.InsertCommand.Parameters["@EnteredBy"].Value = collection.getEnteredByItems(i);
        da.InsertCommand.Parameters["@WhereReceived"].Value = collection.getWhereItems(i);
        da.InsertCommand.Parameters["@QuestionType"].Value = collection.getQuestionTypeItems(i);
        da.InsertCommand.Parameters["@AnswerMethod"].Value = collection.getAnswerMethodItems(i);
        da.InsertCommand.Parameters["@TransactionDuration"].Value = collection.getTransactionItems(i);
        da.InsertCommand.ExecuteNonQuery();
    //}

    //Report progress so the UI can update its progress bar (i + 1 rows done)
    _worker.ReportProgress(i + 1);

} // end for

Upvotes: 2

Views: 143

Answers (5)

D Stanley

Reputation: 152634

As mentioned, using SqlBulkCopy will be faster than inserting one-by-one, but there are other things that you could look at:

  • Is there a clustered index on the table? If so, will you be inserting rows with values that fall in the middle of that index? It's much more efficient to add values at the end of a clustered index, since otherwise SQL Server has to rearrange data to insert them in the middle (this applies only to CLUSTERED indexes). One example I've seen used SSN as a clustered primary key. Since SSNs are distributed randomly, you end up rearranging the physical structure on virtually every insert. Having a date as part of the clustered key may be OK if you are MOSTLY inserting data at the end (e.g. adding daily records).
  • Are there a lot of indexes on that table? It may be more efficient to drop the indexes, add the data, and re-create the indexes after the inserts (or just drop indexes you don't need); see the sketch after this list.
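
A rough sketch of the drop-and-re-create idea, assuming the usual System.Data.SqlClient usings, and that the destination table has a nonclustered index named IX_Notes_Date (the table and index names here are hypothetical):

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();

    // Disable the index so the inserts don't have to maintain it
    using (SqlCommand disable = new SqlCommand(
        "ALTER INDEX IX_Notes_Date ON destinationTable DISABLE", connection))
    {
        disable.ExecuteNonQuery();
    }

    // ... perform the inserts / bulk copy here ...

    // Rebuild the index once all the rows are in place
    using (SqlCommand rebuild = new SqlCommand(
        "ALTER INDEX IX_Notes_Date ON destinationTable REBUILD", connection))
    {
        rebuild.ExecuteNonQuery();
    }
}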

Upvotes: 1

Steve

Reputation: 216353

As an example of how to use the functionality of the SqlBulkCopy class (this is just pseudocode to convey the idea):

First, change your collection class to host an internal DataTable, and in the constructor define the schema used by your readIn method:

public class MyCollection
{
     private DataTable loadedData = null;

     public MyCollection()
     {
         loadedData = new DataTable();
         loadedData.Columns.Add("Column1", typeof(string));
         // ... and so on for every field expected
     }

     // A property to return the collected data
     public DataTable GetData
     {
        get { return loadedData; }
     }

     public void readIn(string line)
     {
          // Split the line into fields (use your actual delimiter here)
          string[] splittedLine = line.Split(',');

          DataRow r = loadedData.NewRow();
          r["Column1"] = splittedLine[0];
          // ... and so on for the remaining fields
          loadedData.Rows.Add(r);
     }
}

Finally, the code that uploads the data to your server:

using (SqlConnection connection = new SqlConnection(connectionString))
{
   connection.Open();
   using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
   {
      bulkCopy.DestinationTableName = "destinationTable";
      try
      {
         // GetData is a property, so no parentheses
         bulkCopy.WriteToServer(collection.GetData);
      }
      catch (Exception ex)
      {
         Console.WriteLine(ex.Message);
      }
   }
}
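
Note that when no mappings are specified, WriteToServer maps columns by ordinal position; if the DataTable's columns don't line up with the destination table's, populate bulkCopy.ColumnMappings before the call.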

Upvotes: 1

dazedandconfused

Reputation: 3186

+1 for SqlBulkCopy as others have stated, but be aware that it requires INSERT permission. If you work in a strictly controlled environment, as I do, where you aren't allowed to use dynamic SQL, an alternative approach is to have your stored procedure take a table-valued parameter. That way you can still pass in chunks of records and have the proc do the actual inserting.
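
A minimal sketch of that approach, reusing the DataTable idea from the other answers (the type name dbo.QuestionRows, the proc name dbo.InsertQuestions, and the @Rows parameter are all hypothetical):

// T-SQL side (hypothetical names):
//   CREATE TYPE dbo.QuestionRows AS TABLE (Id int, Question nvarchar(max) /* ...remaining columns... */)
//   CREATE PROCEDURE dbo.InsertQuestions @Rows dbo.QuestionRows READONLY
//   AS INSERT INTO destinationTable SELECT * FROM @Rows

using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("dbo.InsertQuestions", connection))
{
    command.CommandType = CommandType.StoredProcedure;

    // Pass the whole DataTable in one round trip as a table-valued parameter
    SqlParameter rows = command.Parameters.AddWithValue("@Rows", collection.GetData);
    rows.SqlDbType = SqlDbType.Structured;
    rows.TypeName = "dbo.QuestionRows";

    connection.Open();
    command.ExecuteNonQuery();
}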

Upvotes: 1

Rune Grimstad

Reputation: 36340

If you can map your collection to a DataTable, then you could use SqlBulkCopy to import your data. SqlBulkCopy is the fastest way to import data from .NET into SQL Server.

Upvotes: 2

Jakub Konecki

Reputation: 46018

Use the SqlBulkCopy class for bulk inserts.

http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx

You will cut down the time to mere seconds.

Upvotes: 1
