Brijesh Mishra

Reputation: 2748

Why is there a significant performance gain when inserting multiple rows in one command instead of one row per command?

I want to insert around 3000 records. With approach 1 it takes around 2 minutes to complete; with approach 2 the insert completes in less than a second. Approach 2 doesn't adhere to good practice, but it gives me a large performance gain. I would like to understand why approach 1 takes so much time, and whether there is a better way to do this.

Approach 1:

public static void InsertModelValue(DataSet employeData, int clsaId)
{
    var query = @"INSERT INTO employee (id, name)
                  VALUES (@id, @name)";
    using (var connection = GetOdbcConnection())
    {                      
        connection.Open();                
        var tran = connection.BeginTransaction();
        try
        {                   

            foreach (DataRow row in employeData.Tables[0].Rows)
            {                       
                using (var cmd = new OdbcCommand(query, connection, tran))
                {
                    cmd.Parameters.Add("@id", OdbcType.VarChar).Value = row["ID"];
                    cmd.Parameters.Add("@name", OdbcType.Int).Value = Convert.ToInt32(row["Name"]);
                    cmd.ExecuteNonQuery();
                }
            }
            tran.Commit();
        }
        catch
        {
            tran.Rollback();
            throw;
        }                      
    }
}

Approach 2:

public static void InsertModelValueInBulk(DataSet employeData, int clsaId, int batchSize)
{          
    string[] insertStatement = new string[batchSize];
    using (var connection = GetOdbcConnection())
    {
        connection.Open();
        var tran = connection.BeginTransaction();
        try
        {                               
            int j = 0;
            for (int i = 0; i < employeData.Tables[0].Rows.Count; i++)
            {
                var row = employeData.Tables[0].Rows[i];      
                var insertItem = string.Format(@"select '{0}',{1}", row["name"], Convert.ToInt32(row["ID"]));
                insertStatement[j] = insertItem;
                if (j % (batchSize-1) == 0 && j > 0)
                {
                    var finalQuery = @" INSERT INTO employee (id, name)
     " + String.Join(" union ", insertStatement);
                    using (var cmd = new OdbcCommand(finalQuery, connection, tran))
                    {
                        cmd.ExecuteNonQuery();
                    }
                    j = 0;
                    continue;
                }
                else
                {
                    j = j + 1;
                }
            }

            if (j > 0)
            {

                // j rows are pending at indices 0..j-1, so join exactly j of them.
                var finalQuery = @"INSERT INTO employee (id, name)
     " + String.Join(" union ", insertStatement, 0, j);
                using (var cmd = new OdbcCommand(finalQuery, connection, tran))
                {
                    cmd.ExecuteNonQuery();
                }
            }

            tran.Commit();
        }
        catch
        {
            tran.Rollback();
            throw;
        }
    }
}

Upvotes: 2

Views: 2226

Answers (1)

Eric Lippert

Reputation: 660128

You want to deposit three thousand dollars in your bank account. Which is faster:

  • wait for a teller
  • take a dollar out of your wallet
  • show your id to the teller
  • deposit the dollar
  • go to the end of the line
  • repeat the whole process 2999 more times, then go home.

or

  • wait for a teller
  • take three thousand dollars out of your wallet
  • show your id to the teller
  • deposit the three thousand dollars
  • go home

?

It should be fairly obvious that the first one is a lot slower than the second one. Now is it clear why the first technique is hundreds of times slower than the second?
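
In code terms, each trip to the teller is one command sent to the database. As a rough sketch of a middle ground, the helper below groups rows into multi-row INSERT statements while still using positional `?` placeholders rather than concatenated literals. `BatchInsertSketch` and `BuildBatches` are hypothetical names, not part of the question's code, and the sketch assumes the target database accepts the multi-row `VALUES (...), (...)` form, which not every database does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchInsertSketch
{
    // Hypothetical helper: yields one SQL text per batch, plus the values to
    // bind to its '?' placeholders, in placeholder order.
    // Assumption: the database supports multi-row VALUES lists.
    public static IEnumerable<(string Sql, List<object> Values)> BuildBatches(
        IReadOnlyList<(object Id, object Name)> rows, int batchSize)
    {
        // Because this is an iterator, the check runs when enumeration starts.
        if (batchSize < 1)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < rows.Count; start += batchSize)
        {
            // The last batch may be smaller than batchSize.
            int count = Math.Min(batchSize, rows.Count - start);

            // One "(?, ?)" group per row in this batch.
            string placeholders = string.Join(", ", Enumerable.Repeat("(?, ?)", count));

            // Flatten the rows into a single list: id, name, id, name, ...
            var values = new List<object>(count * 2);
            for (int i = start; i < start + count; i++)
            {
                values.Add(rows[i].Id);
                values.Add(rows[i].Name);
            }

            yield return ("INSERT INTO employee (id, name) VALUES " + placeholders, values);
        }
    }
}
```

With 3000 rows and a batch size of 500, this produces 6 commands instead of 3000. Each batch would then be executed as a single `OdbcCommand` inside the transaction, with the values added to `cmd.Parameters` in order. The usable batch size is an assumption to check against your database and driver, since many cap the number of rows or parameters per statement.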

Upvotes: 26
