Reputation: 5507
I have an IEnumerable entity which holds around 100 thousand records. I wanted to perform Parallel.ForEach to insert these data.
Say here is the class what I have: Employee.cs
SqlConneciton conn = base.GetConnection();
conn.open();
IEnumerable<Employee> employeeList = GetListofEmployeesFromDB();
Parallel.ForEach(employeeList
, employee =>
{
employee.add(conn, sqlTransaction);
});
Empployee.cs
{
public void add(SqlConnection conn, SqlTransaction sqlTransaction)
{
using (SqlCommand insertCmd = new SqlCommand("EmployeeInsert", conn))
{
insertCmd.CommandType = CommandType.StoredProcedure;
insertCmd.Transaction = transaction;
insertCmd.Parameters["@Name"].Value = this.Name;
insertCmd.ExecuteNonQuery();
this.id = (int)insertCmd.Parameters["@Id"].Value;
}
}
}
As the data inserts, I see that there is a NPE at:
this.id = (int)insertCmd.Parameters["@Id"].Value;
Not sure I i am missing something or not. here is the exception that i see.
System.AggregateException was unhandled
Message=AggregateException_ctor_DefaultMessage
Source=System.Threading
StackTrace:
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at System.Threading.Tasks.Parallel.PartitionerForEachWorker[TSource,TLocal](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 simpleBody, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEach[TSource](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 body)
:
:
:
at System.Threading.ExecutionContext.runTryCode(Object userData)
at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart(Object obj)
InnerException: System.NullReferenceException
Message=Object reference not set to an instance of an object.
Source=Jobvite.Library
StackTrace:
:
:
:
at System.Threading.Tasks.Parallel.<>c__DisplayClass32`2.<PartitionerForEachWorker>b__30()
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass3.<ExecuteSelfReplicating>b__2(Object )
InnerException:
Upvotes: 1
Views: 1142
Reputation: 1248
System.AggregateException
is raised for a possibility of multiple Exceptions being raised from the application.
Reason
You are accessing the Connection
object in parallel mode. Multiple tasks are trying to access it at the sametime and raising exception when they couldn't get hold of it. Only one thread can access the db connection at a moment.
Creating multiple threads to insert data into DB will not speed things up anyway. (Even if you manage to find any parallel method) because the DB would locked for each write of data and all the data will be inserted sequentially.
Go with a normal insert process and it will be much faster.
Upvotes: 3
Reputation: 2540
(Once I figured out what "1 lac" is) It looks like you want to do a Bulk Insert. You can do that using SqlBulkCopy - it was designed to efficiently load a SQL table.
However, I see you also want the ids back so the above won't get you all the way. I see you are using a stored procedure so one way of doing it (assuming you have SQL 2008 and above):
Create a table-valued data type to contain the data you want to insert.
CREATE TYPE [dbo].[EmployeeDataType] As Table
(
ID INT,
-- employee details
)
Change your stored procedure to use this table valued parameter as input and when it performs the insert, it does and OUTPUT. e.g.
CREATE PROCEDURE [dbo].[EmployeeInsert]
(
@EmployeeInsertParameter As [dbo].[EmployeeDataType] READONLY
)
AS
...
INSERT INTO Employee
SELECT * FROM @EmployeeInsertParameter e
OUTPUT INSERTED.*
(Obviously you would name the columns and not use *)
Change your code to not use Parallel.ForEach
and instead do this:
DataTable employeeDataTable = new DataTable("EmployeeDataType");
// fill in the rows using
...
insertCmd.Parameters["@EmployeeInsertParameter"].Value = employeeDataTable;
...
Read the result of the stored procedure execution into List<Employee>
Conclusion: Basically don't use Parallel.For
for DB connections. This way will have you use one connection correctly (without resulting in "NPEs") and most of the processing will be done in memory and as long as you have the RAM, it will be orders of magnitude quicker.
Here is another example possible way, but is more involved: https://stackoverflow.com/a/21689413/3419825
Upvotes: 0