SchwartzE
SchwartzE

Reputation: 2628

Parallel.ForEach to run function on list of items

I have a list of employees populated from a long string of IDs. I would like to add them the database in parallel to speed up processing. What is the correct syntax in vb.net to use parallel.foreach to call a function on each item in a list. Below is the serial way of doing it.

    Dim employees As New List(Of Employee)()
    For Each element As String In ids
        Dim emp As New Employee(element)
        employees.Add(emp)
    Next

    For Each emp In employees
        emp.AddToDatabase()
    Next

Upvotes: 0

Views: 1836

Answers (2)

Andrew Morton
Andrew Morton

Reputation: 25066

This looks like an XY problem where you have decided on a solution already.

I suggest that you would be better off taking a step back and finding out how to add all the data in one query, rather than firing off lots of queries as fast as possible. Generally, one query will be faster than many queries.

You state in a comment that "This is a simple example for a future more complicated problem," so I will give an example of adding data to a table which has columns for only filenames and keywords, with the expectation that you can expand it to your future use.

First, the definition of the table:

CREATE TABLE [dbo].[Keywords](
    [Filename] [nvarchar](256) NOT NULL,
    [Keyword] [nvarchar](64) NOT NULL
) ON [PRIMARY]

Then a user-defined table-type in the database:

CREATE TYPE [dbo].[filename_keyword_tbltype] AS TABLE(
    [Filename] [nvarchar](256) NOT NULL,
    [Keyword] [nvarchar](64) NOT NULL,
    PRIMARY KEY CLUSTERED -- may not apply to you
(
    [Keyword] ASC -- may not apply to you
)WITH (IGNORE_DUP_KEY = OFF) -- may not apply to you
)

It is necessary to set permissions on the user-defined table-type:

GRANT EXECUTE ON TYPE::[filename_tbltype] TO [APPROPRIATE\USER_GOES_HERE]

Now the database is set up to create a stored procedure to handle the data which will be sent to it:

CREATE PROCEDURE [dbo].[AddKeywords]
    @Data [dbo].[filename_keyword_tbltype] ReadOnly
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO Keywords
    SELECT [Filename],[Keyword]
    FROM @Data;

END

And finally code can be written to send all the data in one go:

Imports System.Data.SqlClient
Imports System.Data.SqlTypes
' ...
sqlcmd = New SqlCommand("AddKeywords", sqlConn)
sqlcmd.CommandType = CommandType.StoredProcedure

Dim tvpData As New List(Of SqlDataRecord)
Dim tvpDataType(1) As SqlMetaData
tvpDataType(0) = New SqlMetaData("Filename", SqlDbType.NVarChar, 256)
tvpDataType(1) = New SqlMetaData("Keyword", SqlDbType.NVarChar, 64)

' metadata.Keywords is a List(Of String) with the keywords for a file
For Each kwd In metadata.Keywords
    Dim rec = New SqlDataRecord(tvpDataType)
    ' smallName is simply the filename
    rec.SetSqlString(0, smallName)
    rec.SetSqlString(1, kwd.ToLower(CultureInfo.InvariantCulture))
    tvpData.Add(rec)
Next

sqlParam = New SqlParameter("@Data", SqlDbType.Structured)
sqlParam.TypeName = "filename_keyword_tbltype"
sqlParam.Value = tvpData

sqlcmd.Parameters.Add(sqlParam)

sqlConn.Open()
sqlcmd.ExecuteNonQuery()
sqlConn.Close()

Note that all the data is sent in one go, be it one record or a thousand.

Reference: Arrays and Lists in SQL Server 2008 - Using Table-Valued Parameters

Upvotes: 0

Guffa
Guffa

Reputation: 700840

The syntax would be:

Parallel.ForEach(employees, Sub(emp) emp.AddToDatabase())

However, it's doubtful that doing database inserts in parallel will actually be faster. The only possible speedup is in the overhead until the actual insert, in the end the database can only do one insert at a time to the table.

One thing that you could try to speed up the inserts would be to put several inserts in one query, i.e. group the employees into small batches.

Upvotes: 4

Related Questions