Pratik Bhoir
Pratik Bhoir

Reputation: 2144

Efficient Converting DataTable to List in C#

I want to serialize the DataTable to a List in C#. My code is taking around 10 minutes for 100K data to fetch and convert to a list. Previously tried with sequential way but it three times more time to process. So, I tried Parallel and it saved 1/3 of time. But, still, it's too slow.

If anyone can help to make it fast. Any help is appreciated.

public async Task<List<DataPointModel>> GetDetails(HashParameterModel parameterModel)
{
    List<DataPointModel> result = new List<DataPointModel>();
    try
    {
        var query = "USP_GetDetailsFromMetaData";
        using (var sqlConnection = new SqlConnection(connectionString))
        {
            using (var sqlCommand = new SqlCommand(query, sqlConnection))
            {
                sqlCommand.CommandType = CommandType.StoredProcedure;
                sqlCommand.Parameters.AddWithValue("@GroupID", parameterModel.GroupID);
                sqlCommand.CommandTimeout = 0;
                sqlConnection.Open();
                DataTable dataTable = new DataTable();
                DataSet dataSet = new DataSet();
                SqlDataAdapter da = new SqlDataAdapter
                {
                    SelectCommand = sqlCommand
                };
                da.Fill(dataSet);

                dataTable = dataSet.Tables[0];
                DataTable dataTable1 = dataSet.Tables[1];
                var questionList = dataTable1.AsEnumerable();
                if (dataTable.Rows.Count > 0)
                {
                    Parallel.ForEach(dataTable.AsEnumerable(), new ParallelOptions { MaxDegreeOfParallelism = 5 }, row =>
                    {
                        DataPointModel model = new DataPointModel();
                        model.ID = Convert.ToInt32(row["ID"]);
                        model.GroupID = Convert.ToInt32(row["GroupID"]);
                        model.ProviderID = Convert.ToInt32(row["SalvageProviderID"]);
                        model.ClaimNumber = row["ClaimNumber"].ToString();
                        model.PolicyNumber = row["PolicyNumber"].ToString();
                        model.DataPoint1 = row["DataPoint1"].ToString();
                        model.DataPoint2 = row["DataPoint2"].ToString();
                        model.DataPoint3 = row["DataPoint3"].ToString();
                        model.DataPoint4 = row["DataPoint4"].ToString();
                        model.FirstName = row["FirstName"].ToString();
                        model.LastName = row["LastName"].ToString();
                        model.PrimaryDamage = row["PrimaryDamage"].ToString();
                        model.Type = row["TypeCode"].ToString();
                        model.LossDate = row["LossDate"].ToString();
                        model.QuestionList = (from p in questionList
                                              where p.Field<int>("ID") == model.ID
                                              select new QuestionResponseModel()
                                              {
                                                  QuestionID = p.Field<int>("QuestionID").ToString(),
                                                  Response = p.Field<string>("ResponseValue")
                                              }).ToList();
                        result.Add(model);
                    });
                }
            }
        }
    }
    catch (Exception ex)
    {
        throw ex;
    }

    return result;
}

The DataSet has two DataTable

DataTable dataTable = dataSet.Tables[0]; // Details
DataTable dataTable1 = dataSet.Tables[1]; // QUestionList

I think the time is consumed when its looping interlly for QuestionList, which may have around 120K rows. Any Suggestions

Upvotes: 0

Views: 550

Answers (2)

Kristian Fitzgerald
Kristian Fitzgerald

Reputation: 269

Are there duplicate model IDs in the Details table? If so this could help to avoid running the question list query multiple times:

                        model.QuestionList = getQuestions(model.ID);

Method:

    public Dictionary<int, List<QuestionResponseModel>> questionBuffer = new Dictionary<int, List<QuestionResponseModel>>();

    public List<QuestionResponseModel> getQuestions(int ID)
    {
        if (questionBuffer.ContainsKey(ID)) return questionBuffer[ID];

        List<QuestionResponseModel> questions =  (from p in questionList
                              where p.Field<int>("ID") == model.ID
                              select new QuestionResponseModel()
                              {
                                  QuestionID = p.Field<int>("QuestionID").ToString(),
                                  Response = p.Field<string>("ResponseValue")
                              }).ToList();
        questionBuffer.Add(ID, questions);
        return questions;
    }

Upvotes: 0

Rand Random
Rand Random

Reputation: 7440

An easy and quick to implement performance improvment would be to build a look up table from your questionList, and access this to fetch the question instead of doing this piece of code

model.QuestionList = (from p in questionList
                      where p.Field<int>("ID") == model.ID
                      select new QuestionResponseModel()
                      {
                          QuestionID = p.Field<int>("QuestionID").ToString(),
                          Response = p.Field<string>("ResponseValue")
                      }).ToList();

So add the following

var questionList = dataTable1.AsEnumerable();
//maybe add .AsParallel() - questionList.AsParallel().ToLookUp(...)
var questionLookUp = questionList.ToLookUp(x => x.Field<int>("ID"), x => new QuestionResponseModel() { QuestionID = x.Field<int>("QuestionID"), Response = p.Field<string>("ResponseValue") });

And than use it like this

model.QuestionList = questionLookUp[model.ID].ToList();

https://learn.microsoft.com/en-us/dotnet/api/system.linq.lookup-2

https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.tolookup

Code is untested, hope I didn't make to many errors.

Upvotes: 1

Related Questions