Brucelin Michael
Brucelin Michael

Reputation: 513

DataTable.AsEnumerable().Where() vs DataTable.Select()

We've been querying data in the DataTable in Linq mode, and we've never had any performance issues. Until the following case, it takes 600s+ with Linq, but it takes only 3s with DataTable.Select(). The results are consistent. Intuition tells me that Linq shouldn’t be so slow, and there is something wrong with my operation, but I don’t know how to improve it, can anyone give me some advice?

GetDtTest_Base() and GetDtTest_Info() are DataTable and DataSet returned from Sql Server through SqlDataAdapter in ADO.NET.

DataTable dtBase = GetDtTest_Base();  //4W rows
DataSet dsInfo = GetDtTest_Info();    //two Datatable  4W rows, 2K rows

//Normally, we use Linq all the time. In this scenario, it takes about 600 seconds to query through Linq
foreach (DataRow item in dtBase.Rows)
{
    string pnum = item["pnum"].ToString();
    string number = item["number"].ToString();

    var query_Info1 = dsInfo.Tables[0].AsEnumerable()
        .Where(w => w.Field<string>("pnum") == pnum && w.Field<string>("calleee164") == number)
        .Select(s => s);
    item["conn"] = query_Info1.Count() > 0 ? (query_Info1.First())["conn"] : 0;
    item["total"] = query_Info1.Count() > 0 ? (query_Info1.First())["total"] : 0;

    var query_Info2 = dsInfo.Tables[1].AsEnumerable()
        .Where(w => w.Field<string>("pnum") == pnum && w.Field<string>("calleee164") == number)
        .Select(s => s);
    item["asCnt"] = query_Info2.Count() > 0 ? 1 : 0;
    item["asTrunks"] = query_Info2.Count() > 0 ? (query_Info2.First())["trunks"] : null;
}

//After changing this query, it only took 3 seconds to query
foreach (DataRow item in dtBase.Rows)
{
    string pnum = item["pnum"].ToString();
    string number = item["number"].ToString();

    DataRow[] query_Info1 = dsInfo.Tables[0].Select($"pnum='{pnum}' and calleee164='{number}'");
    if (query_Info1 != null && query_Info1.Length >= 1)
    {
        item["conn"] = query_Info1[0]["conn"].ToString();
        item["total"] = query_Info1[0]["total"].ToString();
    }
    else
    {
        item["conn"] = 0;
        item["total"] = 0;
    }

    DataRow[] query_Info2 = dsInfo.Tables[1].Select($"pnum='{pnum}' and calleee164='{number}'");
    if (query_Info2 != null && query_Info2.Length >= 1)
    {
        item["asCnt"] = 1;
        item["asTrunks"] = query_Info2[0]["trunks"].ToString();
    }
    else
    {
        item["asCnt"] = 0;
        item["asTrunks"] = null;
    }
}

Upvotes: 1

Views: 7511

Answers (2)

si618
si618

Reputation: 16848

A bit of a grave dig, but recently hit performance issues with DataTable.AsEnumerable()... in old code that was resolved by converting to DataTable.AsEnumerable().ToList()....

+1 for Fabio's answer, definitely a cleaner more efficient approach, but if you don't want to mess too much with legacy code and profiling shows a hotpath in your linq query, try converting to a list and see how you go.

Upvotes: 0

Fabio
Fabio

Reputation: 32445

Alternative approach is to create a structure designed for lookups.

Original approach will iterate dsInfo tables for every row in dtBase, which is O(n*m). With lookup data structure finding row from dsInfo tables will be one operation, which makes O(n).

var base = GetDtTest_Base();
var infoSet = GetDtTest_Info();

var firstLookup = info[0].AsEnumerable()
    .ToLookup(row => (Num: row.Field<string>("pnum"), Callee: row.Field<string>("calleee164")));
var secondLookup = info[1].AsEnumerable()
    .ToLookup(row => (Num: row.Field<string>("pnum"), Callee: row.Field<string>("calleee164")));

foreach (DataRow item in dtBase.Rows)
{
    var pnum = item["pnum"].ToString();
    var number = item["number"].ToString();
    var key = (Num: pnum, Callee: number);

    item["conn"] = firstLookup[key].Select(row => row.Field<int>("conn")).FirstOrDefault();
    item["total"] = firstLookup[key].Select(row => row.Field<int>("total")).FirstOrDefault();

    item["asCnt"] = secondLookup[key].Any() ? 1 : 0;
    item["asTrunks"] = secondLookup[key].Select(row => row.Field<string>("trunks")).FirstOrDefault();
}

Upvotes: 3

Related Questions