Reputation: 513
We've been querying data in the DataTable
in Linq mode, and we've never had any performance issues. Until the following case, it takes 600s+ with Linq, but it takes only 3s with DataTable.Select()
. The results are consistent. Intuition tells me that Linq shouldn’t be so slow, and there is something wrong with my operation, but I don’t know how to improve it, can anyone give me some advice?
GetDtTest_Base()
and GetDtTest_Info()
are DataTable
and DataSet
returned from Sql Server through SqlDataAdapter
in ADO.NET.
DataTable dtBase = GetDtTest_Base(); //4W rows
DataSet dsInfo = GetDtTest_Info(); //two Datatable 4W rows, 2K rows
//Normally, we use Linq all the time. In this scenario, it takes about 600 seconds to query through Linq
foreach (DataRow item in dtBase.Rows)
{
string pnum = item["pnum"].ToString();
string number = item["number"].ToString();
var query_Info1 = dsInfo.Tables[0].AsEnumerable()
.Where(w => w.Field<string>("pnum") == pnum && w.Field<string>("calleee164") == number)
.Select(s => s);
item["conn"] = query_Info1.Count() > 0 ? (query_Info1.First())["conn"] : 0;
item["total"] = query_Info1.Count() > 0 ? (query_Info1.First())["total"] : 0;
var query_Info2 = dsInfo.Tables[1].AsEnumerable()
.Where(w => w.Field<string>("pnum") == pnum && w.Field<string>("calleee164") == number)
.Select(s => s);
item["asCnt"] = query_Info2.Count() > 0 ? 1 : 0;
item["asTrunks"] = query_Info2.Count() > 0 ? (query_Info2.First())["trunks"] : null;
}
//After changing this query, it only took 3 seconds to query
foreach (DataRow item in dtBase.Rows)
{
string pnum = item["pnum"].ToString();
string number = item["number"].ToString();
DataRow[] query_Info1 = dsInfo.Tables[0].Select($"pnum='{pnum}' and calleee164='{number}'");
if (query_Info1 != null && query_Info1.Length >= 1)
{
item["conn"] = query_Info1[0]["conn"].ToString();
item["total"] = query_Info1[0]["total"].ToString();
}
else
{
item["conn"] = 0;
item["total"] = 0;
}
DataRow[] query_Info2 = dsInfo.Tables[1].Select($"pnum='{pnum}' and calleee164='{number}'");
if (query_Info2 != null && query_Info2.Length >= 1)
{
item["asCnt"] = 1;
item["asTrunks"] = query_Info2[0]["trunks"].ToString();
}
else
{
item["asCnt"] = 0;
item["asTrunks"] = null;
}
}
Upvotes: 1
Views: 7511
Reputation: 16848
A bit of a grave dig, but recently hit performance issues with DataTable.AsEnumerable()...
in old code that was resolved by converting to DataTable.AsEnumerable().ToList()...
.
+1 for Fabio's answer, definitely a cleaner more efficient approach, but if you don't want to mess too much with legacy code and profiling shows a hotpath in your linq query, try converting to a list and see how you go.
Upvotes: 0
Reputation: 32445
Alternative approach is to create a structure designed for lookups.
Original approach will iterate dsInfo
tables for every row in dtBase
, which is O(n*m). With lookup data structure finding row from dsInfo
tables will be one operation, which makes O(n).
var base = GetDtTest_Base();
var infoSet = GetDtTest_Info();
var firstLookup = info[0].AsEnumerable()
.ToLookup(row => (Num: row.Field<string>("pnum"), Callee: row.Field<string>("calleee164")));
var secondLookup = info[1].AsEnumerable()
.ToLookup(row => (Num: row.Field<string>("pnum"), Callee: row.Field<string>("calleee164")));
foreach (DataRow item in dtBase.Rows)
{
var pnum = item["pnum"].ToString();
var number = item["number"].ToString();
var key = (Num: pnum, Callee: number);
item["conn"] = firstLookup[key].Select(row => row.Field<int>("conn")).FirstOrDefault();
item["total"] = firstLookup[key].Select(row => row.Field<int>("total")).FirstOrDefault();
item["asCnt"] = secondLookup[key].Any() ? 1 : 0;
item["asTrunks"] = secondLookup[key].Select(row => row.Field<string>("trunks")).FirstOrDefault();
}
Upvotes: 3