Cameron Castillo
Cameron Castillo

Reputation: 2832

Linq To SQL Performance on IndexOf

We are using LINQ to filter a SQL table to find records where data in a column are starting with any substring found in another table (a lookup table with a long list of prefixes).

We have tried StartsWith as per the solution in this link but it only works in EF.

But this post did give a correct solution for LinqToSQL.

Problem is that it is very slow on large tables. Our table with Surnames have a million records and the lookup table about 50 000.

Here is a sample of our current code. How can we increase the performance?

(All fields part of comparisons have already got indexes on)

var lookupList = LookupTable.Select(p => p.Prefix);
var q = from p in dc.Personnel
        where lookupList.Any(x => p.Surname.ToUpper().IndexOf(x) >= 0)
        select new 
        {
            Surname = p.Surname
        };

Upvotes: 0

Views: 245

Answers (1)

TomTom
TomTom

Reputation: 62093

Problem is that it is very slow on large tables

Problem is that whoever designed that did not know the basics of SQL.

IndexOf - or anything that does not start at the beginning of a string - kills sargeability. Period. Which means any index will not be used. Which means that basically you are making a table scan and on every table have to go through chars and compare them until... well. And ToUpper does or may not actually do anything - field comparison is determined on the SQL level by collation index, and unless that is "take upper/lower into account" you just have a non operation on your hand.

What you CAN do is:

  • Do those on a separate connection that uses a lower transaction guarantee (Readcommitted) so you do not leave locks on the table. Standard connection is serializable, TOTALLY unsuited for lookup tables.
    • you may want to hide the table and use an index that forwards ot the table using readcommmitted for this.
  • Or load them into memory ONCE then run that in parallel in memory, depends whether this is a server code.

At the end, looking IN strings is EXTREMELY work intensive and there is VERY little you can do and SQL is generally not optimized for this. This is neither an EF nor a Linq2SQL problem - pretty much as your driving license does not determine that your car is not an airplane. SQL generally does not handle this type of query decently.

Upvotes: 1

Related Questions