Josh Cooper
Josh Cooper

Reputation: 15

Efficiently make iterative LINQ to SQL queries

I am trying to implement the following algorithm using LINQ-to-SQL:

Given a list of strings L, return every row R in the DB for which every string in L is a substring of one of the column values in R.

The question is how do I do this iteratively for every string in L? I don't know how I can slickly put it all into one Linq-To-SQL statement. Note that I have no problem writing code along the lines of:

field1.contains(...) || field2.contains(...) || ...

as there are not that many columns.

E.g., if the input is

["Charlie", "Doctor", "Kor"]

we would output all rows that have a field with "Charlie" as a substring, a field with "Doctor" as a substring, and a field with "Kor" as a substring.

One approach I thought of was to make separate SQL queries for each input value and to take the intersection of all of those.

Another approach is to pick just one of the strings from the input, make a SQL query on that, convert it to a list, and filter out the rest of the strings one at a time using just LINQ in C#.

Any thoughts on an optimal way to do this?

Upvotes: 1

Views: 150

Answers (2)

Ivan Stoev
Ivan Stoev

Reputation: 205629

I would try All extension method (EF6 supports it, not sure about LINQ to SQL):

List<string> values = new List<string> { "Charlie", "Doctor", "Kor" };
var query = db.Table
    .Where(r => values.All(v => r.Field1.Contains(v) || r.Field2.Contains(v) || ...));

Update: Well, the assumption was wrong - as mentioned in the comments, unfortunately LINQ to SQL does not support the above construct (shame on them).

As usual in such cases, I would build dynamically a corresponding predicate expression.

In this particular case we need something like this (for N fields and M values):

r => (r.Field1.Contains(value1) || r.Field2.Contains(value1) ... || r.FieldN.Contains(value1))
  && (r.Field1.Contains(value2) || r.Field2.Contains(value2) ... || r.FieldN.Contains(value2))
  ...
  && (r.Field1.Contains(valueM) || r.Field2.Contains(valueM) ... || r.FieldN.Contains(valueM));

And here is a custom extension method which does that:

public static class QueryableExtensions
{
    public static IQueryable<T> WhereContainsAll<T>(
        this IQueryable<T> source, 
        IEnumerable<string> values,
        params Expression<Func<T, string>>[] members)
    {
        var parameter = Expression.Parameter(typeof(T), "r");
        var body = values
            .Select(value => members
                .Select(member => (Expression)Expression.Call(
                    Expression.MakeMemberAccess(parameter, ((MemberExpression)member.Body).Member),
                    "Contains", Type.EmptyTypes, Expression.Constant(value)))
                .Aggregate(Expression.OrElse))
            .Aggregate(Expression.AndAlso);
        var predicate = Expression.Lambda<Func<T, bool>>(body, parameter);
        return source.Where(predicate);
    }
}

and the sample usage would be

List<string> values = new List<string> { "Charlie", "Doctor", "Kor" };
var query = db.Table.WhereContainsAll(values,
    r => r.Field1, r => r.Field2, r => r.Field3, ...);

which should lead to a single SQL query which IMO should be optimal because the heavy work will be done by the database engine. Of course the query most likely will cause full table scan, but the same will happen even with single Contains (SQL LIKE) criteria.

Upvotes: 2

Tomas Chabada
Tomas Chabada

Reputation: 3019

Try this (I made an example using Lists):

var dbValues = new List<string> {"hello", "how", "are", "you"};
var substrings = new List<string> {"ello", "re"};

var result = dbValues.Where(i => substrings.Any(l => i.Contains(l))).ToList();

Result will contain {"hello","are"}

Example with database:

using (var db = new MyDatabase())
{
    var substrings = new List<string> { "ello", "re" };
    var result = db.MyTable.Where(i => substrings.Any(l => i.Value.Contains(l))).ToList();
}

Upvotes: -1

Related Questions