noshitsherlock
noshitsherlock

Reputation: 1153

Linq query where in list, performance what is the best?

I have a simple linq query that gets a slug from a database for a product.

var query = from url in urlTable
            where url.ProductId == productId &&
            url.EntityName == entityName &&
            url.IsActive
            orderby url.Id descending
            select url.Slug

I am trying to optimize this, since it is run for every product and on a category page this is run x times the number of products.

I could do this (if i'm not mistaking), send in a list of products and do a new query.

var query = from url in urlTable
            where productList.Contains(url.ProductId) &&
            url.EntityName == entityName &&
            url.IsActive
            orderby url.Id descending
            select url.Slug

But I have read somewhere that the performance of Contains is bad. Is there any other way to do this? What is the best method performance wise?

Upvotes: 1

Views: 2145

Answers (3)

Shlomi Borovitz
Shlomi Borovitz

Reputation: 1700

var query = from productId in productList
            join url in urlTable on productId equals url.ProductId
            where url.IsActive
            orderby url.Id descending
            select url.Slug;

I believe this query would have a better performance.

Upvotes: -1

Maarten
Maarten

Reputation: 22945

The second option is better. I would add the product-id to the result so you can differentiate between products.

var query = from url in urlTable
            where productList.Contains(url.ProductId) &&
            url.IsActive
            orderby url.Id descending
            select new { ProductId, Slug }

Please note that your list of product-id's is converted to sql-parameters IN (@p1, @p2, @p3) and there is a maximum amount of sql-parameters per sql-query. I think limit is somewhere around 2000 parameter. So if you are quering for more than 2000 products, this solution will not work.

Upvotes: 1

Luaan
Luaan

Reputation: 63732

But I have read somewhere that the performance of Contains is bad.

I believe you're mixing this up with string.Contains, which indeed is a bad idea on large data sets, because it simply can't use any index at all.

In any case, why are you guessing on performance? You should profile and see what's better for yourself. Also, look at the SQL produced by each of the queries and look at their respective query plans.

Now, with that out of the way, the second query is better, simply because it grabs as much as it can during one query, thus removing a lot of the overhead. It isn't too noticeable if you're only querying two or three times, but once you get into say a hundred, you're in trouble. Apart from being better in the client-server communication, it's also better on the server, because it can use the index very effectively, rather than looking up X items one after another. Note that that's probably negligible for primary keys, which usually don't have a logarithmic access time.

Upvotes: 2

Related Questions