Reputation: 1153
I have a simple linq query that gets a slug from a database for a product.
var query = from url in urlTable
where url.ProductId == productId &&
url.EntityName == entityName &&
url.IsActive
orderby url.Id descending
select url.Slug
I am trying to optimize this, since it is run for every product and on a category page this is run x times the number of products.
I could do this (if i'm not mistaking), send in a list of products and do a new query.
var query = from url in urlTable
where productList.Contains(url.ProductId) &&
url.EntityName == entityName &&
url.IsActive
orderby url.Id descending
select url.Slug
But I have read somewhere that the performance of Contains is bad. Is there any other way to do this? What is the best method performance wise?
Upvotes: 1
Views: 2145
Reputation: 1700
var query = from productId in productList
join url in urlTable on productId equals url.ProductId
where url.IsActive
orderby url.Id descending
select url.Slug;
I believe this query would have a better performance.
Upvotes: -1
Reputation: 22945
The second option is better. I would add the product-id
to the result so you can differentiate between products.
var query = from url in urlTable
where productList.Contains(url.ProductId) &&
url.IsActive
orderby url.Id descending
select new { ProductId, Slug }
Please note that your list of product-id's is converted to sql-parameters IN (@p1, @p2, @p3)
and there is a maximum amount of sql-parameters per sql-query. I think limit is somewhere around 2000 parameter. So if you are quering for more than 2000 products, this solution will not work.
Upvotes: 1
Reputation: 63732
But I have read somewhere that the performance of Contains is bad.
I believe you're mixing this up with string.Contains
, which indeed is a bad idea on large data sets, because it simply can't use any index at all.
In any case, why are you guessing on performance? You should profile and see what's better for yourself. Also, look at the SQL produced by each of the queries and look at their respective query plans.
Now, with that out of the way, the second query is better, simply because it grabs as much as it can during one query, thus removing a lot of the overhead. It isn't too noticeable if you're only querying two or three times, but once you get into say a hundred, you're in trouble. Apart from being better in the client-server communication, it's also better on the server, because it can use the index very effectively, rather than looking up X items one after another. Note that that's probably negligible for primary keys, which usually don't have a logarithmic access time.
Upvotes: 2