Matt Kocaj
Matt Kocaj

Reputation: 11535

count VS select in LINQ - which is faster?

I'm using IQueryable<T> interfaces throughout my application and defer execution of SQL on the DB until methods like .ToList()

I will need to find the Count of certain lists sometimes -without- needing to use the data in the list being counted. I know from my SQL experience that a SQL COUNT() is far less work for the DB than the equivalent SELECT statement that returns all the rows.

So my question is: will it be less work on the DB to return the count from the IQueryable<T>'s Count() method than rendering the IQueryable<T> to a list and invoking the list's Count() method?

I suspect it will given that the ToList() will fire the SELECT sql and then in a separate query count the rows. I'm hoping the Count() on the IQueryable<T> simply renders out the sql for a sql COUNT() query instead. But im not certain. Do you know?

Upvotes: 11

Views: 5312

Answers (3)

Slaggg
Slaggg

Reputation: 6481

If you're using SQL Server, Count() is still very expensive because it causes a table scan (or index scan, see comments on primary answer). And, by default Linq doesn't use the read uncomitted isolation level, which makes things worse due to locking.

If you can live with the result being a dirty result and an approximation of the total number of rows, the following code will be considerably faster than using Count(). In my experience, the value returned by this code is rarely different than the true count of rows.

/// <summary>A very fast method for counting rows in a table.</summary>
public static long FastRowCount(DataContext context, string tableName)
{
    const string template = "SELECT rowcnt FROM sys.sysindexes WHERE id = OBJECT_ID('{0}') AND indid < 2";
    string query = string.Format(template, tableName);
    return context.ExecuteQuery<long>(query).Single();
}

Upvotes: -1

Jero
Jero

Reputation: 159

I'm not sure if it's a hard and fast rule, but linq method you add to an Iqueryable will be added into the linq expression tree - unless they are one of the methods that actually cause the tree to be evaluated (like ToList and Single etc). In the case of LinqToSql you'll know if it can't convert something into the SQL statement because you'll get a runtime exception stating that the method is not supported.

eg

var something = dbContext.SomeTable
  .Select(c => c.col1 == "foo")
  .Distinct()
  .ToList()
  .Count()

In the above, Select() and Distinct() are included in the sql query passed to the server because they are added to an Iqueryable. Count() is just acting on the list that was returned by the sql query. So you don't want to do it that way :-)

In your case, Count() will definitely be faster that Select() because the resulting sql statement will indeed incorporate the count so the server only needs to return a single number rather than a list of rows.

Upvotes: -1

Jon Skeet
Jon Skeet

Reputation: 1503829

Calling ToList() will return a genuine List<T> with all the data, which means fetching all the data. Not good.

Calling Count() should indeed render the SQL to do the count on the database side. Much better.

The simplest way to check this, however, is to enable logging in your data context (or whatever the equivalent is for your particular provider) and see what queries are actually being sent.

Upvotes: 21

Related Questions