Mathieu
Mathieu

Reputation: 4520

Is LINQ faster on a list or a table?

I have many queries to do and I was wondering if there is a significant performance difference between querying a List and a DataTable or even a SQL server indexed table? Or maybe would it be faster if I go with another type of collection?

In general, what do you think?

Thank you!

Upvotes: 1

Views: 1750

Answers (2)

KeithS
KeithS

Reputation: 71591

You might be confusing Linq with a database query language. I would suggest reading up on Linq, particularly IQueryable vs IEnumerable.

In short, Linq is an in-code query language, which can be pointed at nearly any collection of data to perform searches, projections, aggregates, etc in a similar fashion as SQL, but not limited to RDBMSes. It is not, on its face, a DB query language like SQL; it can merely be translated into one by use of an IQueryable provider, line Linq2SQL, Linq2Azure, Linq for Entities... the list goes on.

The IEnumerable side of Linq, which works on in-memory objects that are already in the heap, will almost certainly perform better than the IQueryable side, which exists to be translated into a native query language like SQL. However, that's not because of any inherent weakness or strength in either side of the language. It is instead a factor of (usually) having to send the translated IQueryable command over a network channel and get the results over same, which will perform much more slowly than your local computer's memory.

However, the "heavy lifting" of pulling records out of a data store and creating in-memory object representations has to be done at some time, and IQueryable Linq will almost certainly be faster than instantiating ALL records as in-memory objects, THEN using IEnumerable Linq (Linq 2 Objects) to filter to get your actual data.

To illustrate: You have a table MyTable; it contains a relatively modest 200 million rows. Using a Linq provider like Linq2SQL, your code might look like this:

//GetContext<>() is a method that will return the IQueryable provider
//used to produce MyTable entitiy objects

//pull all records for the past 5 days
var results = from t in Repository.GetContext<MyTable>()
              where t.SomeDate >= DateTime.Today.AddDays(-5)
              && t.SomeDate <= DateTime.Now
              select t;

This will be digested by the Linq2SQL IQueryable provider into a SQL string like this:

SELECT [each of MyTable's fields] FROM MyTable WHERE SomeDate Between @p1 and @p2; @p1 = '2/26/2011', @p2 = '3/3/2011 9:30:00'

This query can be easily digested by the SQL engine to return EXACTLY the information needed (say 500 rows).

Without a Linq provider, but wanting to use Linq, you may do something like this:

//GetAllMyTable() is a method that will execute and return the results of
//"Select * from MyTable"

//pull all records for the past 5 days
var results = from t in Repository.GetAllMyTable()
              where t.SomeDate >= DateTime.Today.AddDays(-5)
              && t.SomeDate <= DateTime.Now
              select t;

On the surface, the difference is subtle. Behind the scenes, the devil's in those details. This second query relies on a method that retrieves and instantiates an object for every record in the database. That means it has to pull all those records, and create a space in memory for them. That will give you a list of 200 MILLION records, which isn't so modest anymore now that each of those records was transmitted over the network and is now taking up residence in your page file. The first query MAY introduce some overhead in building and then digesting the expression tree into SQL, but it's MUCH preferred over dumping an entire table into an in-memory collection and iterating over it.

Upvotes: 4

Adam Rackis
Adam Rackis

Reputation: 83376

It should almost always be faster querying anything in memory, like a List<T> or a DataTable vis-a-vis a database.

Having said that, you have to get the data into an in-memory object like a List before it can be queried, so I certainly hope you're not thinking of dumping your DB into a List<T> for fast querying. That would be a very bad idea.

Am I getting the point of your question?

Upvotes: 5

Related Questions