Reputation: 3171
I have a very general question regarding the use of LINQ vs. SQL to filter a collection. Let's say you are running a fairly complex filter on a database table, and that it runs, say, 10,000 times, with potentially different filters each time. Performance-wise, are you better off loading the entire table into memory and executing the filters with LINQ, or should you let the database handle the filtering with SQL (since that's what it was built to do)? Any thoughts?
EDIT: I should have been more clear. Let's assume we're talking about a table with 1,000 records and 20 columns (containing int/string/date data). Currently in my app I run one query every half hour to pull all of the data into a collection (saving that collection in the application cache), and I filter that cached collection throughout my app. I'm wondering if that is worse than doing tons of round trips to the database server (it's Oracle, fwiw).
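For concreteness, the pattern described above can be sketched roughly like this (the names `Record`, `LoadAllRecordsFromOracle`, and the cache key are illustrative, not from the actual app):

```csharp
// Pull the whole table once per half hour into the application cache,
// then run every filter in memory with LINQ to Objects.
var records = HttpContext.Current.Cache["AllRecords"] as List<Record>;
if (records == null)
{
    records = LoadAllRecordsFromOracle();   // one round trip, ~1,000 rows
    HttpContext.Current.Cache.Insert(
        "AllRecords", records, null,
        DateTime.UtcNow.AddMinutes(30),     // absolute expiry: refresh every half hour
        System.Web.Caching.Cache.NoSlidingExpiration);
}

// Any number of different filters can now run without touching Oracle.
var results = records.Where(r => r.Country == "US" && r.Price < 100m).ToList();
```

The trade-off being asked about is exactly this: 10,000 in-memory `Where` calls against a stale-by-up-to-30-minutes cache, versus 10,000 round trips to the database.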
Upvotes: 5
Views: 4036
Reputation: 176906
EDIT
It always depends on the amount of data you have. If you have a large amount of data, go with SQL; if it's small, LINQ is fine. It also depends on how frequently you query the data from SQL Server: if it's queried very frequently, it's better to load it into memory once and then apply LINQ filters, but if not, SQL is better.
First Answer
It's better to filter on the SQL side rather than loading everything into memory and then applying a LINQ filter.
The reasons to prefer SQL over LINQ are:
If you go with LINQ and retrieve 10,000 records, they are all loaded into memory, and network traffic increases as well.
If you go with SQL, the number of records returned decreases, so less memory is used and network traffic also decreases.
Upvotes: 2
Reputation: 66641
I would say it is far better to let SQL do the complex filtering and the rest of the processing. But why, you may ask?
The main reason is that SQL Server has the index information for the indexes you have defined, and it uses those indexes to access the data very fast. If you load the data into memory and filter with LINQ, you don't have that index information for fast access, so you lose time reaching the data. You also lose time compiling the LINQ expression every time.
You can run a simple test to see the difference yourself. What test? Create a simple table with a few hundred random strings, and index that string field. Then search on the string field, once using LINQ over an in-memory collection and once by querying SQL directly.
My first thought was that SQL keeps the index and therefore gives very quick access to the data your query searches for.
Then I considered that LINQ can also translate this filter to SQL and then fetch the data, after which you perform your actions, etc.
Now I think the actual answer depends on what you do. Running the SQL directly is faster, but how much faster depends on how you have actually set up your LINQ.
If you load everything into memory and then use LINQ, you lose the speed of SQL's indexes, you lose memory, and you spend a lot of work moving your data from SQL into memory.
If you fetch the data with LINQ and no further searches need to be made, you still pay for moving all of that data into memory, and you lose the memory it occupies.
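The distinction above hinges on where the filter executes. With an `IQueryable` provider (LINQ to SQL or Entity Framework), the `Where` clause is translated to SQL and runs on the server, so indexes apply; materializing first with `ToList()` pulls every row over the network and filters in memory. A minimal sketch, assuming a hypothetical `db.Products` set:

```csharp
// Filter translated to SQL: "... WHERE Price < 10" runs in the database,
// so the index on Price can be used and only matching rows cross the network.
var cheap = db.Products.Where(p => p.Price < 10m).ToList();

// Filter in memory: ToList() first fetches EVERY row, then LINQ to Objects
// scans them one by one; the database index is never consulted.
var cheapSlow = db.Products.ToList().Where(p => p.Price < 10m).ToList();
```

Both lines produce the same results; only the second pays for transferring and scanning the whole table.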
Upvotes: 0
Reputation: 273274
After the update:
It's running, say 10,000 times and
I'm going to assume a table with 1000 records
It seems reasonable to assume the 1k records will fit easily in memory.
And then running 10k filters will be much cheaper in memory (LINQ).
Using SQL for every filter would mean processing on the order of 10M records in total (10k queries over 1k rows each), a lot of I/O.
Upvotes: 2
Reputation: 7438
It depends on the amount of data you are filtering on.
You say the filter runs 10K times and can be different every time; in that case, if you don't have much data in the database, you can load it into a server-side variable.
If you have hundreds of thousands of records in the database, you should not do this; instead, you can create indexes on the database and pre-compiled procedures to fetch the data faster.
You can implement a cache facade in between, which stores the data on the server side on the first request and updates it as per your requirements. (You could write the cache to fill the variable only if the data is under a record-count limit.)
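Such a cache facade might look roughly like this (a minimal sketch; `Record`, `LoadFromDatabase`, `QueryDatabase`, and the row limit are all hypothetical placeholders for your own data access code):

```csharp
// Serve from memory when the table is small enough to cache;
// otherwise fall through to the database on every call.
public static class RecordCache
{
    private const int MaxCachedRows = 10000;          // only cache below this limit
    private static List<Record> _cache;
    private static DateTime _loadedAt;

    public static IEnumerable<Record> Get(Func<Record, bool> filter)
    {
        if (_cache == null || DateTime.UtcNow - _loadedAt > TimeSpan.FromMinutes(30))
        {
            var rows = LoadFromDatabase();            // your existing data access
            _cache = rows.Count <= MaxCachedRows ? rows : null;
            _loadedAt = DateTime.UtcNow;
        }
        return _cache != null
            ? _cache.Where(filter)                    // in-memory LINQ filter
            : QueryDatabase(filter);                  // push the filter to SQL
    }
}
```

Callers never need to know which path was taken, which makes it easy to change the limit after measuring.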
You can measure the time to get data from the database by running some test queries and observing the results. Likewise, you can observe the response time when the data is stored in memory on the server, then compare the two and decide accordingly.
There can be many other tricks, but the bottom line is:
You have to measure and decide.
Upvotes: -1
Reputation: 4702
This will probably cause some debate on the role of a database! I had this exact problem a little while back: some relatively complex filtering (things like "is in X country, where price is Y, and has the keyword Z") and it was horrifically slow. On top of this, I was not allowed to change the database structure because it was a third-party database.
I swapped out all of the logic so that the database just returned the results (which I cached every hour) and did the filtering in memory; when I did this, I saw massive performance increases.
Upvotes: 0
Reputation: 13655
Depends on how big your table is and what type of data it stores.
Personally, I'd go with returning all the data if you plan to use all your filters during the same request.
If it's an on-demand filter using AJAX, you could reload the data from the database every time (ensuring at the same time that your data is up to date).
Upvotes: 1