Reputation: 1258
I have a table with a lot of rows (3 million) from which I need to query some rows at several points in my app. The way I found to do this is querying all the data the first time that any was needed and storing it in a static DataTable
with SqlAdapter.Fill()
for the rest of the app life.
That's fast, because then when I need something I use DataTable.Select("some query")
and the app processes the info just nice.
The problem is that this table takes about 800MB of RAM, and I have to run this app in PCs where it might be too much.
The other way I thought was to query the data I need each time. This takes little memory but has poor performance (a lot of queries to the database, which is at a network address and with 1000 queries you start to notice the ping and all that..).
Is there any intermediate point between performance and memory usage?
EDIT: What I'm retrieving are sales, which have a date, a product and a quantity. I query by product, and it isn't indexed that way. But anyways, making 1000 queries, even if the query took 0.05s, a 0.2s ping makes a total of 200 seconds...
Upvotes: 2
Views: 8458
Reputation: 45096
First talk to the dba about performance
If you are downloading the entire table you might actually be putting more load on the network and SQL than if you performed individual queries.
As a dba if I knew you were downloading an entire large table I would put an index on product immediately.
Why are you performing 1000s of queries?
If you are looking for sales when a product is created then a cache is problematic. You would not yet have sales data. The problem with a cache is stale data. If you know the data will not change - you either have it or not then you can eliminate the concern of stale data.
There is something between sequentially and simultaneously. You can pack multiple selects in a single request. What this does is make a single round trip and is more efficient.
select * from tableA where ....;
select * from tableB where ....;
With DataReader just call SqlDataReader.NextResult Method ()
using (SqlDataReader rdr = cmd.ExecuteReader())
{
while (rdr.Read())
{
}
rdr.NextResultSet();
while (rdr.Read())
{
}
}
Pretty sure you can do the same type of thing with multiple DataTables in a DataSet.
Another option is a LocalDB. It is targeted at developers but for what you are doing it would work just fine. DataTable speed without memory concerns. You can even put an index on ProductID. It will take a little longer to write to disc compared to memory but you are not using up memory.
Then there is the ever evil with (nolock). Know what you are doing and I am not going to go into all the possible evils but I can tell you that I use it a lot.
Upvotes: 4
Reputation: 14379
The question can be precipitated to Memory vs Performance. The answer to that is Caching.
If you know what your usage pattern would be like, then one thing you can do is to create a local cache in the app.
The extreme cases are - your cache size is 800MB with all your data in it (thereby sacrificing memory) - OR - your cache size is 0MB and all your queries go to network (thereby sacrificing performance).
Three important questions about the design of the cache are answered below.
How to populate the Cache?
Effectively the idea is that if you know some information may be needed in future, cache it beforehand.
How to free the Cache?
You can decide the freeing mechanism for cache either Actively or Passively.
Passively: Whenever cache is full you can evict the data from it.
Actively: Run a background thread at regular interval and it takes care of removal for you.
One hybrid method is to run a freeing thread as soon as you reach, let's say, 80% of your memory limit and then free whatever memory you can.
What data to remove from the Cache?
This has been answered already in context of the issue of Page Replacement Policies for Operating Systems.
For completion, I'll summarize the important ones here:
Upvotes: 3
Reputation: 186
RE: "I can't index by anything because I'm not the database admin nor can ask for that."
Can you prepopulate a temp table and index on that?, e.g.
Select * into #MyTempTable from BigHugeTable
Create Index Prodidx on #MyTempTable (product)
You will have to ensure you always reuse the same connection (and it isn't closed) in order to use the temp table.
Upvotes: 2