George Lica

Reputation: 1816

Create an in-memory readonly cache for data stored in SQL Server

I have a problem concerning application performance: I have many tables, each with millions of records. I am performing SELECT statements over them using joins, WHERE clauses and ORDER BY on different criteria (specified by the user at runtime). I want to get my records paged, but no matter what I do with my SQL statements I cannot reach the performance of getting my pages directly from memory. Basically the problem appears when I have to filter my records by criteria specified dynamically at runtime. I tried everything, such as using the ROW_NUMBER() function combined with a "WHERE RowNo BETWEEN" clause, CTEs, temp tables, etc. Those SQL solutions perform well only if I don't include filtering. Keep in mind also that I want my solution to be as generic as possible (imagine that I have in my app several lists that virtually present millions of paged records, and those records are constructed with very complex SQL statements).
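For reference, here is roughly the ROW_NUMBER() paging pattern I tried — a minimal sketch only; the Orders table, the Status filter and the CreatedOn sort column are placeholders, since in reality the filter and the ORDER BY are built dynamically from the user's criteria:

```csharp
// Minimal sketch of the ROW_NUMBER() paging pattern (works on SQL Server 2005+).
// dbo.Orders, the Status filter and the CreatedOn sort column are placeholders.
using System.Data.SqlClient;

public static class PagedQueryExample
{
    public static SqlCommand BuildPageCommand(SqlConnection connection, int pageIndex, int pageSize)
    {
        const string sql = @"
            WITH Numbered AS
            (
                SELECT o.*, ROW_NUMBER() OVER (ORDER BY o.CreatedOn DESC) AS RowNo
                FROM dbo.Orders o
                WHERE o.Status = @status        -- the runtime filter goes here
            )
            SELECT *
            FROM Numbered
            WHERE RowNo BETWEEN @from AND @to;";

        var command = new SqlCommand(sql, connection);
        command.Parameters.AddWithValue("@status", "Open");
        command.Parameters.AddWithValue("@from", pageIndex * pageSize + 1);
        command.Parameters.AddWithValue("@to", (pageIndex + 1) * pageSize);
        return command;
    }
}
```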

All my tables have a primary key of type INT.

So I came up with an idea: why not create a "server" only for select statements? The server first loads all records from all tables and stores them in HashSets, where each T has an Id property, GetHashCode() returns that Id, and Equals is implemented such that two records are "equal" only if their Ids are equal (don't scream, you will see later why I am not using all the record data for hashing and comparisons).
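For illustration, a minimal sketch of such a cached entity (Order is just a hypothetical example; any cached T looks the same):

```csharp
// Hypothetical cached entity: hashing and equality are based on the
// primary key only, so a row can be located in O(1) even after its
// other columns have changed.
public class Order
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }

    public override int GetHashCode()
    {
        return Id;
    }

    public override bool Equals(object obj)
    {
        var other = obj as Order;
        return other != null && other.Id == Id;
    }
}
```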

So far so good, but there's a problem: how can I sync my in-memory collections with the database records? The idea is that I must find a solution that loads only differential changes. So I invented a changelog table for each table that I want to cache. In this changelog I perform only inserts that mark dirty rows (updates or deletes) and also record newly inserted ids, all of it implemented using triggers. So whenever an in-memory select comes, I first check whether I must sync something (by querying the changelog). If something must be applied, I load the changelog, apply those changes in memory and finally clear that changelog (or maybe remember the highest changelog id that I've applied...).
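Here is a rough sketch of how the pending changelog entries could be read — the changelog schema (Id, RowId, ChangeType) and the table name Orders_ChangeLog are only examples of what I mean:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

public enum ChangeType { Insert = 0, Update = 1, Delete = 2 }

public class ChangeLogEntry
{
    public long Id { get; set; }       // identity of the changelog row itself
    public int RowId { get; set; }     // primary key of the affected record
    public ChangeType ChangeType { get; set; }
}

public static class ChangeLogReader
{
    // Loads every changelog row written after the last one we already applied.
    public static List<ChangeLogEntry> ReadPending(SqlConnection connection, long lastAppliedId)
    {
        var result = new List<ChangeLogEntry>();
        const string sql =
            "SELECT Id, RowId, ChangeType FROM dbo.Orders_ChangeLog WHERE Id > @lastApplied ORDER BY Id";

        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@lastApplied", lastAppliedId);
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    result.Add(new ChangeLogEntry
                    {
                        Id = reader.GetInt64(0),
                        RowId = reader.GetInt32(1),
                        ChangeType = (ChangeType)reader.GetInt32(2)
                    });
                }
            }
        }
        return result;
    }
}
```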

In order to apply the changelog in O(N), where N is the changelog size, I am using this algorithm (a sketch follows after the list):

For each log entry:

  1. identify the in-memory Dictionary<int, T> for the affected table, where the key is the primary key;
  2. if it's a delete log, call dictionary.Remove(id) (O(1));
  3. if it's an update log, call dictionary.Remove(id) as well (O(1)) and move the id into a "to be inserted" collection;
  4. if it's an insert log, move the id into the "to be inserted" collection.

Finally, refresh the cache by selecting all data from the corresponding table where Id is in "to be inserted".
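A sketch of the apply step, reusing the hypothetical Order and ChangeLogEntry types from above (the actual "WHERE Id IN (...)" reload is only stubbed out):

```csharp
using System.Collections.Generic;

public static class CacheSync
{
    // Applies pending changelog entries to the in-memory dictionary in O(N),
    // N being the number of changelog rows. Ids that need fresh data are
    // collected and reloaded from the database in a single IN (...) query.
    public static void Apply(Dictionary<int, Order> cache, IEnumerable<ChangeLogEntry> pending)
    {
        var toBeInserted = new HashSet<int>();

        foreach (var entry in pending)
        {
            switch (entry.ChangeType)
            {
                case ChangeType.Delete:
                    cache.Remove(entry.RowId);              // O(1)
                    break;
                case ChangeType.Update:
                    cache.Remove(entry.RowId);              // O(1)
                    toBeInserted.Add(entry.RowId);
                    break;
                case ChangeType.Insert:
                    toBeInserted.Add(entry.RowId);
                    break;
            }
        }

        // SELECT * FROM dbo.Orders WHERE Id IN (toBeInserted) and put the
        // resulting rows back into the cache.
        foreach (var order in LoadByIds(toBeInserted))
        {
            cache[order.Id] = order;
        }
    }

    // Placeholder for the actual "WHERE Id IN (...)" reload.
    static IEnumerable<Order> LoadByIds(IEnumerable<int> ids)
    {
        yield break;
    }
}
```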

For filtering, I am compiling expression trees into Func<T, List<FilterCriterias>, bool> functors. Using this mechanism I am performing way faster than SQL.
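To give an idea, here is a minimal sketch of that expression-tree compilation — simplified to equality comparisons and to a Func<T, bool> that closes over the criteria, rather than my actual Func<T, List<FilterCriterias>, bool> signature:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical runtime filter description: property name plus value to match.
public class FilterCriteria
{
    public string PropertyName { get; set; }
    public object Value { get; set; }
}

public static class FilterCompiler
{
    // Compiles a list of runtime criteria into a single Func<T, bool>.
    // Only equality comparisons are shown; other operators work the same way.
    public static Func<T, bool> Compile<T>(IList<FilterCriteria> criteria)
    {
        var parameter = Expression.Parameter(typeof(T), "item");
        Expression body = Expression.Constant(true);

        foreach (var criterion in criteria)
        {
            var property = Expression.Property(parameter, criterion.PropertyName);
            var constant = Expression.Constant(criterion.Value, property.Type);
            body = Expression.AndAlso(body, Expression.Equal(property, constant));
        }

        return Expression.Lambda<Func<T, bool>>(body, parameter).Compile();
    }
}
```

The cache can then be filtered with something like cache.Values.Where(compiledFilter) and paged entirely in memory.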

I know that SQL Server 2012 has caching support and the upcoming SQL Server version will support even more, but my client has SQL Server 2005, so... I can't benefit from this stuff.

My question: what do you think? Is this a bad idea? Is there a better approach?

Upvotes: 1

Views: 1204

Answers (1)

Gottfried Lesigang

Reputation: 67291

The developers of SQL Server did a very good job. I think it is nearly impossible to outsmart it.

Unless your data has some kind of implicit structure which might help to speed things up and which the optimizer cannot be aware of, such "I do my own speedy trick" approaches won't help - normally...

Performance problems should always be solved first where they occur:

  1. the table structures and relations
  2. indexes and statistics
  3. quality of SQL statements

Even many millions of rows are no problem if the design and the queries are good...

If your queries do a lot of computation, or you need to retrieve data out of tricky structures (nested lists with recursive reads, XML...), I'd go the data-warehouse path and write some denormalized tables for quick selects. Of course you will have to deal with the fact that you are reading "old" data. If your data does not change much, you could trigger all changes into the denormalized structure immediately. But this depends on your actual situation.

If you want, you could post one of your poorly performing queries together with the relevant structure details and ask for a review. There are dedicated sites on Stack Exchange, such as Code Review. If it's not too big, you might try it here as well...

Upvotes: 1
