Reputation: 191
What's a common/best practice for database design when it comes to improving performance on count(1)
queries? (I'm currently using SQLite)
I've normalized my data across multiple tables. For simple things on a single table with a good index, queries are acceptably quick for my purposes.
e.g.:
SELECT count(1) from actions where type='3' and area='5' and employee='2533';
But when I start getting into multiple table queries, things get too slow (> 1 second).
SELECT count(1)
from
(SELECT SID from actions
where type='3' and employee='2533'
INTERSECT
SELECT SID from transactions where currency='USD') x;
How should I cache my results? What is a good design? My natural reaction is to add a table solely for storing rows of cached results per employee?
Upvotes: 1
Views: 1655
Reputation: 107317
Edit
Design patterns like Command Query Responsibility Segregation
(CQRS) specifically aim to improve the read side
performance of data access, often in distributed systems and at enterprise scale.
Another pattern commonly associated with CQRS is "Event Sourcing", which stores, and then allows 'replay' of Commands, for various use cases.
The above may be overkill for your scenario, but a very simple implementation of caching at the application level could use a SQLite trigger.
Assuming that there are many more reads than writes to your actions or transactions tables, you could keep a cache table that is refreshed whenever the actions or transactions tables are updated. One cheap (and nasty) way would be to provide INSERT, UPDATE and DELETE triggers on the actions and transactions tables, which would then update the appropriate cache table(s).
In addition to a local relational database like SQLite, NoSQL databases like MongoDB, Cassandra and Redis are frequently used as alternatives for read-side caching in read-heavy environments (depending on the type and format of data that you need to cache). You would, however, need an alternative mechanism to synchronize data from your 'master' (e.g. SQLite) database to these cache read stores; triggers obviously won't cut it here.
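To make the trigger idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the action_counts cache table and the trigger names are all assumptions made up for illustration; only the column names come from the question. It caches the single-table count only, and a cache for the two-table query would need similar triggers on transactions as well.

```python
import sqlite3

# Hypothetical schema: only the column names come from the question.
SCHEMA = """
CREATE TABLE actions (
    sid INTEGER PRIMARY KEY, type TEXT, area TEXT, employee TEXT
);
-- Hypothetical cache table: one running count per (employee, type, area).
CREATE TABLE action_counts (
    employee TEXT, type TEXT, area TEXT, n INTEGER NOT NULL,
    PRIMARY KEY (employee, type, area)
);

CREATE TRIGGER actions_after_insert AFTER INSERT ON actions
BEGIN
    -- Create the counter row on first use, then increment it.
    INSERT INTO action_counts (employee, type, area, n)
    SELECT NEW.employee, NEW.type, NEW.area, 0
    WHERE NOT EXISTS (
        SELECT 1 FROM action_counts
        WHERE employee = NEW.employee AND type = NEW.type AND area = NEW.area
    );
    UPDATE action_counts SET n = n + 1
    WHERE employee = NEW.employee AND type = NEW.type AND area = NEW.area;
END;

CREATE TRIGGER actions_after_delete AFTER DELETE ON actions
BEGIN
    UPDATE action_counts SET n = n - 1
    WHERE employee = OLD.employee AND type = OLD.type AND area = OLD.area;
END;

CREATE TRIGGER actions_after_update AFTER UPDATE ON actions
BEGIN
    -- Move one unit from the old key to the new key.
    UPDATE action_counts SET n = n - 1
    WHERE employee = OLD.employee AND type = OLD.type AND area = OLD.area;
    INSERT INTO action_counts (employee, type, area, n)
    SELECT NEW.employee, NEW.type, NEW.area, 0
    WHERE NOT EXISTS (
        SELECT 1 FROM action_counts
        WHERE employee = NEW.employee AND type = NEW.type AND area = NEW.area
    );
    UPDATE action_counts SET n = n + 1
    WHERE employee = NEW.employee AND type = NEW.type AND area = NEW.area;
END;
"""


def live_count(conn, employee, type_, area):
    """Count directly from the actions table (the slow path)."""
    row = conn.execute(
        "SELECT count(1) FROM actions WHERE employee = ? AND type = ? AND area = ?",
        (employee, type_, area),
    ).fetchone()
    return row[0]


def cached_count(conn, employee, type_, area):
    """Read the trigger-maintained counter; 0 if no row exists yet."""
    row = conn.execute(
        "SELECT n FROM action_counts WHERE employee = ? AND type = ? AND area = ?",
        (employee, type_, area),
    ).fetchone()
    return row[0] if row else 0


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO actions (type, area, employee) VALUES (?, ?, ?)",
    [("3", "5", "2533"), ("3", "5", "2533"), ("3", "7", "2533"), ("1", "5", "2533")],
)
conn.execute("DELETE FROM actions WHERE sid = 1")
conn.execute("UPDATE actions SET area = '5' WHERE sid = 3")

live = live_count(conn, "2533", "3", "5")
cached = cached_count(conn, "2533", "3", "5")
print(live, cached)
```

The trade-off is that every write to actions now pays for the extra bookkeeping, which is why this only makes sense when reads heavily outnumber writes.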
Original Answer
If you are 100% sure that you are always repeating exactly the same query for the same customer, sure, persist the result.
However, in most other instances, an RDBMS usually handles caching just fine.
The INTERSECT with the query
SELECT SID from transactions where currency='USD'
could be problematic if there are a large number of transaction records in USD.
Possibly you could replace it with a join:
SELECT count(1) from
(
SELECT t.[SID]
from
transactions as t
inner join
(
SELECT SID from actions where type='3' and employee='2533'
) as a
on t.SID = a.SID
where t.currency= 'USD'
) as x;
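One caveat worth checking before swapping the queries: INTERSECT returns distinct SIDs, while a join returns one row per matching pair. Whether SID is unique in each table isn't stated in the question, so the sketch below (Python's sqlite3, with made-up sample rows and a deliberately duplicated SID) shows how the two counts can diverge and how count(DISTINCT ...) brings them back in line. If SID is unique in both tables, the plain join count already matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data; SID 2 is duplicated in both tables on purpose.
conn.executescript("""
CREATE TABLE actions (sid INTEGER, type TEXT, area TEXT, employee TEXT);
CREATE TABLE transactions (sid INTEGER, currency TEXT);
INSERT INTO actions VALUES
    (1, '3', '5', '2533'), (2, '3', '5', '2533'),
    (2, '3', '7', '2533'), (3, '1', '5', '2533');
INSERT INTO transactions VALUES
    (1, 'USD'), (2, 'USD'), (2, 'USD'), (3, 'USD'), (4, 'EUR');
""")

# The original query from the question: counts distinct SIDs.
intersect_count = conn.execute("""
    SELECT count(1) FROM (
        SELECT sid FROM actions WHERE type = '3' AND employee = '2533'
        INTERSECT
        SELECT sid FROM transactions WHERE currency = 'USD'
    ) x
""").fetchone()[0]

# The join rewrite: counts every matching (transaction, action) pair.
join_count = conn.execute("""
    SELECT count(1)
    FROM transactions AS t
    INNER JOIN actions AS a ON t.sid = a.sid
    WHERE t.currency = 'USD' AND a.type = '3' AND a.employee = '2533'
""").fetchone()[0]

# Same join, but counting distinct SIDs to match INTERSECT semantics.
join_distinct_count = conn.execute("""
    SELECT count(DISTINCT t.sid)
    FROM transactions AS t
    INNER JOIN actions AS a ON t.sid = a.sid
    WHERE t.currency = 'USD' AND a.type = '3' AND a.employee = '2533'
""").fetchone()[0]

print(intersect_count, join_count, join_distinct_count)
```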
You might also just check your indexes, however.
For the actions table, an index on Actions(Employee, Type) or Actions(Employee, Type, Area) would make sense (assuming Employee has the highest selectivity, and depending on the selectivity of Type and Area).
You can also compare this to an index on Actions(Employee, Type, Area, SID) as a covering index for your second query.
And for the join above, you need an index on Transactions(SID, Currency).
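A quick way to see whether indexes like these are actually picked up is EXPLAIN QUERY PLAN. The sketch below uses Python's sqlite3 with a hypothetical schema and made-up index names; the exact wording of the plan output varies between SQLite versions, so it only looks for the index name in the plan text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and index names, based on the columns in the question.
conn.executescript("""
CREATE TABLE actions (sid INTEGER, type TEXT, area TEXT, employee TEXT);
CREATE TABLE transactions (sid INTEGER, currency TEXT);
CREATE INDEX ix_actions_emp_type_area_sid
    ON actions (employee, type, area, sid);
CREATE INDEX ix_transactions_sid_currency
    ON transactions (sid, currency);
""")


def query_plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


single_table_plan = query_plan(
    "SELECT count(1) FROM actions "
    "WHERE type = '3' AND area = '5' AND employee = '2533'"
)

join_plan = query_plan(
    "SELECT count(1) FROM transactions AS t "
    "INNER JOIN actions AS a ON t.sid = a.sid "
    "WHERE t.currency = 'USD' AND a.type = '3' AND a.employee = '2533'"
)

for line in single_table_plan + join_plan:
    print(line)
```

If a plan line shows a full table scan instead of naming one of the indexes, that is the place to look before reaching for a cache.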
Upvotes: 1