Reputation: 41
A database has a transactions
table with columns: account_id
, date
, transaction_value
(signed integer). Another table (account_value
) stores the current total value of each account, which is the sum of all transaction_value
s per account. It is updated with a trigger on the transactions
table (i.e., INSERTs, UPDATEs and DELETEs to transactions
fire the trigger to change the account_value
.)
A new requirement is to calculate the account's total transaction value only over the last 365 days. Only the current running total is required, not previous totals. This value will be requested often, almost as often as the account_value
.
How would you implement this "sliding window sum" efficiently? A new table is ok. Is there a way to avoid summing over a year's range every time?
Upvotes: 4
Views: 4719
Reputation:
This can be done with standard windowing functions:
SELECT account_id,
sum(transaction_value) over (partition by account_id order by date)
FROM transactions
The order by
inside the over()
claues makes the sum a "sliding sum".
For the "only the last 356 days" you'd need a second query that will limit the rows in the WHERE clause.
The above works in PostgreSQL, Oracle, DB2 and (I think) Teradata. SQL Server does not support the order by in the window definition (the upcoming Denali version will AFAIK)
Upvotes: 2
Reputation: 86765
I'll avoid any actual SQL here as it varies a lot depending on the variety of SQL that you are using.
You say that you have a trigger to maintain the existing running total.
I presume that it also (or perhaps a nightly process) creates new daily records in the account_value
table. Then INSERTs, UPDATEs and DELETEs fire the trigger to add or subtract from the existing running total?
The only changes you need to make are:
- add a new field, "yearly_value" or something
- have the existing trigger update that in the same way as the existing field
- use gbn's type of answer to create today's records (or however far you backdate)
- but initialise each new daily record in a slightly different way...
When you insert a new row for a new day, it should be initialised to yesterday's value
- the value 365 days ago
. After that, the behavior should be identical to what you're already used to.
Upvotes: 0
Reputation:
You are going to need a one-off script to populate the existing table with values for the preceding year for each existing record - that will need to run for the whole of the previous year for each record generated.
Once the rolling year column is populated, one alternative to summing the previous year would be to derive each new record's value as the previous record's rolling year value, plus the transaction value(s) since the last update, minus the transaction values between one year prior to the last update and one year ago from now.
I suggest trying both approaches against realistic test data to see which will perform better - I would expect summing the whole year to perform at least as well where data is relatively sparse, while the difference method may work better if data is to be frequently updated on each account.
Upvotes: 0
Reputation: 6246
If queries of the transactions table are more frequent than inserts to the transactions table, then perhaps a view is the way to go?
Upvotes: 1
Reputation: 432431
As simple as this?
SELECT
SUM(transaction_value), account_id
FROM
transactions t
WHERE
-- SQL Server, Sybase t.DATE >= DATEADD(year, -1, GETDATE())
-- MySQL t.DATE >= DATE_SUB(NOW(), INTERVAL 12 MONTH)
GROUP BY
account_id;
You may want to remove the time component from the date expressions using DATE (MySQL) or this way in SQL Server
Upvotes: 1