Reputation: 165
I have the following scenario:
And this my problem: I need to make sure the reports do not contain partial data:
What is the best mechanism to get this? Database locks do not seem to be intelligent enough to manage the priorities I need.
Should I implement my own locking mechanism? If yes, is there a well-known design for it? Many things must be taken into account: keep track of currently running reports (lock-for-reads), implement lock expiration for cases when the ETL fails notifying it finished, etc.
Upvotes: 2
Views: 1315
Reputation: 245
You could make the etl process set a flag in a record in a table when it starts and unset it when it completes? The Reporting system can then query this table to see if the flag is set and notify the user that the warehouse is running and either put the report in a queue that is processed on completion of the ETL process or tell the user to try later. If you want to allow users to use facts that have been processed, your flag table would need to be at the table grain - i.e. on flag record per table.
Upvotes: 0
Reputation: 6819
If you are using Cognos, then I think you're basically out of luck from a "prevent report from running if ETL is running" kind of set up, unless you want to muck about in badly-documented APIs.
Your best bet is to probably identify the specific reports, usually ones that are run against aggregates, and make sure that you set up your ETL process to update the facts and aggregates last, and as one big "update" transaction. If you use a dbms that gives you read-consistency, you should be able to do this without a report showing up with only half the data loaded to it.
Reports that access multiple facts / multiple aggs will be more troublesome. It may even be that you may have to set up some kind of "table swap" where you build what you need then alter table rename
to swap out the tables.
Upvotes: 1