Reputation: 38
Our situation is that we are going to start to build a data warehouse. The data warehouse is going to take some time, if we are going to do it right. It will be built looking at individual processes and growing from there.
We only have three databases that we will be pulling data from. All three databases hold distinct information (financial info, scheduling and patient information - visits, diagnosis,etc).
I am thinking of using a dashboard/reporting tool like (as an example) http://www.jedox.com/en/, or http://www.board.com/us/ to display the information to the business. It will slowly start incoperating the DW as it is beind designed and pushed to production.
My question after all this is: What is the best way to present the data to the application (dashboard/reporter) in the backend that would be efficient, yet not time consuming where I'd rather build the Data Warehouse? Ie. views, materialized views, small seperate DB containing subset data from the main DB's, etc?
Upvotes: 1
Views: 269
Reputation: 16240
This may not be answering your question directly, but rather than find a temporary solution I would just build your warehouse faster.
First, if you can build it quickly then you don't need a temporary one; if you can't build it quickly then you won't be able to build a temporary solution quickly either. You even mentioned developing a "small separate DB containing subset data"; that's exactly what a reporting database is!
Second, any temporary solution will have to be maintained and supported too: if it's too useful then your temporary solution will become your permanent one anyway. That might actually be a good thing because if the 'temporary' solution meets your requirements then why not keep it?
Anyway, I would start by identifying one or two key reports that have high value for your users and commit to delivering them in 2 months (1 month would be even better). Develop the most basic, minimal database and ETL/reporting processes possible to deliver those reports, even if it seems like a horrible, hacked-together mess. Make sure the reports are internal ones that no one will send to an outside customer; that means you can avoid spending time on making them pretty.
After you've delivered those reports, you can now step back and look at what you did. Hopefully you will find yourself in a position where:
If #1 and #2 are true then you'll have delivered a lot of business value quickly while also setting the user expectation that correct is often more valuable than pretty (that's really helpful on a reporting project). If #3 and #4 are true then your second iteration will be a big improvement on the first one and even if you find yourself in the worst case scenario of having to re-develop the whole thing from scratch, you'll do it faster and better because you've learned so much.
This is simply agile development, of course: there's no reason you can't use rapid prototyping and incremental delivery in a data warehouse project. Like any IT solution the warehouse will continuously grow and be maintained over time so there's absolutely no reason to try to get everything complete and correct in the first version. It's highly likely that your users don't even really know what they want (in detail) so this approach helps to clarify their expectations and requirements more quickly too.
Upvotes: 2