Benoittr
Benoittr

Reputation: 4341

Data Aggregation - Daily SQL Script vs Data Warehouse

Pardon me if this has already been asked (I know very little about Data Warehouse/BI and have yet to master the keywords).

I have a table that grow by more then 100 000 rows per day, each row having a timestamp and multiple information about an item (dimensions, weight,color,etc). Individual data can be useful for roughly a month after this period we are only interested in aggregations. I have a dedicated software that allow a more detailed visualisation of individual rows and mainly use PowerPivot for my reporting needs.

I could come up with an SQL query that would fill a new table daily: In which I would have a row for each hour/item/batch and I would summarize the information (sum/average/stddev/etc.)

Within a day my script would be up and running and I could use powerpivot against this new table. All this while staying where I'm comfortable: plain old SQL.

From the few information I gathered reading about DataWarehouse and BI, what I'm about to do sounds a lot like creating dimensions and facts. My question therefore: is it worthwhile to investigate further in that direction (BI) or since my problem is relatively simple I would do better staying in a relational database.

N.B. Reports that are being produced are usually linked against another database to produce more meaningful informations. Task that is very well accomplished by Powerpivot.

Upvotes: 6

Views: 974

Answers (3)

Neville Kuyt
Neville Kuyt

Reputation: 29639

Good news: it sounds like you already have a data warehouse. "Data warehouse" is a very generic term, with no real formal definition - it pretty much means whatever you want it to.

Commonly accepted characteristics are:

  • Data warehouses do not run on the operational databases
  • Data warehouses schemas are optimized for querying, not for "normal form" compliance
  • Data warehouses are populated by "Extract, Transform, Load" proceses (ETL).

It sounds like you're already doing all of that. If there are no business requirements to change, I'd leave it as it is. If your business users are asking to create their own queries, using different levels of aggregation, filtering, or granularit, a star schema may be the way to go.

Upvotes: 2

user359040
user359040

Reputation:

Datawarehouses are normally implemented in relational databases, so your existing skills will still be usable.

Given that you have expressed an interest in the dimension/fact table approach to datawarehousing, the canonical books on this approach are usually considered to be:

  • The Date Warehouse Toolkit (Kimball, Ross)
  • The Date Warehouse Lifecycle Toolkit (Kimball, Ross, Thornthwaite, Mundy, Becker)

(The former has more of a technical focus, while the latter approaches the subject from a wider lifecycle management viewpoint.)

Implementing DWHs can be time-consuming, so it may be worth continuing with your existing approach even if you decide to build a DWH.

Upvotes: 3

Stephen Senkomago Musoke
Stephen Senkomago Musoke

Reputation: 3523

The most effective solutions are those which are simple, adequate to meet existing needsand stay within available skillsets.

I agree that this approach works well for your situation an if it provides the reports and information you need then its worth starting this way. If you need more complex functionality later then you can go for more complex BI

Upvotes: 1

Related Questions