Mutuelinvestor
Mutuelinvestor

Reputation: 3538

Is there a dimension modeling design pattern for multi-valued dimensions

I'm working on a data warehouse that seeks to capture website visits and purchase. We have a hypothesis that by identifying patterns from previous site visits you can get insights into visitor behavior for the current site visit

The grain of my fact table is individual website visits and we assign a 1 if the customer makes a purchase and a 0 if she does not. Our fact is additive. We would like to be able explore and understand how the actions of prior visits influence the action of the current visit so I'm trying to figure out how you would go about modeling this. On a particular site visit a visitor could have 1, 2 or 12 prior site visits.

So my question is how would I model a past visit dimension that includes the past visit date, past visit activity (purchase or no purchase, time on site, etc). Is this an example of a use for a bridge table.

Upvotes: 1

Views: 292

Answers (1)

Max xaM
Max xaM

Reputation: 348

A bridge table in a data-warehouse is primarily (exclusively?) for dealing with many to many relationships, which you don't appear to have.

If the grain of your fact table is website visits then you don't need a 'past visit' dimension, since your fact table contains the visit history already.

You have two dimensions here: Customer Date

Time on site is presumably a number, and since you are treating purchase/no purchase as a boolean score (1,0) these are both measures and belong in the fact table.

The Customer dimension is for your customer attributes. Don't put measures here (e.g. prior scores). You should also consider how to handle changes (probably SCD type 2).

You could put your date field directly in the fact table but it is more powerful as a separate dimension, since you can much more easily analyze by quarters, financial years, public holidays etc.

So,

Example Fact_Website_Visit table:

Fact_Website_Visit_Key | Dim_Customer_Key | Dim_Date_Key | Purchase(1,0) | Time_On_Site

Example Dim_Customer Dimension:

Dim_Customer_Key | Customer_ID | Customer_Demographic

Example Dim_Date Dimension:

Dim_Date_Key | Full_Date | IsWeekend

To demonstrate how this works I've written an example report to see sale success and average time spent online on weekends grouped by customer demographic:

SELECT
Dim_Customer.demographic,
COUNT(fact.Fact_Website_Visit_Key) AS [# of Visits],
SUM (fact.Purchase) AS [Total Purchases],
AVG (fact.Time_On_Site) AS [Average Minutes Online],
SUM (fact.Purchase)/COUNT(fact.Fact_Website_Visit_Key)*100 AS [% sale success]
FROM
Fact_Website_Visit fact
INNER JOIN Dim_Customer ON fact.Dim_Customer_Key=Dim_Customer.Dim_Customer_Key
INNER JOIN Dim_Date ON fact.Dim_Date_Key=Dim_Date.Dim_Date_Key
WHERE
Dim_Date.IsWeekend='Y'
GROUP BY
Dim_Customer.Demographic

Upvotes: 1

Related Questions