DreamingFighter
DreamingFighter

Reputation: 41

How to model a dimension table that link to several facts with different level of grain?

I have a fact that store client's address. Problem is, the client can choose to insert information at state level, or county level, or street level. In the operation database, there is 1 table for streets, link to another table for counties, link to another table for states. The client table has 1 column for state, 1 column for county, 1 column for street that contain ID (so can link to higher object in the hierarchy) How can I model the relationship between the fact and the dimension in a star-schema?

So I created one Location dimension with all states, all counties, all streets. The table look like this:


DIM_ID  | Level     | Street columns        | County columns        | State columns
1       | Street    | Bolsa                 | Westminton            | California
2       | County    | Westminton [county]   | Westminton            | California
3       | State     | [State of] California | [State of] California | California

If client disclose street then fact record link to row 1, client disclose county level then fact record link to row 2, client disclose only state then fact record link to row 3.

What do you think of that approach?

Upvotes: 1

Views: 752

Answers (1)

Chris Mack
Chris Mack

Reputation: 5208

I would probably model these levels separately, as they are being treated as separate, i.e.:

  • dim.Street
  • dim.County
  • dim.State

As for relating these to clients, I'd go for bridge tables, e.g.:

CREATE TABLE ClientStreet
(
    ClientID
    , StreetID
)

Etc.

And if you cannot provide a Street without providing a County and State, or provide a County without providing a State, I would have within dim.Street, CountyID, and within dim.County, StateID, i.e. a hierarchical structure.

EDIT

With regards to your client dimensions with 3 IDs, this could also be a good model.

With regards to my hierarchical structure, and data modelling in general, I feel the way you model it really needs to:

  1. Reflect reality (e.g., recording a client's Street as "[State of] California" seems inaccurate to me).
  2. Reflect the reality of what is possible in terms of incoming data, i.e. do your clients input their address once, or can they change it (do you want to track changes?)?

One thing I'm wondering is if whether your clients have to pick one and only one of these levels to record their address at. In this case, I'd have either the model I'd suggested above, or I'd have client dimension with 3 IDs, and a CHECK CONSTRAINT to ensure that only one of these was ever populated. This would then be supported be the fact that a Street would have a CountyID, etc. So you would determine this kind of "channel" by which ID is populated in the client dimension.

Upvotes: 1

Related Questions