The difference between modeling hierarchies in normalized and dimensional data modeling

Question

I have trouble understanding the differences between modeling hierarchies in normalized and dimensional data modeling? I will very much appreciate your time and answers.

Neil McGuigan · Accepted Answer

In a normalized schema for a transactional database, a hierarchy might look like this:

create table geopolitical_area (
  id bigint primary key,
  type text not null,
  name text not null,
  code char(2),
  parent_id bigint null references geopolitical_area(id)  
);

insert into geopolitical_area values
(1, 'Country', 'Canada', 'CA', null),
(2, 'Region', 'British Columbia', 'BC', 1);

Note the foreign key to the same table.

In a dimensional schema for a data warehouse, a hierarchy might look like this:

create table dim_customer (
  id bigint,
  name text,
  country_name text,
  country_code char(2),
  region_name text,
  region_code char(2) ...
);

insert into dim_customer values
(666, 'Neil McGuigan', 'Canada', 'CA', 'British Columbia', 'BC' ...);

Notice that the hierarchy has been flattened.

Often, you analyze a data warehouse using OLAP, where you would tell your OLAP server about the country > region hiearchy.

For example, in Mondrian OLAP server, a hierarchy might look like this:

The difference between modeling hierarchies in normalized and dimensional data modeling

Answers (1)

Related Questions