manish
manish

Reputation: 20135

Possible options for storing hierarchical data

We need the ability to map various types of entities to different types of hierarchical data in our web application. As an example, consider an entity called Vendor. We need the ability to map each instance of vendor to geographical areas. Geographical areas are in the following hierarchy:

  1. Post code - most granular geographical area; example, EC2
  2. Locality - formed of post codes; example, Kensington. Each post code will be part of exactly one locality, no more.
  3. Town - formed of localities; example London. Each locality will be part of exactly one town, no more.
  4. District - formed of towns; example, Columbia.
  5. Province - formed of districts (equivalent to a state in some countries); example, South Carolina.
  6. Region - formed of provinces; example, Northeast Provinces.
  7. Country - formed of regions.
  8. Zone - formed of countries; example Southeast Asia.
  9. Continent - formed of zones.

We have a complete database of post codes, localities, towns, districts, provinces, regions, countries, zones and continents. This currently exists as tables in an RDBMS.

Our use cases:

  1. Ability to associate a Vendor with multiple geographies, at any level. For example, we could map Nestle to Mainland Europe (a zone), California (a US province) and Greater London (a district in UK).
  2. Ability to exclude some parts of a geography from the mapping. For example, when mapping Nestle to California, we may want to exclude San Diego.
  3. If the composition of a geography changes, no changes should be required to the mappings of which the geography is a part. For example, if a post code is added to Greater London, the mapping with Nestle should not require a change.
  4. Ability to query the database for a Vendor and a geography level. For example, if we query the database for Nestle and post codes, we should get all the post codes for Greater London, California (minus post codes for San Diego) and Mainland Europe. If we query the database for Nestle and countries, we should get UK (country for Greater London), US and all countries in Mainland Europe.

There are many such mappings with many different types of hierarchical data, this is just one of the requirements.

Looking for suggestions on storing the hierarchical data and the mappings. Please do not post answers that involve storing the data and mappings in RDBMS tables, as I am already aware of the options using RDBMS.

Upvotes: 0

Views: 665

Answers (1)

Joel Brown
Joel Brown

Reputation: 14388

I know that you are not looking for a RDBMS solution, since you "know the options for RDBMS" - but you don't state what options you have considered and why you are rejecting them. I believe there may be an RDBMS option you haven't considered which has the advantage of low data maintenance and ease of data retrieval.

I suggest that you group your geographical regions into a single table with an involuted relationship (adjacency model). This allows you to describe your vendor coverage as a list of geographical coverage records. The trick is to record vendor coverage at the highest possible level. You can then also list exclusions to the coverage.

Here is a suggested logical model:

Logical ERD

The GEOGRAPHY table could use nested sets to simplify the querying of the hierarchical geography data.

Determining vendor coverage of a particular area would be as easy as confirming that a vendor has COVERAGE for the GEOGRAPHY of interest (probably at the lowest level) while not also being in a COVERAGE EXCLUSION for that same GEOGRAPHY.

Upvotes: 2

Related Questions