klausch
klausch

Reputation: 662

Modeling Region or Country with single reference

I am developing a website for a dealer of wine and other alcoholic beverages. Obviously, each wine is made in a country that must be modeled in the Wine table. But many times, a Wine also has a Region (Languedoc, Rioja, Bougogne etc.), these regions are of course in a parent-child relationship with a country. THe following options exist:

-Giving the Wine table only a reference to a region Problem is that some wines/whiskeys do not mention a region, only a country

-Giving the wine table 2 separate FK references, to a Country and a Region table. This introduces a circular reference and a redundacny problem becase country and region are already related.

-Using a Location table and a single FK referernce from the Wine table to the Location table. THe Location table is in fact a region or a country (maybe even a city) so it has a field "location_type" and a parent FK field, referring to its own PK. For the top-level Country entries, the parent id is null. This is the example I have found somewhere in the internet. It will make however the queries more complex.

Is this a known problem, and are there any suggestions? TIA, Klaas

Upvotes: 1

Views: 227

Answers (3)

Hank
Hank

Reputation: 11

Good point about creating a location dimension. This type of model addresses analysis more effectively, but is more complicated for transactional systems. This gets into the question of whether you're optimizing your model to handle CRUD-type transactions, or for aggregated data analysis.

On the whole, I assume that Klaus is looking at modeling for a transactional system with basic query, rather than an analysis-based application like a data warehouse.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270371

I think you might want to look at this from a dimensional analysis perspective, rather than a strict entity-relationship one. That is, a but of denormalization may be just what you are looking for. I would recommend Ralph Kimball's books on dimensional data warehouses, since they often solve this type of problem.

In your case, just create a "location" dimension that contains all the fields that you might be interested in, at the lowest level of granularity:

  • Region
  • Country
  • SubCountry

Are the two obvious ones. You might also have hillside, city, whatever.

To take an example, you would have the following rows:

  1. Barolo/Italy/Piemonte
  2. NULL/Italy/Piemonte
  3. NULL/Italy/NULL

The wines would be connected to this table.

Now, you have a maintenance problem for this table. However, the universe of wines and official regions is well known and very slowly changing, so I don't see this as a problem.

Upvotes: 0

Hank
Hank

Reputation: 11

I'm also working on an application in this domain. Common for wine, there is also the concept of sub-region or appellation, so you can have wines from France-Bourgogne-Cote d'Or, for example. I went with the second option you described, having FK references to Country, Region, and Subregion. Only the Country field is required, while the others are nullable. The potential issues with referential integrity are compounded with this model, but it greatly facilitates effective query based on these fields, which is kind of the point of capturing this information in the first place.

Upvotes: 1

Related Questions