hambone
hambone

Reputation: 21

Implementing many-to-many with one "primary" value

I have many products that can each be in many categories.

products: id, ...
products_categories: product_id, category_id
categories: id, ...

Now I want to have many products, each with one master category, and 0 or more secondary categories. I can think of two ways to model this in SQL.

  1. Add an is_primary column to products_categories OR

  2. Add a primary_category_id column to products

What is the best way to implement this in pure SQL and/or ActiveRecord? I'm using PostgreSQL, for what it's worth.

Upvotes: 2

Views: 176

Answers (4)

Andriy M
Andriy M

Reputation: 77657

What are each way's pros and cons?

Option 1. I can be sure that the primary category for a product is indeed one of its categories. But there may be a problem of ensuring that a product has no more than one primary category.

Option 2. This lets me make sure that a product has only one primary category. But then I don't seem to have a way to make sure that it's one of this same product's categories.

So, I would probably go for a third option, using a table Products_PrimaryCategories:

Products_PrimaryCategories: product_id, category_id

It seems the same as product_categories, but has some additional properties:

  • product_id has an associated unique index, making sure you can only have one primary category for each product;

  • (product_id, category_id) is a foreign key referencing products_categories (product_id, category_id) ensuring that a product's primary category is one of its categories (which implies that (product_id, category_id) should be products_categories's primary key).

Upvotes: 0

Aleksi Torhamo
Aleksi Torhamo

Reputation: 6632

It really depends on the exact details of what you're trying to accomplish. Here are some of the things to consider while deciding what's best for you. Other answers already tackled the first case, so I'm going to focus on the second one.

If you have primary_category_id:

  • It seems cleaner to have one field in product that tells which category is the primary one, than to have a field in every product_category which has 1 in one row and 0 in every other row, although the suggestion by M.R. to use association_type sounds clean too - but what's the chance you're going to have "tertiary" categories?
  • It's slightly easier to get to the primary category
  • It's easy to ensure every product always has a primary category (just make the field NOT NULL)
  • It automatically enforces that a product may only have one primary category
  • Should you also insert the primary category to products_categories?
    • Neither option is enforced.
    • If you don't, it's awkward to query all the categories
    • If you do, it's still easy to query, but without additional work, nothing guarantees the primary category is also inserted in the other table

If you use the is_primary method, you should somehow ensure that every product always has exactly one primary category.

Upvotes: 1

M.R.
M.R.

Reputation: 4827

I would go with option (1). The reason for this is since your products can belong to more than one category, the relationship attribute (that its a 'primary' category) belongs in the table that defines the relationship.

I would even go further and suggest that instead of labeling the field 'is_primary', you should have the field labeled as 'association_type'. And instead of just adding a bit field, make it an integer field, and have all the association types defined. In your case today, there are only two association types - secondary and primary. The advantage is that this design is much more scalable. If tomorrow, you are asked to define a 'primary', a 'secondary' and all other tertiary categories, this design will be able to handle it, instead of having to add another field to designate the 'secondary' field.

Upvotes: 1

Jim
Jim

Reputation: 61

I would go with the first option unless I have a good reason for choosing 2 (like the cost of an extra join when getting the primary category)

reason: you probably need to add the primary category to product_category table anyway (in order to use it in a uniform and simple way in queries like getting all categories for a product) option 1 avoids duplicating primary category thus simpler

Upvotes: 1

Related Questions