phgdanny

Reputation: 51

What's the optimal method to use for this database Modeling/Design?

I'm writing a program for kitchen/recipe management. As of now, the program can add new ingredients to the database. My next step is to add new recipes to the database, using the ingredients.

Since each recipe will have multiple ingredients, and there's no limit on how many can be added to a recipe, creating a separate column for each selected ingredient wouldn't be efficient (I'm assuming).

So the method I've come up with to store the ingredient selections is:

  1. Add a recipe_ingredients column to the recipe table in the db, and store the names of all the ingredients in a single text field, separated by commas (CSV style).

  2. Write a PHP script to store the list of recipe_ingredients into an array.

  3. Do the same for all other attributes (recipe_ingredients_serving_size, recipe_ingredients_calories, etc.)

So the question is: Is this method optimal, and if not, what's a better way to go about this?

Upvotes: 2

Views: 172

Answers (2)

Zane Bien

Reputation: 23125

This is a simple N:M (many-to-many) relationship, and the approach you've laid out will likely spell disaster in terms of efficiency and management.


Here is your situation:

  • You have two entities: recipes and ingredients.
  • One ingredient may be a part of many recipes.
  • One recipe may be made up of many ingredients.

Whenever you have this relationship between any two entities, you are going to want to have not two, but three tables:

+-----------+     +-------------------------+     +-------------------+
| recipes   |     | recipes_has_ingredients |     | ingredients       |
+-----------+     +-------------------------+     +-------------------+
| recipe_id |     | recipe_id               |     | ingredient_id     |
| name      |     | ingredient_id           |     | name              |
| ...       |     +-------------------------+     | calories          |
+-----------+                                     +-------------------+

recipes and ingredients are what's known as base tables: they store the intrinsic information about each entity.

The recipes_has_ingredients table is what's known as a cross-reference table (or "XREF"), which stores the associations between the two entities. The fields in this table, recipe_id and ingredient_id, each link to their respective base tables, and the combination of the two in each row of the XREF table is unique. It basically maps the many associations that each recipe_id may have to different ingredient_ids, and vice versa.
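To make the three tables concrete, here is a minimal sketch of the schema using Python's built-in sqlite3 module. The column types, the NOT NULL constraints and the choice of SQLite are my assumptions for illustration; the same idea carries over to MySQL or any other relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE recipes (
    recipe_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE ingredients (
    ingredient_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    calories      INTEGER NOT NULL
);
-- XREF table: one row per (recipe, ingredient) association.
CREATE TABLE recipes_has_ingredients (
    recipe_id     INTEGER NOT NULL
                  REFERENCES recipes(recipe_id) ON DELETE CASCADE,
    ingredient_id INTEGER NOT NULL
                  REFERENCES ingredients(ingredient_id) ON DELETE CASCADE,
    PRIMARY KEY (recipe_id, ingredient_id)
);
""")

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The composite primary key on the XREF table is what enforces the "combination of the two is unique" rule.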

Why does this design facilitate many-to-many relationships? Because data in that XREF table is allowed to look like this:

+-----------------------------+
| recipe_id  |  ingredient_id |
+-----------------------------+
| 1          |  1             |
| 1          |  2             |
| 1          |  3             |
| 2          |  1             |
| 2          |  2             |
| 2          |  3             |
| 3          |  1             |
| 3          |  2             |
| 3          |  3             |
+-----------------------------+

As you can clearly see, one recipe is associated with many (3) ingredients, and one ingredient is associated with many (3) recipes. Also notice how values in either column are allowed to repeat, but the combination of the two columns is unique. This is really the key aspect of the design that makes the N:M relationship work.
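The uniqueness rule can be checked with a small experiment. This is a sketch using Python's sqlite3 module (my choice for the demo, not something from the question): values repeat freely within each column, but inserting the same pair a second time is rejected by the composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE recipes_has_ingredients (
        recipe_id     INTEGER NOT NULL,
        ingredient_id INTEGER NOT NULL,
        PRIMARY KEY (recipe_id, ingredient_id)
    )
""")

# recipe_id 1 repeats, ingredient_id 1 repeats, but every pair is distinct.
conn.executemany(
    "INSERT INTO recipes_has_ingredients VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1)],
)

# Inserting an existing pair again violates the primary key.
try:
    conn.execute("INSERT INTO recipes_has_ingredients VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM recipes_has_ingredients").fetchone()[0]
print(duplicate_rejected, row_count)
```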

So here are some simple examples of how you can easily retrieve and manage data using this design:

-- Given a particular recipe_id, retrieve all ingredients used in that recipe:

SELECT     name
FROM       recipes_has_ingredients
INNER JOIN ingredients USING (ingredient_id)
WHERE      recipe_id = <id>

-- Retrieve the name of the recipe (of id 4), and the total amount of calories it has:

SELECT      a.name, 
            SUM(c.calories) AS calorie_count
FROM        recipes a
INNER JOIN  recipes_has_ingredients b ON a.recipe_id = b.recipe_id
INNER JOIN  ingredients c ON b.ingredient_id = c.ingredient_id
WHERE       a.recipe_id = 4
GROUP BY    a.recipe_id, 
            a.name

-- Given a list of ingredient_ids, retrieve all recipes that contain
-- ALL of the listed ingredients (3 = number of ids in the list):

SELECT     name
FROM       recipes
INNER JOIN recipes_has_ingredients USING (recipe_id)
WHERE      ingredient_id IN (1,2,3)
GROUP BY   recipe_id, name
HAVING     COUNT(*) = 3
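The "contains ALL of the listed ingredients" query works because each (recipe_id, ingredient_id) pair is unique, so a recipe that matches N rows must contain all N listed ingredients. Here is a rough sketch of it run through Python's sqlite3 module with parameter placeholders; the recipe names and sample rows are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recipes (recipe_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE recipes_has_ingredients (
    recipe_id     INTEGER NOT NULL,
    ingredient_id INTEGER NOT NULL,
    PRIMARY KEY (recipe_id, ingredient_id)
);
INSERT INTO recipes VALUES (1, 'pancakes'), (2, 'omelette'), (3, 'waffles');
INSERT INTO recipes_has_ingredients VALUES
    (1, 1), (1, 2), (1, 3),          -- has all of 1, 2, 3
    (2, 1), (2, 2),                  -- missing ingredient 3
    (3, 1), (3, 2), (3, 3), (3, 4);  -- has 1, 2, 3 plus an extra one
""")

wanted = [1, 2, 3]
placeholders = ", ".join("?" for _ in wanted)
sql = f"""
    SELECT r.name
    FROM recipes r
    JOIN recipes_has_ingredients x ON x.recipe_id = r.recipe_id
    WHERE x.ingredient_id IN ({placeholders})
    GROUP BY r.recipe_id, r.name
    HAVING COUNT(*) = ?
    ORDER BY r.name
"""
# The last parameter is the number of wanted ingredients.
matches = [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]
print(matches)
```

Only the placeholder string is interpolated into the SQL; the ids themselves are passed as bound parameters.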

-- Given a particular recipe_id (id 6), add two more ingredients
-- that it has (ids 4 & 9):

INSERT INTO recipes_has_ingredients (recipe_id, ingredient_id) VALUES (6,4), (6,9);

-- Delete a particular recipe:

DELETE FROM recipes WHERE recipe_id = 4

^ The above DELETE operation also deletes all of that recipe's associations, provided the foreign key from recipes_has_ingredients to recipes is defined with ON DELETE CASCADE.
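A small sketch of the cascade in action, again with Python's sqlite3 module. The sample rows are invented, and note the SQLite-specific detail that foreign keys are only enforced after PRAGMA foreign_keys = ON; other databases have their own requirements (for example, an engine that supports foreign keys).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite for cascades

conn.executescript("""
CREATE TABLE recipes (recipe_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE recipes_has_ingredients (
    recipe_id     INTEGER NOT NULL
                  REFERENCES recipes(recipe_id) ON DELETE CASCADE,
    ingredient_id INTEGER NOT NULL,
    PRIMARY KEY (recipe_id, ingredient_id)
);
INSERT INTO recipes VALUES (4, 'soup'), (5, 'salad');
INSERT INTO recipes_has_ingredients VALUES (4, 1), (4, 2), (5, 1);
""")

# Deleting the recipe row also removes its rows in the XREF table.
conn.execute("DELETE FROM recipes WHERE recipe_id = 4")

remaining = conn.execute(
    "SELECT recipe_id, ingredient_id FROM recipes_has_ingredients"
).fetchall()
print(remaining)
```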


Looking back at your original design, what if you wanted to update or delete certain ingredients that a recipe has, or change the name of an ingredient? You would need hacky procedural code to modify the right positions in the CSV strings, or you would need to update every row in the table to reflect even the slightest change to a single ingredient.
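By contrast, with the three-table design a rename is a single-row UPDATE, and every recipe picks it up through the join. A minimal sketch with Python's sqlite3 module; the ingredient and recipe names are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recipes (recipe_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE ingredients (ingredient_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE recipes_has_ingredients (
    recipe_id     INTEGER NOT NULL,
    ingredient_id INTEGER NOT NULL,
    PRIMARY KEY (recipe_id, ingredient_id)
);
INSERT INTO recipes VALUES (1, 'fries'), (2, 'soup');
INSERT INTO ingredients VALUES (1, 'salt'), (2, 'potato');
INSERT INTO recipes_has_ingredients VALUES (1, 1), (1, 2), (2, 1);
""")

# Rename the ingredient once, in its own base table.
cur = conn.execute(
    "UPDATE ingredients SET name = ? WHERE ingredient_id = ?", ("sea salt", 1))
rows_updated = cur.rowcount


def ingredient_names(recipe_id):
    """Return the ingredient names of one recipe, sorted alphabetically."""
    rows = conn.execute(
        """SELECT i.name
           FROM recipes_has_ingredients x
           JOIN ingredients i ON i.ingredient_id = x.ingredient_id
           WHERE x.recipe_id = ?
           ORDER BY i.name""",
        (recipe_id,))
    return [row[0] for row in rows]


print(rows_updated, ingredient_names(1), ingredient_names(2))
```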

There are also many more interesting questions you can answer with this design that you couldn't really answer with your original one, such as:

  • Recipe with highest/lowest calorie count?
  • Ingredient that is contained by the most recipes?
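Both of those questions come down to a GROUP BY over the XREF table. Here is a sketch with Python's sqlite3 module; the calorie numbers and names are invented sample data, and LIMIT 1 simply picks one row if there is a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recipes (recipe_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE ingredients (
    ingredient_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    calories      INTEGER NOT NULL
);
CREATE TABLE recipes_has_ingredients (
    recipe_id     INTEGER NOT NULL,
    ingredient_id INTEGER NOT NULL,
    PRIMARY KEY (recipe_id, ingredient_id)
);
INSERT INTO recipes VALUES (1, 'pancakes'), (2, 'omelette'), (3, 'boiled egg');
INSERT INTO ingredients VALUES (1, 'flour', 364), (2, 'egg', 78), (3, 'butter', 717);
INSERT INTO recipes_has_ingredients VALUES
    (1, 1), (1, 2), (1, 3),
    (2, 2), (2, 3),
    (3, 2);
""")

# Recipe with the highest calorie count (use ASC for the lowest).
highest = conn.execute("""
    SELECT r.name, SUM(i.calories) AS total
    FROM recipes r
    JOIN recipes_has_ingredients x ON x.recipe_id = r.recipe_id
    JOIN ingredients i ON i.ingredient_id = x.ingredient_id
    GROUP BY r.recipe_id, r.name
    ORDER BY total DESC
    LIMIT 1
""").fetchone()

# Ingredient that is contained in the most recipes.
most_used = conn.execute("""
    SELECT i.name, COUNT(*) AS recipe_count
    FROM ingredients i
    JOIN recipes_has_ingredients x ON x.ingredient_id = i.ingredient_id
    GROUP BY i.ingredient_id, i.name
    ORDER BY recipe_count DESC
    LIMIT 1
""").fetchone()

print(highest, most_used)
```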

...The list goes on, and the benefits of implementing this design will serve you well. You will prevent yourself from a great deal of hardship and suffering by doing things the right way. =)

Upvotes: 2

dianovich

Reputation: 2287

This is not the ideal approach to solving this problem.

Comma-separating values in a column makes those values pretty useless - querying for recipes by ingredient would be painful if you just concatenate a bunch of ingredients together into a single string.
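To illustrate how painful that gets, here is a small sketch with Python's sqlite3 module. The recipes_csv table and its rows are hypothetical stand-ins for the comma-separated design; a plain substring search for "salt" also matches "unsalted butter".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table following the comma-separated design.
CREATE TABLE recipes_csv (
    recipe_id          INTEGER PRIMARY KEY,
    name               TEXT NOT NULL,
    recipe_ingredients TEXT NOT NULL
);
INSERT INTO recipes_csv VALUES
    (1, 'fries',      'potato,salt,oil'),
    (2, 'shortbread', 'flour,unsalted butter,sugar');
""")

# Substring matching cannot tell 'salt' apart from 'unsalted butter'.
hits = [row[0] for row in conn.execute(
    """SELECT name
       FROM recipes_csv
       WHERE recipe_ingredients LIKE '%salt%'
       ORDER BY recipe_id""")]
print(hits)
```

You can work around this with delimiter tricks, but the query still has to scan and parse every row's string instead of using an index on an ingredient id.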

Also, how would you relate a calorie count to a single ingredient if you're storing them as comma-separated strings?

The ideal solution would be to create a new table for ingredients, and store the name and associated information in that table, e.g.:

ingredients
-----------
id 
name
num_calories

There are many recipes with many ingredients, so create a table that links ingredients to a particular recipe. This turns the ingredients-to-recipes relationship into many-to-one-to-many (through the linking table) rather than a straight many-to-many.

recipe_ingredients
------------------
recipe_id
ingredient_id

You should read up on database normalization.

Upvotes: 2
