Reputation: 71
I hope I don't butcher the explanation of my question:
I've got a table with hundreds of rows. Each row is a recipe with nutritional information, for example:
recipe_table:
id | calories | protein | carbs | fat
recipe1 | 100 | 20g | 10g | 2g
recipe2 | 110 | 10g | 12g | 12g
recipe3 | 240 | 20g | 1g | 23g
....
I needed to create a new table (recipe_index) that would show every possible combination of every recipe in recipe_table as a set of 3, so it would look something like:
recipe_index:
id1 | id2 | id3 | calories | protein | carbs | fat
recipe1 | recipe2 | recipe3 | 450 | 50g | 23g | 37g
....
Basically, it lets me query recipe_index and ask, "Which 3-recipe combinations have a total between 440 and 460 calories?"
My current code works for 3 meals, but I end up with about 450,000 records in recipe_index. I need to do the same thing for 4, 5 and 6 meals as well, so I'd be calculating millions and millions of records by the end. Is there a more efficient way of doing this? Perhaps I need to look into partitioning a table for each range?
My current SQL code:
INSERT INTO recipe_index
SELECT DISTINCT '3' AS nummeals,
       t1.id AS id1, t2.id AS id2, t3.id AS id3, 0 AS id4,
       t1.calories_ps + t2.calories_ps + t3.calories_ps AS calories,
       t1.protein_ps + t2.protein_ps + t3.protein_ps AS protein,
       t1.carbohydrate_ps + t2.carbohydrate_ps + t3.carbohydrate_ps AS carbohydrate,
       t1.fat_ps + t2.fat_ps + t3.fat_ps AS fat
FROM recipes t1
INNER JOIN recipes t2 ON t1.Id < t2.Id
INNER JOIN recipes t3 ON t2.Id < t3.Id
WHERE t1.image <> '' AND t2.image <> '' AND t3.image <> ''
If I missed anything obvious, please let me know.
Upvotes: 0
Views: 521
Reputation: 1269483
You would do this with a join. In order to prevent duplicates, you want a condition where the recipe ids are in order (this also prevents one recipe from appearing three times):
-- Each unordered set of three recipes appears once, because r1.id < r2.id < r3.id.
select r1.id, r2.id, r3.id,
       (r1.calories + r2.calories + r3.calories) as calories,
       (r1.protein + r2.protein + r3.protein) as protein,
       (r1.carbs + r2.carbs + r3.carbs) as carbs,
       (r1.fat + r2.fat + r3.fat) as fat
from recipe_table r1 join
     recipe_table r2
     on r1.id < r2.id join
     recipe_table r3
     on r2.id < r3.id;
The only difference from your query is that the distinct is not necessary, because the ordering prevents duplicates.
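To make the ordered-join idea concrete, here is a minimal sketch that runs it against the three sample rows from the question. It uses Python's built-in sqlite3 module rather than the asker's database, and it assumes the nutrient columns are stored as integer grams (the question shows them as "20g"), so treat the schema as illustrative only.

```python
import sqlite3

# In-memory database holding the three sample recipes from the question.
# Assumption: nutrients are integer grams, not text such as '20g'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recipe_table ("
    "id TEXT PRIMARY KEY, calories INT, protein INT, carbs INT, fat INT)"
)
conn.executemany(
    "INSERT INTO recipe_table VALUES (?, ?, ?, ?, ?)",
    [
        ("recipe1", 100, 20, 10, 2),
        ("recipe2", 110, 10, 12, 12),
        ("recipe3", 240, 20, 1, 23),
    ],
)

# Self-join with r1.id < r2.id < r3.id: each unordered triple appears once,
# and no recipe can be paired with itself, so DISTINCT is not needed.
rows = conn.execute(
    """
    SELECT r1.id, r2.id, r3.id,
           r1.calories + r2.calories + r3.calories AS calories,
           r1.protein + r2.protein + r3.protein AS protein,
           r1.carbs + r2.carbs + r3.carbs AS carbs,
           r1.fat + r2.fat + r3.fat AS fat
    FROM recipe_table r1
    JOIN recipe_table r2 ON r1.id < r2.id
    JOIN recipe_table r3 ON r2.id < r3.id
    """
).fetchall()

for row in rows:
    print(row)
```

With three recipes there is exactly one triple, and its totals match the recipe_index example row in the question (450 calories, 50g protein, 23g carbs, 37g fat).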
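The problem you are facing is that there are a lot of combinations. Because the ids are kept in order, each set is counted once, so n recipes give n*(n-1)*(n-2)/6 sets of 3. With about 450,000 rows, I'm guessing you are starting with 140 or so recipes (140*139*138/6 = 447,580). The number of combinations of 4 of them is 140*139*138*137/24, about 15.3 million, and this sequence keeps growing quickly for 5 and 6 combos (roughly 417 million and 9.4 billion).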
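A rough way to see the growth is to count the sets directly. With the id ordering in the join, each unordered set is produced once, so the number of k-recipe sets drawn from n recipes is the binomial coefficient "n choose k". The sketch below assumes n = 140, a hypothetical value chosen only because it gives a count close to the roughly 450,000 rows mentioned in the question; `combo_count` is an illustrative helper name.

```python
from math import comb


def combo_count(n_recipes: int, set_size: int) -> int:
    """Number of unordered sets of `set_size` distinct recipes."""
    return comb(n_recipes, set_size)


# Assumption: about 140 recipes pass the image filter.
n = 140
counts = {k: combo_count(n, k) for k in (3, 4, 5, 6)}

for k, count in counts.items():
    print(f"sets of {k}: {count:,}")
```

Each extra meal multiplies the row count by roughly (n - k) / (k + 1), so the table for 6 meals is several orders of magnitude larger than the one for 3.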
Upvotes: 1