Reputation: 35
I have a Pandas Dataframe that stores a food item on each row in the following format -
Id Calories Protein IsBreakfast IsLunch IsDinner
1 300 6 0 1 0
2 400 12 1 1 0
.
.
.
100 700 25 0 1 1
I want to print all three-row combinations with the following conditions -
Right now, I am first iterating over all breakfast items, choosing lunch items. Then iterating over all dinner items. After selecting a combination, I am adding relevant columns and checking if values are within the desired range
Upvotes: 3
Views: 2779
Reputation: 8269
You can use the approach described in this answer to generate a new DataFrame containing all the combinations of three rows from your original data:
from itertools import combinations
import pandas as pd
# Using skbrhmn's df
df = pd.DataFrame({"Calories": [100, 200, 300, 400, 500],
"Protein": [10, 20, 30, 40, 50],
"IsBreakfast": [1, 1, 0, 0, 0],
"IsLunch": [1, 0, 0, 0, 1],
"IsDinner": [1, 1, 1, 0, 1]})
comb_rows = list(combinations(df.index, 3))
comb_rows
Output:
[(0, 1, 2),
(0, 1, 3),
(0, 1, 4),
(0, 2, 3),
(0, 2, 4),
(0, 3, 4),
(1, 2, 3),
(1, 2, 4),
(1, 3, 4),
(2, 3, 4)]
Then create a new DataFrame containing the sum of all numeric fields in your original frame, over all the possible combinations of three rows:
combinations = pd.DataFrame([df.loc[c,:].sum() for c in comb_rows], index=comb_rows)
print(combinations)
Calories Protein IsBreakfast IsLunch IsDinner
(0, 1, 2) 600 60 2 1 3
(0, 1, 3) 700 70 2 1 2
(0, 1, 4) 800 80 2 2 3
(0, 2, 3) 800 80 1 1 2
(0, 2, 4) 900 90 1 2 3
(0, 3, 4) 1000 100 1 2 2
(1, 2, 3) 900 90 1 0 2
(1, 2, 4) 1000 100 1 1 3
(1, 3, 4) 1100 110 1 1 2
(2, 3, 4) 1200 120 0 1 2
Finally you can apply any filters you need:
filtered = combinations[
(combinations.IsBreakfast>0) &
(combinations.IsLunch>0) &
(combinations.IsDinner>0) &
(combinations.Calories>600) &
(combinations.Calories<1000) &
(combinations.Protein>=80) &
(combinations.Protein<120)
]
print(filtered)
Calories Protein IsBreakfast IsLunch IsDinner
(0, 1, 4) 800 80 2 2 3
(0, 2, 3) 800 80 1 1 2
(0, 2, 4) 900 90 1 2 3
Upvotes: 1
Reputation: 1199
You can add combinations of filters to a dataframe using the |
and &
operators.
Creating a dummy dataframe for example:
df1 = pd.DataFrame({"Calories": [100, 200, 300, 400, 500],
"Protein": [10, 20, 30, 40, 50],
"IsBreakfast": [1, 1, 0, 0, 0],
"IsLunch": [1, 0, 0, 0, 1],
"IsDinner": [1, 1, 1, 0, 1]})
print(df1)
Output:
Calories Protein IsBreakfast IsLunch IsDinner
0 100 10 1 1 1
1 200 20 1 0 1
2 300 30 0 0 1
3 400 40 0 0 0
4 500 50 0 1 1
Now add all the conditions:
min_cal = 100
max_cal = 600
min_prot = 10
max_prot = 40
df_filtered = df1[
((df1['IsBreakfast']==1) | (df1['IsLunch']==1) | (df1['IsDinner']==1)) &
((df1['Calories'] > min_cal) & (df1['Calories'] < max_cal)) &
((df1['Protein'] > min_prot) & (df1['Protein'] < max_prot))
]
print(df_filtered)
Output:
Calories Protein IsBreakfast IsLunch IsDinner
1 200 20 1 0 1
2 300 30 0 0 1
Upvotes: 0