Reputation: 51
I have a dataframe which has four columns that is range,weather,flag and calculation.I need to take the combinations from the three list for three columns (range,weather and flag) and check if the combination is not present for these three columns then add new row in the dataframe.
range weather flag calculation
0-5 good y 12
5-6 good n 14
0-5 bad n 2
5-6 worse y 5
The output is as follows:
range weather flag calculation
0-5 good y 12
0-5 bad n 2
0-5 good n null
0-5 worse n null
0-5 bad y null
0-5 worse y null
5-6 good n 14
5-6 worse y 5
5-6 bad n null
5-6 worse n null
5-6 bad y null
5-6 good y null
The code which I tried is as follows:
r=['0-5','5-6']
w=['good','bad','worse']
f=['n','y']
for i in r:
for j in w:
for k in f:
if i in data1['range'].values and j in data1['weather'].values and k in data1['flag'].values:
print(i,j,k)
print("yes")
else:
print(i,j,k)
print("no")
data1=data1.append([{'bl_flag':j},{'weather_status':k}], ignore_index=True)
print(data1)
The above code does not check all 3 combinations are present in one single row, if it is not present in one single row then it has to be appended in the dataframe.
Upvotes: 1
Views: 101
Reputation: 66
r=['0-5','5-6']
w=['good','bad','worse']
f=['n','y']
for i in r:
for j in f:
for k in w:
count=data1[data1["range"]==i].groupby(["range","weather","flag"]).apply(lambda x: x[(x["flag"]==j)&(x["weather"]==k).any()])
if count.size==0:
data1=data1.append({'flag':j,'weather':k}, ignore_index=True)
Upvotes: 1
Reputation: 21
One approach to this problem is to create a DataFrame with all possible combinations of values from "range", "weather" and "flag"
columns and then merge the new DataFrame with original DataFrame by using outer join
.
To create dataframe with all possible combinations:
r=['0-5','5-6']
w=['good','bad','worse']
f=['n','y']
res = [[i, j, k] for i in r
for j in w
for k in f]
cls = ["range","weather","flag"]
df1 = pd.DataFrame(res,columns = cls)
df1
Output:
range weather flag
0 0-5 good n
1 0-5 good y
2 0-5 bad n
3 0-5 bad y
4 0-5 worse n
5 0-5 worse y
6 5-6 good n
7 5-6 good y
8 5-6 bad n
9 5-6 bad y
10 5-6 worse n
11 5-6 worse y
Now, you can merge this DataFrame with original DataFrame using outer
join in the following way:
new_df = pd.merge(df1, orignal_df, how='outer', left_on=cls, right_on = cls)
Output:
range weather flag calculation
0 0-5 good n NaN
1 0-5 good y NaN
2 0-5 bad n NaN
3 0-5 bad y NaN
4 0-5 worse n NaN
5 0-5 worse y NaN
6 5-6 good n NaN
7 5-6 good y NaN
8 5-6 bad n NaN
9 5-6 bad y NaN
10 5-6 worse n NaN
11 5-6 worse y NaN
12 0-5 good y 12.0
13 5-6 good n 14.0
14 0-5 bad n 2.0
15 5-6 worse y 5.0
Upvotes: 1