arpita
arpita

Reputation: 51

Adding new rows in the existing columns based on condition

I have a dataframe which has four columns that is range,weather,flag and calculation.I need to take the combinations from the three list for three columns (range,weather and flag) and check if the combination is not present for these three columns then add new row in the dataframe.

range   weather  flag   calculation
 0-5    good      y      12
 5-6    good      n      14
 0-5    bad       n      2
 5-6    worse     y      5

The output is as follows:

range   weather  flag   calculation
 0-5    good      y       12
 0-5    bad       n        2
 0-5    good      n       null
 0-5    worse     n       null
 0-5    bad       y       null
 0-5    worse     y       null
 5-6    good      n       14
 5-6    worse     y        5
 5-6    bad       n       null
 5-6    worse     n       null
 5-6    bad       y       null
 5-6    good      y       null

The code which I tried is as follows:

r=['0-5','5-6']
w=['good','bad','worse']
f=['n','y']
for i in r:
    for j in w:
        for k in f:
            if i in data1['range'].values and j in data1['weather'].values and k in data1['flag'].values:
            print(i,j,k)
            print("yes")      
        else:
            print(i,j,k)
            print("no")
            data1=data1.append([{'bl_flag':j},{'weather_status':k}], ignore_index=True)
        print(data1)

The above code does not check all 3 combinations are present in one single row, if it is not present in one single row then it has to be appended in the dataframe.

Upvotes: 1

Views: 101

Answers (2)

pallavi bhalerao
pallavi bhalerao

Reputation: 66

r=['0-5','5-6']
w=['good','bad','worse']
f=['n','y']
for i in r:
    for j in f:
        for k in w:          
            count=data1[data1["range"]==i].groupby(["range","weather","flag"]).apply(lambda x: x[(x["flag"]==j)&(x["weather"]==k).any()])              
            if count.size==0:
                data1=data1.append({'flag':j,'weather':k}, ignore_index=True)

Upvotes: 1

Pralay Ramteke
Pralay Ramteke

Reputation: 21

One approach to this problem is to create a DataFrame with all possible combinations of values from "range", "weather" and "flag" columns and then merge the new DataFrame with original DataFrame by using outer join.

To create dataframe with all possible combinations:

r=['0-5','5-6']
w=['good','bad','worse']
f=['n','y']

res = [[i, j, k] for i in r  
                 for j in w 
                 for k in f] 

cls = ["range","weather","flag"]

df1 = pd.DataFrame(res,columns  =  cls)
df1

Output:

   range weather flag
0    0-5    good    n
1    0-5    good    y
2    0-5     bad    n
3    0-5     bad    y
4    0-5   worse    n
5    0-5   worse    y
6    5-6    good    n
7    5-6    good    y
8    5-6     bad    n
9    5-6     bad    y
10   5-6   worse    n
11   5-6   worse    y

Now, you can merge this DataFrame with original DataFrame using outer join in the following way:

new_df = pd.merge(df1, orignal_df,  how='outer', left_on=cls, right_on = cls)

Output:

   range weather flag  calculation
0    0-5    good    n          NaN
1    0-5    good    y          NaN
2    0-5     bad    n          NaN
3    0-5     bad    y          NaN
4    0-5   worse    n          NaN
5    0-5   worse    y          NaN
6    5-6    good    n          NaN
7    5-6    good    y          NaN
8    5-6     bad    n          NaN
9    5-6     bad    y          NaN
10   5-6   worse    n          NaN
11   5-6   worse    y          NaN
12   0-5    good    y         12.0
13   5-6    good    n         14.0
14   0-5     bad    n          2.0
15   5-6   worse    y          5.0

Upvotes: 1

Related Questions