Parth Tiwari
Parth Tiwari

Reputation: 486

How to fill values in new column in dataframe on a multiple conditions

I am trying to fill column Final with values if Number contains some values and Code contains 0 then the value present in Code i.e 0 should be replaced by the value of the Number column, which I am able to do with:

df['Final'] = np.where(df['Code'] == 0, df['Number'], df['Code'])

but for rows no. 5, 6 and 7 I am facing a problem, the value should be in the Final column with least filled 00 values. How can this be achieved with pandas? The double zero (00) can come in consecutive manner only.

Upvotes: 1

Views: 245

Answers (2)

jezrael
jezrael

Reputation: 862481

You can compare values with Series.str.count and chain 3 conditions with | for bitwise OR and & for bitwise AND with numpy.where:

print (df)
   No    Number      Code     Final
0   1  78797071         0  78797071
1   2         0  89797071  89797071
2   3         0  57797074  57797074
3   4  39797571         0  39797571
4   5  62170000  62175268  62175268
5   6  52130000  52000000  52130000
6   7  52146700  52140000  52146700

a = df['Number'].astype(str).str.count('0')
b = df['Code'].astype(str).str.count('0')

#test if value is not 0
m1 = df['Code'] != 0
#test if value is 0
m2 = df['Number'] == 0
#test if number of 0 is higher
m3 = a > b

df['Final'] = np.where((m1 & m2) | m3, df['Code'], df['Number'])
print (df)

   No    Number      Code     Final
0   1  78797071         0  78797071
1   2         0  89797071  89797071
2   3         0  57797074  57797074
3   4  39797571         0  39797571
4   5  62170000  62175268  62175268
5   6  52130000  52000000  52130000
6   7  52146700  52140000  52146700

Detail:

print (df.assign(m1 = m1,
                 m2 = m2,
                 m3 = m3,
                 m1ANDm2 = m1 & m2,
                 mask= (m1 & m2) | m3))

   No    Number      Code     Final     m1     m2     m3  m1ANDm2   mask
0   1  78797071         0  78797071  False  False  False    False  False
1   2         0  89797071  89797071   True   True  False     True   True
2   3         0  57797074  57797074   True   True  False     True   True
3   4  39797571         0  39797571  False  False  False    False  False
4   5  62170000  62175268  62175268   True  False   True    False   True
5   6  52130000  52000000  52130000   True  False  False    False  False
6   7  52146700  52140000  52146700   True  False  False    False  False

Upvotes: 2

Erfan
Erfan

Reputation: 42886

We can count the amount of zero's in both columns and take that with us in np.where:

count1 = df['Number'].astype(str).str.count('0') 
count2 = df['Code'].replace(0, np.NaN).astype(str).str.count('0')

df['Final'] = np.where(df['Code']==0 | (count1<count2), df['Number'], df['Code'])

Output

   No    Number      Code     Final
0   1  78797071         0  78797071
1   2         0  89797071  89797071
2   3         0  57797074  57797074
3   4  39797571         0  39797571
4   5  62170000  62175268  62175268
5   6  52130000  52000000  52000000
6   7  52146700  52140000  52140000

Upvotes: 1

Related Questions