Reputation: 486
I am trying to fill column Final with values if Number contains some values and Code contains 0
then the value present in Code i.e 0
should be replaced by the value of the Number column, which I am able to do with:
df['Final'] = np.where(df['Code'] == 0, df['Number'], df['Code'])
but for rows no. 5, 6 and 7 I am facing a problem, the value should be in the Final column with least filled 00
values. How can this be achieved with pandas? The double zero (00
) can come in consecutive manner only.
Upvotes: 1
Views: 245
Reputation: 862481
You can compare values with Series.str.count
and chain 3 conditions with |
for bitwise OR
and &
for bitwise AND
with numpy.where
:
print (df)
No Number Code Final
0 1 78797071 0 78797071
1 2 0 89797071 89797071
2 3 0 57797074 57797074
3 4 39797571 0 39797571
4 5 62170000 62175268 62175268
5 6 52130000 52000000 52130000
6 7 52146700 52140000 52146700
a = df['Number'].astype(str).str.count('0')
b = df['Code'].astype(str).str.count('0')
#test if value is not 0
m1 = df['Code'] != 0
#test if value is 0
m2 = df['Number'] == 0
#test if number of 0 is higher
m3 = a > b
df['Final'] = np.where((m1 & m2) | m3, df['Code'], df['Number'])
print (df)
No Number Code Final
0 1 78797071 0 78797071
1 2 0 89797071 89797071
2 3 0 57797074 57797074
3 4 39797571 0 39797571
4 5 62170000 62175268 62175268
5 6 52130000 52000000 52130000
6 7 52146700 52140000 52146700
Detail:
print (df.assign(m1 = m1,
m2 = m2,
m3 = m3,
m1ANDm2 = m1 & m2,
mask= (m1 & m2) | m3))
No Number Code Final m1 m2 m3 m1ANDm2 mask
0 1 78797071 0 78797071 False False False False False
1 2 0 89797071 89797071 True True False True True
2 3 0 57797074 57797074 True True False True True
3 4 39797571 0 39797571 False False False False False
4 5 62170000 62175268 62175268 True False True False True
5 6 52130000 52000000 52130000 True False False False False
6 7 52146700 52140000 52146700 True False False False False
Upvotes: 2
Reputation: 42886
We can count
the amount of zero's in both columns and take that with us in np.where
:
count1 = df['Number'].astype(str).str.count('0')
count2 = df['Code'].replace(0, np.NaN).astype(str).str.count('0')
df['Final'] = np.where(df['Code']==0 | (count1<count2), df['Number'], df['Code'])
Output
No Number Code Final
0 1 78797071 0 78797071
1 2 0 89797071 89797071
2 3 0 57797074 57797074
3 4 39797571 0 39797571
4 5 62170000 62175268 62175268
5 6 52130000 52000000 52000000
6 7 52146700 52140000 52140000
Upvotes: 1