Reputation: 64
I have a *.xlsx file as below -
A B C
[['Neutral']] ['nan']
[['Neutral']] ['nan']
Bad [['Negative']] ['Bad']
Meh [['Neutral']] ['Meh']
[['Neutral']] ['nan']
I'm trying to remove all the [['Neutral']] and ['nan']
values in column B
and C
ONLY if their value is null
.
This is my code -
df1 = pd.read_excel(path)
for i, row in df1.iterrows():
if pd.isnull(row[0]):
# del row[1]
# del row[2]
row[1] = 0
row[2] = 0
My code is finding all the null values perfectly but is not able to clear out the B
and C
column variable s.What am I doing wrong?
Expected output is-
A B C
Bad [['Negative']] ['Bad']
Meh [['Neutral']] ['Meh']
And yes the blank spaces/cells should still exist.
Upvotes: 1
Views: 1500
Reputation: 11192
try this,
mask=df['A'].isnull()
df.loc[mask]=''
Output:
A B C
0
1
2 Bad [['Negative']] ['Bad']
3 Meh [['Neutral']] ['Meh']
4
For this problem you don't need to use for loop in pandas,
Explanation
Find index of wherever A is blank
replace empty at selected index
EDIT:
To remove from specific columns,
df.loc[mask,['B','C']]=''
Upvotes: 2
Reputation: 4607
You can simply assign it by taking the indexes of null values
df.loc[df.A.isnull()] = ''
Out:
A B C
0
1
2 Bad [['Negative']] ['Bad']
3 Meh [['Neutral']] ['Meh']
4
Upvotes: 1