Reputation: 3502
A beginner question:
I am going through the pandas documentation and found this example.
I could not understand how the conditions for AAA, BBB and CCC
are applied in the snippet below.
df:
AAA BBB CCC
0 4 2000 2000
1 5 555 555
2 6 555 555
3 7 555 555
then,
In [9]: df_mask = pd.DataFrame({'AAA': [True] * 4,
   ...:                         'BBB': [False] * 4,
   ...:                         'CCC': [True, False] * 2})
   ...:
In [10]: df.where(df_mask, -1000)
Out[10]:
AAA BBB CCC
0 4 -1000 2000
1 5 -1000 -1000
2 6 -1000 555
3 7 -1000 -1000
Could someone explain what the above snippet does?
Upvotes: 4
Views: 120
Reputation: 862731
You can check the documentation for DataFrame.where:
cond : boolean Series/DataFrame, array-like, or callable
Where cond is True, keep the original value. Where False, replace with corresponding value from other. If cond is callable, it is computed on the Series/DataFrame and should return boolean Series/DataFrame or array. The callable must not change input Series/DataFrame (though pandas doesn’t check it).
other : scalar, Series/DataFrame, or callable
Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the Series/DataFrame and should return scalar or Series/DataFrame. The callable must not change input Series/DataFrame (though pandas doesn’t check it).
So every value of df whose position in the mask is False is replaced by other, here -1000. Values whose position is True are kept.
Sample:
df = pd.DataFrame({'AAA': [4, 5, 6, 7], 'BBB': [4, 11, 0, 8], 'CCC': [2000, 45, 555, 85]})
print (df)
AAA BBB CCC
0 4 4 2000
1 5 11 45
2 6 0 555
3 7 8 85
df_mask = pd.DataFrame({'AAA': [True] * 4,
'BBB': [False] * 4,
'CCC': [True, False] * 2})
print (df.where(df_mask, -1000))
AAA BBB CCC
0 4 -1000 2000
1 5 -1000 -1000
2 6 -1000 555
3 7 -1000 -1000
If other is not passed, the values are replaced with NaN:
print (df.where(df_mask))
AAA BBB CCC
0 4 NaN 2000.0
1 5 NaN NaN
2 6 NaN 555.0
3 7 NaN NaN
You can also build the mask from a comparison, e.g.:
print (df.where(df > 10, -1000))
AAA BBB CCC
0 -1000 -1000 2000
1 -1000 11 45
2 -1000 -1000 555
3 -1000 -1000 85
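The quoted documentation also says that cond may be a callable. As a minimal sketch of that case, reusing the sample df from this answer (the names by_callable and by_mask are only illustrative), the callable receives the DataFrame and should return a boolean DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [4, 11, 0, 8],
                   'CCC': [2000, 45, 555, 85]})

# cond as a callable: pandas calls it with df and expects a boolean frame back
by_callable = df.where(lambda d: d > 10, -1000)

# the same condition, computed up front as a boolean DataFrame
by_mask = df.where(df > 10, -1000)

# both calls should select exactly the same cells
print(by_callable.equals(by_mask))
```

The callable form is mostly handy in method chains, where the intermediate DataFrame has no name you could use to write the comparison directly.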
Upvotes: 1
Reputation: 8033
Run print(df_mask)
and you will get the DataFrame below:
AAA BBB CCC
0 True False True
1 True False False
2 True False True
3 True False False
With df.where(df_mask, -1000), you keep the values of df where the mask is True
and replace them with -1000 where the mask is False.
The final output is as below:
AAA BBB CCC
0 4 -1000 2000
1 5 -1000 -1000
2 6 -1000 555
3 7 -1000 -1000
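To make the cell-by-cell rule concrete, here is a rough sketch using the df from the question. It compares df.where with an explicit numpy.where selection and with DataFrame.mask, which is the inverse operation (the names result, manual and inverse are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [2000, 555, 555, 555],
                   'CCC': [2000, 555, 555, 555]})
df_mask = pd.DataFrame({'AAA': [True] * 4,
                        'BBB': [False] * 4,
                        'CCC': [True, False] * 2})

result = df.where(df_mask, -1000)

# Element by element: keep df's value where the mask is True, else use -1000
manual = pd.DataFrame(np.where(df_mask, df, -1000),
                      index=df.index, columns=df.columns)

# DataFrame.mask is the inverse: it replaces where the condition is True,
# so negating the mask gives the same cells as df.where
inverse = df.mask(~df_mask, -1000)

print(result)
print((manual.values == result.values).all())
print((inverse.values == result.values).all())
```

AAA is all True in the mask, so it is untouched. BBB is all False, so every value becomes -1000. CCC alternates True and False, so rows 0 and 2 are kept and rows 1 and 3 are replaced.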
Upvotes: 1