Reputation: 8304
I have the following data in a tab-separated file, test.tsv:
Class Length Frag
I 100 True
I 200 True
P 300 False
I 400 False
P 500 True
P 600 True
N 700 True
I have loaded the data into a pandas.DataFrame object, and anywhere that Class = I and Frag = True I would like to set Class = F. The following code does not seem to be working. What am I doing wrong, and what should I be doing?
import pandas
data = pandas.read_table('test.tsv')
data.loc[(data.Class == 'I') & (data.Frag is True), 'Class'] = 'F'
Upvotes: 2
Reputation: 353189
In your line
data.loc[(data.Class == 'I') & (data.Frag is True), 'Class'] = 'F'
you shouldn't use is. The is operator tests identity, not equality. So when you ask whether data.Frag is True, Python takes the Series object data.Frag and checks whether it is the very same object as True, which it is not. The result is a single False rather than an element-wise comparison. What you really want is ==, which gives you a Series result:
>>> data.Frag is True
False
>>> data.Frag == True
0 True
1 True
2 False
3 False
4 True
5 True
6 True
Name: Frag, dtype: bool
But since we're working with a Series of bools anyway, the == True part doesn't add anything, and we can drop it:
>>> data.loc[(data.Class == 'I') & (data.Frag), 'Class'] = 'F'
>>> data
Class Length Frag
0 F 100 True
1 F 200 True
2 P 300 False
3 I 400 False
4 P 500 True
5 P 600 True
6 N 700 True
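For anyone who wants to reproduce this without the file, here is a minimal self-contained sketch. It assumes the rows from the question are inlined as a string and read through io.StringIO instead of test.tsv; the names tsv, identity, and mask are only illustrative.

```python
import io

import pandas as pd

# Same rows as test.tsv in the question, inlined so no file is needed
# (assumption: the real file is tab-separated exactly like this).
tsv = (
    "Class\tLength\tFrag\n"
    "I\t100\tTrue\n"
    "I\t200\tTrue\n"
    "P\t300\tFalse\n"
    "I\t400\tFalse\n"
    "P\t500\tTrue\n"
    "P\t600\tTrue\n"
    "N\t700\tTrue\n"
)
data = pd.read_csv(io.StringIO(tsv), sep="\t")

# `is` compares object identity, so this is one plain Python bool,
# not an element-wise result: a Series is never the object True.
identity = data.Frag is True

# Element-wise boolean mask: Class equals 'I' AND Frag is set.
mask = (data.Class == "I") & data.Frag

# Assign 'F' to the Class column only on the rows selected by the mask.
data.loc[mask, "Class"] = "F"
print(data)
```

Building the mask as its own variable is just a readability choice; it also makes it easy to inspect which rows will be changed before assigning.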
Upvotes: 3
Reputation: 90929
I think you can use .apply() with axis=1 and a lambda expression that encodes your condition and replacement. Example:
In [24]: df['Class'] = df.apply(lambda x: 'F' if x['Class'] == 'I' and x['Frag'] == True else x['Class'], axis=1)
In [25]: df
Out[25]:
Class Length Frag
0 F 100 True
1 F 200 True
2 P 300 False
3 I 400 False
4 P 500 True
5 P 600 True
6 N 700 True
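A self-contained sketch of the same idea, assuming the question's rows are inlined as a string rather than read from test.tsv. The helper name relabel is hypothetical, and the Series.mask line is included only as a cross-check against a vectorized version, not as part of the approach above.

```python
import io

import pandas as pd

# Same rows as test.tsv in the question, inlined so no file is needed.
tsv = (
    "Class\tLength\tFrag\n"
    "I\t100\tTrue\n"
    "I\t200\tTrue\n"
    "P\t300\tFalse\n"
    "I\t400\tFalse\n"
    "P\t500\tTrue\n"
    "P\t600\tTrue\n"
    "N\t700\tTrue\n"
)
df = pd.read_csv(io.StringIO(tsv), sep="\t")


def relabel(row):
    # Hypothetical helper: return 'F' for fragmented class-I rows,
    # otherwise keep the row's existing Class value.
    return "F" if row["Class"] == "I" and row["Frag"] else row["Class"]


# axis=1 calls relabel once per row, passing the row as a Series.
via_apply = df.apply(relabel, axis=1)

# Cross-check: the same replacement expressed with a boolean mask.
via_mask = df["Class"].mask((df["Class"] == "I") & df["Frag"], "F")

df["Class"] = via_apply
print(df)
```

Note that apply with axis=1 runs a Python function once per row, so on large frames it is typically slower than a boolean-mask assignment; for seven rows the difference does not matter.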
Upvotes: 1