Reputation: 18638
I get the following error when I try to filter a DataFrame using a string:
TypeError: Could not compare <type 'str'> type with Series
This is my code:
import pandas as pd
data = pd.read_csv('data.csv')
fildata = data[(data['cat1'] == 'FALSE') & (data['cat2'] != '') & (data['cat3'] == 'FALSE')]
EDIT 1:
Here's what the data looks like:
count,word,cat1,cat2,cat3
1021,.,FALSE,,FALSE
825,the,TRUE,the,FALSE
693,and,TRUE,and,FALSE
647,of,TRUE,of,FALSE
646,",",FALSE,,FALSE
435,to,TRUE,to,FALSE
353,will,TRUE,will,FALSE
297,in,TRUE,in,FALSE
274,be,TRUE,be,FALSE
EDIT 2:
But why does this work?
data1 = pd.DataFrame({'cat1':[1,2,3,4],'cat2':[2,3,1,4],'cat3':[3,1,2,4]})
fildata = data1[(data1['cat1'] == 1) & (data1['cat2'] != 0) & (data1['cat3'] == 3)]
This results in:
   cat1  cat2  cat3
0     1     2     3
EDIT 3:
I guess the problem is with the type: 'cat1' and 'cat3' are of type 'bool'.
Upvotes: 2
Views: 3178
Reputation: 394021
The following worked for me:
In [114]:
import io
import pandas as pd

temp = """count,word,cat1,cat2,cat3
1021,.,FALSE,,FALSE
825,the,TRUE,the,FALSE
693,and,TRUE,and,FALSE
647,of,TRUE,of,FALSE
646,",",FALSE,,FALSE
435,to,TRUE,to,FALSE
353,will,TRUE,will,FALSE
297,in,TRUE,in,FALSE
274,be,TRUE,be,FALSE"""
data = pd.read_csv(io.StringIO(temp))
fildata = data[(data['cat1'] == False) & (data['cat2'].isnull()) & (data['cat3'] == False)]
In [115]:
fildata
Out[115]:
   count word   cat1 cat2   cat3
0   1021    .  False  NaN  False
4    646    ,  False  NaN  False

[2 rows x 5 columns]
The problem is that read_csv interprets the strings FALSE/TRUE as booleans, so those columns end up with the bool dtype:
In [112]:
data.dtypes
Out[112]:
count     int64
word     object
cat1       bool
cat2     object
cat3       bool
dtype: object
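So your comparison should be against the boolean False and not the string 'FALSE'. Likewise, the empty cat2 fields are read as NaN rather than '', so test them with isnull() (or notnull() if you want the non-empty rows).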
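To make the point about dtypes concrete, here is a minimal sketch using a shortened copy of the sample data. The variable names (csv_text, empty_cat2, raw, as_text) are only illustrative. It first filters on the inferred bool columns, then shows one possible alternative, assuming you would rather keep everything as text: passing dtype=str and keep_default_na=False to read_csv so that string comparisons like those in the question apply.

```python
import io
import pandas as pd

# Shortened copy of the sample data from the question.
csv_text = """count,word,cat1,cat2,cat3
1021,.,FALSE,,FALSE
825,the,TRUE,the,FALSE
646,",",FALSE,,FALSE
435,to,TRUE,to,FALSE
"""

# Default parsing: TRUE/FALSE columns are inferred as bool,
# and empty fields become NaN.
data = pd.read_csv(io.StringIO(csv_text))
print(data.dtypes)

# Bool columns can be used directly as masks; ~ negates them.
empty_cat2 = data[~data['cat1'] & data['cat2'].isnull() & ~data['cat3']]
print(empty_cat2)

# Alternative: read every column as text and keep empty fields as ''.
raw = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)
as_text = raw[(raw['cat1'] == 'FALSE') & (raw['cat2'] == '') & (raw['cat3'] == 'FALSE')]
print(as_text)
```

Both filters select the same two rows here. Note that with dtype=str the count column is also read as text, so the first approach is usually the more convenient one if you need numeric columns.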
Upvotes: 3