Reputation: 9501
I have a data frame that has a column 'Fruit' with something like this in it.
[u'']
[u'']
[u'']
[u'' u'apple' u'Orange']
[u'']
[u'']
I want to return only the rows that contain [u''].
I have tried this using bool types and str len, but there are other places where I might have something like this:
[u'apple']
df1 = d[d['Fruit'].str.len() == 0]
This returns nothing because each [u''] entry is counted as having length 1.
Upvotes: 0
Views: 59
Reputation: 749
The way you've represented your data frame is a bit weird. Assuming that every entry within the Fruit column is in fact a list, .str.len() returns 1 for the [u''] rows, because each of those records is a list with one entry in it (the empty string), not an empty list.
The [u''] records you're interested in are lists consisting of a single empty string. The u prefix you see in front of the string isn't part of the string; it just denotes that the string is unicode. See this question for more information.
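To make the length point concrete, here is a minimal sketch in plain Python (the variable names are just for illustration). It shows that a list holding one empty string has length 1, while the string inside it has length 0, and that the u prefix makes no difference to equality:

```python
empty_list = []
one_empty_string = [u'']

# An empty list has no elements at all.
print(len(empty_list))           # 0

# A list holding one empty string still has one element.
print(len(one_empty_string))     # 1

# The string inside that list is what has length 0.
print(len(one_empty_string[0]))  # 0

# The u prefix only marks the literal as unicode; the value is the same.
print(u'' == '')                 # True
```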
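To solve your problem, you should be able to compare each entry with [''] row by row:
df1 = d[d['Fruit'].apply(lambda x: list(x) == [''])]
This pulls back only the rows holding a single empty string. Writing d['Fruit'] == [''] directly does not do this: pandas treats the list as a column of values to compare element by element, and recent versions raise a length-mismatch error when the column has more than one row.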
If you're still unclear what's going on, play around with this:
test = [u'']
print(test == [''])      # True: one element, and it is an empty string
print(test == ['', ''])  # False: the lists have different lengths
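Putting this together, here is a minimal sketch, assuming the Fruit column really holds Python lists. The frame below is made-up sample data modelled on the question, and is_blank is a hypothetical helper name. It applies the list comparison one row at a time; converting each entry with list() means it should also cope with entries that are NumPy arrays rather than lists.

```python
import pandas as pd

# Made-up sample data modelled on the question; each cell is a list.
d = pd.DataFrame({
    'Fruit': [[u''], [u''], [u'', u'apple', u'Orange'], [u'apple'], [u'']]
})


def is_blank(entry):
    """Return True when the entry is exactly one empty string."""
    return list(entry) == ['']


# Every list has at least one element, so no row has length 0.
lengths = d['Fruit'].str.len()
print(lengths.tolist())

# Build a boolean mask row by row and use it to select rows.
df1 = d[d['Fruit'].apply(is_blank)]
print(df1)
```

The [u'apple'] row also has length 1, which is why a length check alone cannot separate it from the [u''] rows; comparing the contents can.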
Upvotes: 1