reality stilts

Reputation: 95

Pandas, isin, column of lists

I'm trying to create a Boolean flag that is True if any of several values appears in a row's list. The code below returns False for the first row, even though that row's list contains 1. Could someone help me understand why?

import pandas as pd
lists = {'someList!': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a', 'bob']]}
dfLists = pd.DataFrame(lists)
dfLists['contains?'] = dfLists['someList!'].isin([0, 1])

Upvotes: 1

Views: 1542

Answers (3)

Cobra

Reputation: 632

You are passing the DataFrame a list of lists, so each cell in the column holds a whole list. isin then compares those lists to the integers 0 and 1, so it never finds a match.

Define your columns distinctly, like this, and isin should work.

lists={'someList!':[1,2,12,6,'ABC'], 'someList2':[1000,4,'z','a','bob']}
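As a rough sketch of what this layout gives you (assuming one flag per list is the goal, and with `cell_matches` and `column_flags` being illustrative names), isin now tests each scalar cell, and `any()` reduces down each column:

```python
import pandas as pd

# One column per list, as suggested above.
lists = {'someList!': [1, 2, 12, 6, 'ABC'], 'someList2': [1000, 4, 'z', 'a', 'bob']}
df = pd.DataFrame(lists)

# Each cell is now a scalar, so isin can compare it to 0 and 1 directly.
cell_matches = df.isin([0, 1])

# Reduce down each column: True if any cell in that column matched.
column_flags = cell_matches.any()
print(column_flags)
```

Note that the flags come out per column here rather than per row, because each list has become its own column.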

Upvotes: 0

BENY

Reputation: 323396

Use the DataFrame constructor to expand your list column into one column per element, then apply isin and check each row for any match:

pd.DataFrame(dfLists['someList!'].tolist()).isin([1,2]).any(axis=1)
Out[39]: 
0     True
1    False
dtype: bool
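A variation on the same flatten-then-isin idea, offered as a sketch rather than as part of the answer above: `Series.explode` puts every list element on its own row while keeping the original row label, so the matches can be grouped back per row. The names `exploded` and `flags` are illustrative, and the values `[0, 1]` are taken from the question.

```python
import pandas as pd

lists = {'someList!': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a', 'bob']]}
dfLists = pd.DataFrame(lists)

# One row per list element; the index repeats the original row label.
exploded = dfLists['someList!'].explode()

# Test each element, then collapse back to one flag per original row.
flags = exploded.isin([0, 1]).groupby(level=0).any()

dfLists['contains?'] = flags
print(dfLists)
```

This form does not depend on every list having the same length.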

Upvotes: 0

Brad Solomon

Reputation: 40948

Could someone help me understand why a FALSE is getting returned for the first row?

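This isn't working because .isin(values) returns whether each element in the Series is contained in values. Here each element is an entire list, and neither list is equal to 0 or 1, so every row comes back False.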

You can use {0, 1} as a set and apply the truthiness of its intersection to each list:

>>> s = {0, 1}
>>> dfLists['someList!'].apply(lambda x: bool(s.intersection(x)))
0     True
1    False

This effectively does:

>>> s.intersection([1, 2, 12, 6, 'ABC'])
{1}
>>> s.intersection([1000, 4, 'z', 'a', 'bob'])
set()

The bool of the first result is True, because the set is non-empty; the bool of the second is False, because the set is empty.
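Put together as a self-contained script, the set-intersection approach might look like the sketch below. The helper name `contains_any` is hypothetical, and the approach assumes every list element is hashable.

```python
import pandas as pd

def contains_any(items, wanted):
    """Return True if at least one element of `items` is in the set `wanted`."""
    # A non-empty intersection is truthy, an empty one is falsy.
    return bool(wanted.intersection(items))

lists = {'someList!': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a', 'bob']]}
dfLists = pd.DataFrame(lists)

wanted = {0, 1}
# apply calls the helper once per row, passing that row's list.
dfLists['contains?'] = dfLists['someList!'].apply(contains_any, wanted=wanted)
print(dfLists)
```

Keeping the target values in a set means each membership check is a hash lookup rather than a scan of the values.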

Upvotes: 1
