frazman

Reputation: 33293

Remove rows containing empty list of tuples in pandas

I have a dataframe like the following:

name     foo_list
'foo'    [('bleh'), ('blah')]
'bar'    [(), 'boo']
'foobar'  [(), (), ()]

I want to remove all the empty tuples and, in case all the values in a list are empty tuples, drop the row entirely. I also want to convert each list of tuples into a plain list. So the output would be:

name     foo_list
'foo'    ['bleh', 'blah']
'bar'    ['boo']

How do I do this in pandas?

Upvotes: 4

Views: 1254

Answers (2)

BENY

Reputation: 323356

Try this:

Data Input:

import pandas as pd
df = pd.DataFrame({'name': ['A', 'B', 'C'], 'foo_list': [[('bleh'), ('blah')], [(), 'boo'], [(), (), ()]]})

Solution:

df['foo_list']=df['foo_list'].apply(lambda x : [t for t in x if t != ()])
df.loc[df['foo_list'].apply(len)>0,:]

Out[20]: 
       foo_list name
0  [bleh, blah]    A
1         [boo]    B
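The two steps above can be combined into one self-contained sketch. The names `cleaned` and `result` are illustrative only, and the sample data mirrors the input above. Note that `('bleh')` without a trailing comma is just the string `'bleh'`, so the elements here are already plain strings.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "B", "C"],
    # ('bleh') is the same as 'bleh'; only () is an actual (empty) tuple here.
    "foo_list": [["bleh", "blah"], [(), "boo"], [(), (), ()]],
})

# Step 1: drop empty tuples from each list.
cleaned = df["foo_list"].apply(lambda items: [t for t in items if t != ()])

# Step 2: keep only rows whose cleaned list still has elements.
result = df.assign(foo_list=cleaned)[cleaned.apply(len) > 0]
print(result)
```

Unlike the snippet above, this version assigns the filtered frame to a variable, so the dropped row stays dropped in later steps.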

Timing (small size):

%timeit df['foo_list'].apply(lambda x : [t for t in x if t != ()])#Wen
10000 loops, best of 3: 117 µs per loop

%timeit df.foo_list.apply(lambda x: filter(None, x)) # John
10000 loops, best of 3: 121 µs per loop

For a large dataframe, I would recommend John's filter-based solution:

df = pd.concat([df] * 10000, axis=0)

%timeit df.foo_list.apply(lambda x: filter(None, x))
100 loops, best of 3: 10.2 ms per loop
%timeit df['foo_list'].apply(lambda x : [t for t in x if t != ()])
100 loops, best of 3: 17.1 ms per loop
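One caveat when reproducing these timings on Python 3: `filter` returns a lazy iterator there, so a bare `filter(None, x)` does almost no work until it is consumed. A fair comparison would wrap it in `list(...)`. A minimal illustration, with `items`, `lazy`, and `materialized` as made-up names:

```python
items = [(), "boo", ()]

# On Python 3 this builds a lazy filter object; no element is examined yet.
lazy = filter(None, items)
print(type(lazy).__name__)

# Wrapping in list() forces the filtering to actually run.
materialized = list(filter(None, items))
print(materialized)
```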

Upvotes: 4

Zero

Reputation: 77007

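Use filter to remove the empty tuples from each list. Note that filter(None, x) drops every falsy element (empty strings, 0, None), not only empty tuples.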

In [679]: df['foo_list'] = df.foo_list.apply(lambda x: list(filter(None, x)))

Use str.len to keep only the rows whose list is non-empty.

In [680]: df.loc[df.foo_list.str.len()>0]
Out[680]:
       foo_list name
0  [bleh, blah]    A
1         [boo]    B
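If the lists hold genuine one-element tuples such as `('bleh',)` rather than parenthesized strings, the tuples also need unpacking. A rough sketch under that assumption follows; `flatten_nonempty` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

def flatten_nonempty(items):
    """Drop empty tuples and unpack the remaining tuples into a flat list."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(item)   # () adds nothing; ('bleh',) adds 'bleh'
        else:
            flat.append(item)   # non-tuple values are kept as they are
    return flat

df = pd.DataFrame({
    "name": ["foo", "bar", "foobar"],
    "foo_list": [[("bleh",), ("blah",)], [(), "boo"], [(), (), ()]],
})

df["foo_list"] = df["foo_list"].apply(flatten_nonempty)
# str.len also works on list elements, so it can filter out empty lists.
df = df[df["foo_list"].str.len() > 0]
print(df)
```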

Upvotes: 3
