Reputation: 167
I have created a data frame with 3 columns. The third column contains lists, and I want to drop the rows that contain an empty list in that cell.
I have tried
df[df.numbers == []] and df[df.numbers == null]
but neither works.
name country numbers
Lewis Spain [1,4,6]
Nora UK []
Andrew UK [3,5]
The result should be a data frame without Nora's row.
Upvotes: 9
Views: 4599
Reputation: 849
Let's say your data is set up like this:
import pandas as pd
df = pd.DataFrame([{'name': "Lewis", 'country': "Spain", "numbers": [1,4,6]},
{'name': "Nora", 'country': "UK", "numbers": []},
{'name': "Andrew", 'country': "UK", "numbers": [3,5]}])
You could iterate over the dataframe and collect only the rows whose numbers list is not empty, then build a new dataframe called "newDF" from them. For example:
newDFArray = []
for index, row in df.iterrows():
    emptyArrayCheck = row["numbers"]
    # keep the row only if its list has at least one element
    if len(emptyArrayCheck) > 0:
        newDFArray.append(row)
newDF = pd.DataFrame(newDFArray)
newDF
This will yield:
country name numbers
0 Spain Lewis [1, 4, 6]
2 UK Andrew [3, 5]
Upvotes: 3
Reputation: 491
One way to do it is to create a new column containing the length of each list in df.numbers:
df['len'] = df.apply(lambda row: len(row.numbers), axis=1)
and then filter on that column:
df[df.len > 0]
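The helper column is optional. A minimal sketch, assuming the same sample data as in the question, that builds the boolean mask directly from the list lengths (the names mask and result are illustrative only):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["Lewis", "Nora", "Andrew"],
    "country": ["Spain", "UK", "UK"],
    "numbers": [[1, 4, 6], [], [3, 5]],
})

# Length of each list, computed element-wise; no extra column is stored.
mask = df["numbers"].map(len) > 0

# Keep only the rows whose list is non-empty.
result = df[mask]
print(result)
```

This leaves df itself unchanged, so there is no 'len' column to drop afterwards.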
Upvotes: 4
Reputation: 75080
Use Series.str.len() to get the length of each list, then filter out the rows where it equals 0:
df[~df.numbers.str.len().eq(0)]
name country numbers
0 Lewis Spain [1, 4, 6]
2 Andrew UK [3, 5]
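For reference, a self-contained sketch of the same approach, assuming the sample data from the question. The names lengths and filtered are illustrative, and the reset_index call is an optional extra step that is not part of the answer above:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["Lewis", "Nora", "Andrew"],
    "country": ["Spain", "UK", "UK"],
    "numbers": [[1, 4, 6], [], [3, 5]],
})

# .str.len() also works on list elements, not only on strings.
lengths = df["numbers"].str.len()

# Drop the rows whose list length is 0, then renumber the index from 0.
filtered = df[~lengths.eq(0)].reset_index(drop=True)
print(filtered)
```

Without reset_index(drop=True) the surviving rows keep their original labels 0 and 2, as in the output shown above.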
Upvotes: 13
Reputation: 6543
Multiplying any list by 0 gives an empty list, so a row is non-empty exactly when its list differs from that list multiplied by 0. One way to use this is:
In [29]: df[df.numbers != df.numbers * 0]
Out[29]:
name numbers country
0 Lewis [1, 4, 6] Spain
2 Andrew [3, 5] UK
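The underlying list behaviour can be checked in plain Python. This is only a small illustration of the idea, with made-up variable names:

```python
# Repeating a list zero times always yields an empty list.
numbers = [1, 4, 6]
emptied = numbers * 0

# A non-empty list differs from its zero-times copy ...
non_empty_differs = numbers != emptied

# ... while an empty list is equal to its zero-times copy.
empty_differs = [] != [] * 0

print(emptied, non_empty_differs, empty_differs)
```

The pandas expression applies the same comparison to every element of the numbers column.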
Upvotes: 5