Kusi

Reputation: 805

How to remove NaN from list of lists with string entries?

I am trying to remove all the NaNs from a list of lists (with string entries). My data is as follows:

[['beer', 'nuts', nan], 
['beer', 'butter', 'apple'], 
['beer', 'nuts', 'cheese'], 
['beer', 'bananas', nan], 
['beer', 'nuts', 'apple']]

I would like to get this result:

[['beer', 'nuts'], 
['beer', 'butter', 'apple'], 
['beer', 'nuts', 'cheese'], 
['beer', 'bananas'], 
['beer', 'nuts', 'apple']]

I've tried the answers from "How to remove nan's from list of lists? [duplicate]" and "How to delete [NaN] from a list of lists?", namely:

import math
nan = float('nan')

store_data_list = [[x for x in y if not math.isnan(x)] for y in store_data_list] #remove nans from list of lists

#AND

store_data_list = [xs for xs in store_data_list if not any(math.isnan(x) for x in xs)]

#AND

store_data_list = [[x for x in y if not np.isnan(x)] for y in store_data_list]

None of these seem to work in my case. I get the errors:

TypeError: must be real number, not str

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Can someone please indicate what I am doing wrong?

Upvotes: 2

Views: 1697

Answers (3)

Rahul Singh

Reputation: 184

You can try this:

import numpy as np
import pandas as pd
my_list = pd.Series(['a','b', np.nan, 'c'])  # np.nan: the np.NaN alias was removed in NumPy 2.0
my_new_list = ["Sample_text" if pd.isnull(element) else element for element in my_list]
my_new_list

Output:

['a', 'b', 'Sample_text', 'c']

Upvotes: 1

Djaouad

Reputation: 22776

math.isnan takes a float as its argument, not a str. You could do this check before calling math.isnan:

# isinstance also covers float subclasses such as numpy.float64
store_data_list = [[x for x in y if not isinstance(x, float) or not math.isnan(x)] for y in store_data_list]

print(store_data_list)

Output:

[['beer', 'nuts'],
 ['beer', 'butter', 'apple'],
 ['beer', 'nuts', 'cheese'],
 ['beer', 'bananas'],
 ['beer', 'nuts', 'apple']]
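
The type check before math.isnan can also be wrapped in a small helper. This is only a sketch: the function name drop_nans is hypothetical (not from the question), and the data is a shortened copy of the list in the question. It uses isinstance rather than an exact type comparison so that float subclasses are checked as well.

```python
import math

def drop_nans(rows):
    """Return a copy of rows with float NaN entries removed from each inner list.

    drop_nans is a hypothetical helper name used for illustration only.
    """
    return [
        # Only call math.isnan on floats; strings pass through untouched.
        [x for x in row if not (isinstance(x, float) and math.isnan(x))]
        for row in rows
    ]

nan = float('nan')
store_data_list = [['beer', 'nuts', nan],
                   ['beer', 'butter', 'apple'],
                   ['beer', 'bananas', nan]]

cleaned = drop_nans(store_data_list)
print(cleaned)
```

The original store_data_list is left unchanged; the helper builds new inner lists.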

Upvotes: 2

tomjn

Reputation: 5389

One option is to compare each item with itself (which is false for nan):

nan = float('nan')
data = [['beer', 'nuts', nan], 
        ['beer', 'butter', 'apple'], 
        ['beer', 'nuts', 'cheese'], 
        ['beer', 'bananas', nan], 
        ['beer', 'nuts', 'apple']]
[[i for i in j if i == i] for j in data]

gives

[['beer', 'nuts'],
 ['beer', 'butter', 'apple'],
 ['beer', 'nuts', 'cheese'],
 ['beer', 'bananas'],
 ['beer', 'nuts', 'apple']]
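
To see why the self-comparison works, here is a small sketch (the variable names row and kept are just for illustration). A float NaN never compares equal to anything, including itself, while a string always equals itself, so the filter needs no type check.

```python
nan = float('nan')

nan_equals_itself = (nan == nan)       # NaN is never equal to itself
str_equals_itself = ('beer' == 'beer') # ordinary values are
print(nan_equals_itself, str_equals_itself)

row = ['beer', 'bananas', nan]
kept = [i for i in row if i == i]      # keeps only items equal to themselves
print(kept)
```

Note that this only filters out NaN: a None entry is equal to itself and would be kept.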

Upvotes: 5
