Reputation: 783
This seems like a fairly simple thing but I haven't been able to find an answer for it here (yet).
I have a list of dictionaries, and some of the dictionaries in the list have NaN values. I just need to drop any dictionary from the list if it has a NaN value in it.
I've tried it a few different ways myself. Here's one attempt with filter and a lambda function, which raised a TypeError ("must be real number, not dict_values", which makes sense):
import math

def remove_dictionaries_missing_data(list_of_dictionaries):
    return list(filter(lambda dictionary: not math.isnan(dictionary.values()),
                       list_of_dictionaries))
I also tried it with a couple nested loops and some code I really wasn't sure about and got the same error:
import math

def remove_dictionaries_missing_data(list_of_dictionaries):
    cleaned_list = []
    for dictionary in list_of_dictionaries:
        if not math.isnan(dictionary[value] for value in dictionary.values()):
            cleaned_list.append(dictionary)
    return cleaned_list
... and finally with just a list comprehension (same error):
import math

def remove_movies_missing_data(movies):
    return [movie for movie in movies if not math.isnan(movie.values())]
EDIT:
Here's a sample of the list I'm working with:
[{'year': 2013,
'imdb': 'tt2005374',
'title': 'The Frozen Ground',
'test': 'nowomen-disagree',
'clean_test': 'nowomen',
'binary': 'FAIL',
'budget': 19200000,
'domgross': nan,
'intgross': nan,
'code': '2013FAIL',
'budget_2013$': 19200000,
'domgross_2013$': nan,
'intgross_2013$': nan,
'period code': 1.0,
'decade code': 1.0},
{'year': 2011,
'imdb': 'tt1422136',
'title': 'A Lonely Place to Die',
'test': 'ok',
'clean_test': 'ok',
'binary': 'PASS',
'budget': 4000000,
'domgross': nan,
'intgross': 442550.0,
'code': '2011PASS',
'budget_2013$': 4142763,
'domgross_2013$': nan,
'intgross_2013$': 458345.0,
'period code': 1.0,
'decade code': 1.0},
... ]
Upvotes: 1
Views: 189
Reputation: 781779
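dictionary.values() returns a view of all the values in the dictionary, not a single number, so passing it to math.isnan() raises the TypeError. You need to call math.isnan() on the individual values, and only on the floats, since it also raises a TypeError for strings such as the titles. You can use any() to do this: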
import math

def remove_dictionaries_missing_data(list_of_dictionaries):
    # Keep a dictionary only if none of its float values is NaN.
    return [d for d in list_of_dictionaries
            if not any(isinstance(val, float) and math.isnan(val) for val in d.values())]
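For a quick check, here is a minimal, self-contained sketch of the same approach run on a trimmed-down version of the sample data. The helper name has_nan and the third movie entry are made up for illustration; the first two entries keep only a few keys from the question's sample.

```python
import math


def has_nan(d):
    """Return True if any value in the dictionary is a float NaN."""
    # The isinstance check skips strings and ints, which math.isnan()
    # would either reject with a TypeError or never report as NaN.
    return any(isinstance(v, float) and math.isnan(v) for v in d.values())


def remove_dictionaries_missing_data(list_of_dictionaries):
    """Return a new list without the dictionaries that contain a NaN."""
    return [d for d in list_of_dictionaries if not has_nan(d)]


nan = float("nan")

movies = [
    {"title": "The Frozen Ground", "budget": 19200000, "domgross": nan},
    {"title": "A Lonely Place to Die", "budget": 4000000,
     "domgross": nan, "intgross": 442550.0},
    # Hypothetical entry with no missing values, added for illustration.
    {"title": "Hypothetical Complete Movie", "budget": 1000000,
     "domgross": 2500000.0},
]

cleaned = remove_dictionaries_missing_data(movies)
print([m["title"] for m in cleaned])  # ['Hypothetical Complete Movie']
```

Both sample movies from the question are dropped because each has at least one nan value, so only the made-up complete entry remains. The original list is left unmodified.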
Upvotes: 3