Reputation: 2024
I am trying to filter a dictionary and, for each key, find the max from its list of values. Here is a sample dictionary:
import numpy as np
import pandas as pd  # needed for pd.notna below

def clean_currency(x):
    """ If the value is a string, then remove currency symbol and delimiters
    otherwise, the value is numeric and can be converted
    """
    if isinstance(x, str):
        return x.replace('$', '').replace(',', '')
    return x

d = {
    'shores': ['$0.00'],
    'Broderick': ['$0.00', '$0.00', '$0.00'],
    '100 Broderick': ['$0.00', '$1,142,070.00', '$0.00', '$0.00', '$0.00'],
    '1001 Orange Grove': ['$0.00'],
    '1008 Hyde': [np.nan]
}

# Keep the highest value from the list.
revd = {
    k: max((e for e in v if pd.notna(e)), key=lambda x: float(clean_currency(x)) if isinstance(x, str) else x)
    for k,v in d.items()
}
revd
I am getting this error: ValueError: max() arg is an empty sequence
Full traceback:
ValueError Traceback (most recent call last)
/var/folders/d0/gnksqzwn2fn46fjgrkp6045c0000gn/T/ipykernel_49841/938463883.py in <module>
1 # Keep the highest value from the list.
2
----> 3 revd = {
4 k: max((e for e in v if pd.notna(e)), key=lambda x: float(clean_currency(x)) if isinstance(x, str) else x)
5 for k,v in d.items()
/var/folders/d0/gnksqzwn2fn46fjgrkp6045c0000gn/T/ipykernel_49841/938463883.py in <dictcomp>(.0)
2
3 revd = {
----> 4 k: max((e for e in v if pd.notna(e)), key=lambda x: float(clean_currency(x)) if isinstance(x, str) else x)
5 for k,v in d.items()
6 }
ValueError: max() arg is an empty sequence
Upvotes: 0
Views: 92
Reputation: 95948
The problem is that you are filtering out all the nans, which leaves you with an empty sequence in some cases (here, the list for '1008 Hyde' contains only nan). But there is no need to filter nans, as far as I can tell. The reasonable result for the maximum of a list containing only nans is nan, which is already the behavior you'd get without the filter, so just use something like this:
from pprint import pprint

def clean_currency(x):
    """ If the value is a string, then remove currency symbol and delimiters
    otherwise, the value is numeric and can be converted
    """
    if isinstance(x, str):
        x = x.replace('$', '').replace(',', '')
    return float(x)

d = {
    'shores': ['$0.00'],
    'Broderick': ['$0.00', '$0.00', '$0.00'],
    '100 Broderick': ['$0.00', '$1,142,070.00', '$0.00', '$0.00', '$0.00'],
    '1001 Orange Grove': ['$0.00'],
    '1008 Hyde': [float('nan')]
}

# Keep the highest value from the list.
revd = {
    k: max(v, key=clean_currency)
    for k, v in d.items()
}
pprint(revd, sort_dicts=False)
which outputs:
{'shores': '$0.00',
 'Broderick': '$0.00',
 '100 Broderick': '$1,142,070.00',
 '1001 Orange Grove': '$0.00',
 '1008 Hyde': nan}
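One caveat if a list can mix nans with real values: every comparison against nan is False, so the result of max() then depends on order (max([nan, 1.0]) gives nan, while max([1.0, nan]) gives 1.0). If that can happen in your data, you may prefer to keep the filter and pass default= to max(), which is returned instead of raising when the sequence is empty. A minimal sketch, assuming missing entries are always float nan; is_missing and max_currency are hypothetical helper names, not anything from the question:

```python
import math

def clean_currency(x):
    """Strip '$' and ',' from strings, then convert to float."""
    if isinstance(x, str):
        x = x.replace('$', '').replace(',', '')
    return float(x)

def is_missing(x):
    # Assumption: missing values are float nan (np.nan is a plain float).
    return isinstance(x, float) and math.isnan(x)

def max_currency(values):
    """Largest non-nan entry (original form kept); nan if none remain.

    Hypothetical helper for illustration only.
    """
    present = (v for v in values if not is_missing(v))
    # default= is what prevents "max() arg is an empty sequence".
    return max(present, key=clean_currency, default=math.nan)

d = {
    'shores': ['$0.00'],
    '100 Broderick': ['$0.00', '$1,142,070.00', '$0.00'],
    '1008 Hyde': [float('nan')],
}

revd = {k: max_currency(v) for k, v in d.items()}
```

With this, '1008 Hyde' still maps to nan, and a list such as [nan, '$5.00'] yields '$5.00' regardless of where the nan sits.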
Upvotes: 1